I noticed that the code computes loss1 and loss2 in the same computational graph and updates them in a single step, which is hard to reconcile with the minimax strategy. Moreover, the test metrics from single-step training do not match those reported in the paper (F1 -0.0035).
I tried asynchronous alternating updates instead, updating on loss1 first and then on loss2, i.e. the max-min strategy (Explanation here). This gave results close to those in the paper (F1 +0.0001), better than single-step training.
I also tried updating on loss2 first and then on loss1, i.e. the min-max strategy. This was better than single-step training (F1 +0.0008) but worse than the max-min strategy (F1 -0.0028).
*All runs were stopped manually after only two epochs, on the SMD dataset. I can't rule out that single-step training ends up better after more epochs, but so far the max-min strategy converges faster while taking longer wall-clock time to train.
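
To make the comparison concrete, here is a minimal PyTorch sketch of the two update schemes. The model and loss functions are toy placeholders (not the repo's actual implementation); only the update order is the point:

```python
import torch
import torch.nn as nn

# Toy stand-ins: a placeholder model and losses. The real loss1/loss2 are
# the repo's objectives; every name here is illustrative only.
model = nn.Linear(8, 8)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.randn(4, 8)

def losses(out):
    loss1 = out.pow(2).mean()    # placeholder "minimize" objective
    loss2 = -out.abs().mean()    # placeholder negated "maximize" objective
    return loss1, loss2

# --- Single-step (joint) update: one graph, one optimizer step ---
loss1, loss2 = losses(model(x))
optimizer.zero_grad()
(loss1 + loss2).backward()       # gradients of both losses mixed in one step
optimizer.step()

# --- Asynchronous alternating (max-min) update: loss1 step, then loss2 step ---
loss1, _ = losses(model(x))
optimizer.zero_grad()
loss1.backward()
optimizer.step()                 # parameters already move before loss2 is computed

_, loss2 = losses(model(x))      # fresh forward pass on the updated weights
optimizer.zero_grad()
loss2.backward()
optimizer.step()
```

The min-max variant is the same alternating loop with the two loss steps swapped, and single-step training needs only one forward pass per batch, which is why it is cheaper per epoch.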
Can someone explain this observation and the intuition behind using single-step training?