Minimax strategy and single-step adversarial training #70

Open
zixuan-go opened this issue Aug 26, 2024 · 0 comments

zixuan-go commented Aug 26, 2024

I noticed that the code computes loss1 and loss2 in the same computational graph and updates them simultaneously, which is hard to reconcile intuitively with the minimax strategy. Moreover, the test metrics from this single-step training do not match those reported in the paper (F1 -0.0035).
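
For clarity, here is a minimal sketch of the single-step scheme as I understand it (PyTorch-style; `model`, `optimizer`, `compute_losses`, and `train_loader` are placeholders, not the repository's actual names):

```python
# Single-step scheme: both losses come from one forward pass and their
# gradients are accumulated before a single optimizer step.
for batch in train_loader:
    optimizer.zero_grad()
    loss1, loss2 = compute_losses(model, batch)  # same computational graph
    loss1.backward(retain_graph=True)            # keep graph for second pass
    loss2.backward()                             # gradients accumulate in .grad
    optimizer.step()                             # one simultaneous update
```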

I tried asynchronous alternating updates instead, updating on loss1 first and then on loss2, i.e. the max-min strategy (explanation here). This gave results close to those in the paper (F1 +0.0001), better than single-step training.
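
A sketch of the alternating variant I tried, using the same placeholder names as above; swapping the two update blocks gives the min-max variant described next:

```python
# Max-min scheme: update on loss1 alone, then recompute the graph with a
# fresh forward pass and update on loss2 alone.
for batch in train_loader:
    # first update: loss1 only
    optimizer.zero_grad()
    loss1, _ = compute_losses(model, batch)
    loss1.backward()
    optimizer.step()

    # second update: fresh forward pass, loss2 only
    optimizer.zero_grad()
    _, loss2 = compute_losses(model, batch)
    loss2.backward()
    optimizer.step()
```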

Updating on loss2 first and then on loss1, i.e. the min-max strategy, gave results better than single-step training (F1 +0.0008) but worse than max-min (F1 -0.0028).

*All runs were stopped manually after only two epochs, on the SMD dataset. It is possible that single-step training performs better after more epochs, but so far the max-min strategy converges faster while taking longer wall-clock time to train (the alternating scheme does two forward/backward passes per batch).

Can someone explain this observation and the intuition behind using single-step training?
