Not able to see the updated best_model.pth or best_model_x.pth after using continue_path #340
-
In theory it should restore the previous best loss and continue saving those checkpoints, but I'd have to double check that it's working correctly. In the meantime you can use the regular |
-
I resumed training a FastPitch model, which I had originally trained from scratch on a custom dataset, using the --continue_path argument. A best model was created at step 1325347 during the first training run, and it is still the only best model even though training is now at step 1492000 (~90 epochs later). I'm using
"save_step": 10000,
The overall evaluation loss (avg_loss) appears to be decreasing, from 2.84 at step 1325347 to 2.76 at step 1492000. While training from scratch I saw the best_model_x.pth files being overwritten as expected, but I have not seen that behavior since resuming training with the --continue_path argument.
FYI, while resuming the training, I made a few changes in the configuration file, i.e.,
Did I make any mistakes? Am I missing anything, @eginhard?
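To make the expected behavior concrete, here is a minimal sketch of best-checkpoint tracking across a resume. It is not the Trainer's actual code: `BestModelTracker`, `restored_best_loss`, and `update` are hypothetical names used only for illustration, and it assumes the best model is chosen by comparing a single evaluation loss against the best value seen so far. The numbers are the ones from my run.

```python
import math
from typing import Optional


class BestModelTracker:
    """Hypothetical helper (not the Trainer's API) that decides when to save a best model."""

    def __init__(self, restored_best_loss: Optional[float] = None) -> None:
        # On a fresh run nothing has been seen yet, so any loss counts as an improvement.
        # On a resumed run the previous best loss is assumed to be restored from the checkpoint.
        self.best_loss = math.inf if restored_best_loss is None else restored_best_loss
        self.best_step: Optional[int] = None

    def update(self, step: int, eval_loss: float) -> bool:
        """Return True when eval_loss improves on the best loss, i.e. a best model would be saved."""
        if eval_loss < self.best_loss:
            self.best_loss = eval_loss
            self.best_step = step
            return True
        return False


# Resume with the best loss from step 1325347 restored correctly.
tracker = BestModelTracker(restored_best_loss=2.84)
saved = tracker.update(step=1492000, eval_loss=2.76)

# Resume with a restored value that is lower than anything seen since
# (2.50 is an illustrative assumption, not a value from my logs).
stale_tracker = BestModelTracker(restored_best_loss=2.50)
saved_stale = stale_tracker.update(step=1492000, eval_loss=2.76)

print(saved, saved_stale)
```

Under these assumptions, a correctly restored best loss of 2.84 would trigger a new best model at 2.76. If the restored value were lower than the losses seen after resuming, or if the comparison used a different loss than avg_loss, no new best model would be written, which would match what I am seeing.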