Remove hook torch.nan_to_num(x) #8826

Merged
merged 2 commits into from
Aug 1, 2022

Conversation

@glenn-jocher (Member) commented Aug 1, 2022

Observed erratic training behavior (green line) with the `nan_to_num` hook (introduced in #8598) in the classifier branch. I'm going to remove it from master.

🛠️ PR Summary

Made with ❤️ by Ultralytics Actions

🌟 Summary

Improves YOLOv5 training stability by disabling the `torch.nan_to_num(x)` hook in `train.py`.

📊 Key Changes

  • The hook that converts NaN (not a number) values to zero during training has been commented out.
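As a rough illustration of what such a hook does, here is a minimal sketch, assuming the hook was attached per parameter with `Tensor.register_hook`. The toy parameter `w` and the `scale` tensor are hypothetical and are not taken from the YOLOv5 code. Note that `torch.nan_to_num` by default also replaces infinities with the largest finite value of the dtype, so gradients can be silently altered in more than one way.

```python
import torch

# Hypothetical one-parameter "model", for illustration only.
w = torch.nn.Parameter(torch.tensor([1.0, 2.0, 3.0]))

# A tensor hook receives the gradient; whatever it returns replaces that gradient.
handle = w.register_hook(lambda grad: torch.nan_to_num(grad))

# d(loss)/dw equals `scale`, so the raw gradient is [nan, inf, 1.0].
scale = torch.tensor([float("nan"), float("inf"), 1.0])
loss = (w * scale).sum()
loss.backward()

# With the hook active: NaN became 0, +inf became the float32 maximum, 1.0 is unchanged.
print(w.grad)

# Removing the hook restores the unmodified gradients on later backward passes.
handle.remove()
```

With the hook removed (or commented out, as in this PR), the same backward pass would leave NaN and inf in `w.grad`.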

🎯 Purpose & Impact

  • Purpose: To resolve the erratic training results observed while the `nan_to_num` hook was enabled.
  • Impact: Training should be more stable, but NaN values are no longer converted to zero automatically, so users may need to monitor training for them.
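One simple way to do that monitoring is to check each logged loss value for NaN or inf. The sketch below is an assumption about how a user might do this, not part of YOLOv5; the helper `first_nonfinite_step` and the `history` values are hypothetical, and in a real loop the floats would typically come from something like `loss.item()`.

```python
import math


def first_nonfinite_step(losses):
    """Return the index of the first NaN/inf loss, or None if all are finite."""
    for step, value in enumerate(losses):
        if not math.isfinite(value):
            return step
    return None


# Hypothetical loss history with a NaN at step 2.
history = [0.92, 0.71, float("nan"), 0.65]
print(first_nonfinite_step(history))  # 2
```

Stopping or logging at the first non-finite step makes the failure visible instead of masking it.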

@glenn-jocher linked an issue on Aug 1, 2022 that may be closed by this pull request
2 tasks
@glenn-jocher merged commit f3c78a3 into master on Aug 1, 2022
@glenn-jocher deleted the glenn-jocher-patch-1 branch on August 1, 2022 19:39
ctjanuhowski pushed a commit to ctjanuhowski/yolov5 that referenced this pull request Sep 8, 2022
* Remove hook `torch.nan_to_num(x)`

Observed erratic training behavior (green line) with the nan_to_num hook in classifier branch. I'm going to remove it from master.

* Update train.py
Development

Successfully merging this pull request may close these issues.

NaNs and INFs in gradient values