[FEATURE]: Can gemini support gradient accumulation? #4590
Comments
Title: [FEATURE]: Can gemini support gradient accumulation?
Describe the feature
When training a large model, increasing the batch size helps training stability. For particularly large models the batch size is limited, so a larger batch size can only be reached through gradient accumulation. Can gemini support gradient accumulation?
Hello, we plan to start implementing this feature this month.
Hello, gemini's support for gradient accumulation has been completed.
Describe the feature
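When training large models, a larger batch size helps training stability. For especially large models the batch size is limited, and the only way to increase it is gradient accumulation. Could gemini support gradient accumulation?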
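For readers unfamiliar with the technique being requested, here is a minimal sketch of gradient accumulation in plain Python. It does not use gemini or any framework API; the toy model, the data, and the helper names `mse_grad` and `accumulated_grad` are all hypothetical and chosen only for illustration. It shows that averaging the gradients of equally sized micro-batches reproduces the gradient of the full batch, which is why accumulation can stand in for a larger batch size.

```python
# Minimal sketch of gradient accumulation for a 1-D linear model y = w * x
# with a mean-squared-error loss. Plain Python only; no framework API is assumed.

def mse_grad(w, batch):
    """Gradient of mean((w*x - y)^2) with respect to w over one batch."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)

def accumulated_grad(w, data, micro_batch_size):
    """Accumulate gradients over equally sized micro-batches."""
    micro_batches = [data[i:i + micro_batch_size]
                     for i in range(0, len(data), micro_batch_size)]
    total = 0.0
    for mb in micro_batches:
        # Scale each micro-batch gradient by 1 / (number of micro-batches)
        # so that the accumulated sum equals the full-batch mean gradient.
        total += mse_grad(w, mb) / len(micro_batches)
    return total

# Hypothetical toy data: (x, y) pairs roughly following y = 2x.
data = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9), (4.0, 8.2)]
w = 0.5

full = mse_grad(w, data)                               # one large batch
accum = accumulated_grad(w, data, micro_batch_size=2)  # two micro-batches

# A single optimizer step is taken only after all micro-batches are processed.
lr = 0.01
w_new = w - lr * accum
```

The equivalence holds exactly only when every micro-batch has the same size; with uneven micro-batches each gradient would need to be weighted by its sample count instead.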