[FEATURE]: Can Gemini support gradient accumulation? #4590

Closed
zryowen123 opened this issue Sep 1, 2023 · 5 comments
Labels
enhancement New feature or request

Comments

@zryowen123

Describe the feature

When training large models, a larger batch size helps training stability. For very large models, however, the batch size is limited by memory, so gradient accumulation is the only way to reach a larger effective batch size. Can Gemini support gradient accumulation?

@zryowen123 zryowen123 added the enhancement New feature or request label Sep 1, 2023

@ver217
Member

ver217 commented Sep 5, 2023

Hello, we plan to start implementing this feature this month.

@Fridge003
Contributor

Fridge003 commented Oct 17, 2023

Hello, Gemini's support for gradient accumulation is now complete.
For usage, see docs/source/en/features/gradient_accumulation_with_booster.md (English) or docs/source/zh-Hans/features/gradient_accumulation_with_booster.md (Chinese). Our online tutorials will also be updated in a few days.
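For readers new to the technique requested in this issue, here is a minimal, framework-free sketch of why gradient accumulation emulates a larger batch. It is not the Colossal-AI or Gemini API (the docs referenced above cover that); the toy model `y = w * x`, the helper names `mse_grad` and `accumulated_grad`, and the data are all made up for illustration. It assumes equal-sized micro-batches and a mean-reduced loss.

```python
# Illustrative only: a one-parameter model y = w * x with mean-squared-error loss.

def mse_grad(w, batch):
    """Gradient of mean((w*x - y)^2) with respect to w over one batch of (x, y) pairs."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)

def accumulated_grad(w, batch, accum_steps):
    """Split `batch` into `accum_steps` equal micro-batches and accumulate their gradients."""
    assert len(batch) % accum_steps == 0, "micro-batches must be equal-sized"
    micro_size = len(batch) // accum_steps
    grad = 0.0
    for i in range(accum_steps):
        micro = batch[i * micro_size:(i + 1) * micro_size]
        # Divide by accum_steps so the accumulated sum equals the full-batch mean gradient.
        grad += mse_grad(w, micro) / accum_steps
    return grad

data = [(1.0, 2.0), (2.0, 4.5), (3.0, 5.5), (4.0, 8.5)]  # hypothetical (x, y) samples
w = 0.5
lr = 0.01

full = mse_grad(w, data)                          # gradient from one batch of 4
accum = accumulated_grad(w, data, accum_steps=2)  # two micro-batches of 2

# A single optimizer step is taken only after the whole accumulation window.
w_new = w - lr * accum
```

Here `full` and `accum` agree, so the update is the same as if all four samples had fit in memory at once. In a real training loop the same idea usually appears as dividing the loss by the number of accumulation steps, calling backward on every micro-batch, and stepping and zeroing the optimizer only once per window.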

4 participants