
[FEATURE]: Can gradient accumulation be used in the pretraining of llama2-70b? #4785

Closed · Pty72 opened this issue Sep 23, 2023 · 1 comment
Labels: enhancement (New feature or request)

Comments

Pty72 commented Sep 23, 2023

Describe the feature

Can gradient accumulation be used in the pretraining of llama2-70b?
And if so, how can it be enabled?

Pty72 added the enhancement (New feature or request) label Sep 23, 2023
Pty72 changed the title from [FEATURE]: "Can gradient accumulation be used in the pretraining of llama2-70b?" to [FEATURE]: Can gradient accumulation be used in the pretraining of llama2-70b? Sep 23, 2023
Fridge003 (Contributor) commented Oct 17, 2023

Hello, we have just added support for gradient accumulation with the Gemini plugin.
Usage is documented in docs/source/en/features/gradient_accumulation_with_booster.md; our online tutorials will also be updated within a few days.
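For a rough idea of what this looks like, here is a minimal sketch of gradient accumulation wired through the Booster API with the Gemini plugin. This is not taken from the linked doc; `build_model`, `train_dataloader`, and `ACCUM_STEPS` are placeholders, and the plugin/launch arguments may differ between releases, so treat docs/source/en/features/gradient_accumulation_with_booster.md as the authoritative reference.

```python
import colossalai
import torch
from colossalai.booster import Booster
from colossalai.booster.plugin import GeminiPlugin

# Distributed init; older releases required a config dict here.
colossalai.launch_from_torch(config={})

plugin = GeminiPlugin()           # defaults; tune placement/chunk settings for a 70B model
booster = Booster(plugin=plugin)

model = build_model()             # placeholder: your LLaMA-2-70B model
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
criterion = torch.nn.CrossEntropyLoss()

# Booster wraps the model/optimizer/dataloader for the chosen plugin.
model, optimizer, criterion, dataloader, _ = booster.boost(
    model, optimizer, criterion=criterion, dataloader=train_dataloader
)

ACCUM_STEPS = 4                   # micro-batches accumulated per optimizer step (assumed value)
optimizer.zero_grad()
for step, (inputs, labels) in enumerate(dataloader):
    # Scale the loss so accumulated gradients average over micro-batches.
    loss = criterion(model(inputs), labels) / ACCUM_STEPS
    booster.backward(loss, optimizer)           # backward must go through the booster
    if (step + 1) % ACCUM_STEPS == 0:
        optimizer.step()                        # apply the accumulated gradients
        optimizer.zero_grad()
```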
