This repository has been archived by the owner on Oct 16, 2023. It is now read-only.

refactor tp load checkpoint #114

Merged: 3 commits, Aug 22, 2022


ver217 (Member) commented Aug 22, 2022

Serializing a large state dict is slow. Instead of scattering an object list, we now scatter the tensors directly, which speeds up loading the state dict. With tp=4, pp=1, loading time is reduced to about 40s.
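
To illustrate the idea, here is a minimal standard-library sketch, not the code from this PR. It contrasts pickling a whole shard as a Python object with sending a tiny pickled header plus the raw buffer bytes. The helper names (`shard_for_ranks`, `pack_object`, `pack_tensor`, `unpack_tensor`) are hypothetical, and `array` stands in for a tensor; equal-sized contiguous shards are an assumption made for simplicity.

```python
import pickle
from array import array


def shard_for_ranks(values, world_size):
    """Split a flat buffer into world_size equal contiguous shards (assumed layout)."""
    if len(values) % world_size != 0:
        raise ValueError("buffer length must be divisible by world_size")
    n = len(values) // world_size
    return [values[i * n:(i + 1) * n] for i in range(world_size)]


def pack_object(shard):
    """Object-style path: serialize every element through pickle."""
    return pickle.dumps(list(shard))


def pack_tensor(shard):
    """Tensor-style path: pickle only a small header, ship the data as raw bytes."""
    header = pickle.dumps({"typecode": shard.typecode, "numel": len(shard)})
    return header, shard.tobytes()


def unpack_tensor(header, payload):
    """Rebuild the shard on the receiving side from header and raw bytes."""
    meta = pickle.loads(header)
    out = array(meta["typecode"])
    out.frombytes(payload)
    if len(out) != meta["numel"]:
        raise ValueError("payload size does not match header")
    return out


if __name__ == "__main__":
    weights = array("f", range(8))           # stand-in for one parameter tensor
    shards = shard_for_ranks(weights, 4)     # one shard per tensor-parallel rank
    header, payload = pack_tensor(shards[1])
    restored = unpack_tensor(header, payload)
    print(restored == shards[1], len(payload))
```

In PyTorch, the two paths roughly correspond to `torch.distributed.scatter_object_list`, which pickles Python objects, and `torch.distributed.scatter`, which sends tensor data without pickling. How this PR wires those calls together is not shown here.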

@dujiangsu dujiangsu merged commit 46bae67 into hpcaitech:example/opt Aug 22, 2022
@ver217 ver217 deleted the refactor/tp-ckpt branch August 22, 2022 08:27