Functionality for partial model loading #228

Open
blester125 opened this issue Feb 8, 2024 · 3 comments
Labels
enhancement New feature or request

Comments

@blester125
Collaborator

The other day @dptam was trying to use Git-Theta to partially load models (loading/saving only a subset of parameters). This is easy to do with Git-Theta, but there isn't a user-facing way to do it. We should add one. The main question is what the API should be: a list of parameter names to load, a regex, something else? Thoughts @nkandpa2 @craffel?
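
To make the two options concrete, here is a minimal sketch of a selection helper that accepts a list of names, a regex, or both. It is only an illustration of the question above: `select_parameter_names` and its `names`/`pattern` arguments are hypothetical, not part of Git-Theta, and the sketch assumes parameters are identified by flat string names such as `encoder/layer_0/kernel`.

```python
import re
from typing import Iterable, List, Optional, Pattern, Union


def select_parameter_names(
    available: Iterable[str],
    names: Optional[Iterable[str]] = None,
    pattern: Optional[Union[str, Pattern[str]]] = None,
) -> List[str]:
    """Return the subset of `available` parameter names to load.

    Hypothetical helper: a name is selected if it is listed in `names`
    or if it fully matches `pattern`. With neither filter given, every
    parameter is selected (i.e. a normal full load).
    """
    available = list(available)
    if names is None and pattern is None:
        return available

    wanted = set(names or ())
    # Fail loudly on explicit names that do not exist in the checkpoint.
    missing = wanted - set(available)
    if missing:
        raise KeyError(f"Unknown parameter names: {sorted(missing)}")

    regex = re.compile(pattern) if pattern is not None else None
    # Preserve the checkpoint's own ordering of parameter names.
    return [
        n
        for n in available
        if n in wanted or (regex is not None and regex.fullmatch(n))
    ]


checkpoint_names = [
    "encoder/layer_0/kernel",
    "encoder/layer_0/bias",
    "decoder/layer_0/kernel",
]
print(select_parameter_names(checkpoint_names, pattern=r"encoder/.*"))
# ['encoder/layer_0/kernel', 'encoder/layer_0/bias']
```

Supporting both filters at once keeps the call site simple: an explicit list covers the "I know exactly which tensors I want" case, while a regex covers "everything under this prefix". Whether unknown names should raise or be ignored is a design choice; raising is assumed here.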

Also, the merge script currently loads parameters as needed during merges, but it keeps them in memory after the merge. Thus, we can merge really large models if only a few parameters have changed, but not if all the parameters need to be merged. We should add the ability to save merged parameters to disk (with the ability to clean them up if the merge is aborted) and free the parameter memory, to enable merging really big models.
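
A rough sketch of the spill-to-disk idea, under stated assumptions: this is not the Git-Theta merge script, and `merge_scratch_dir`, `merge_to_disk`, `load_ours`, `load_theirs`, and `merge_fn` are all hypothetical names. It uses `pickle` and a temporary directory purely for illustration; a real implementation would use whatever serialization the merge tooling already has.

```python
import contextlib
import os
import pickle
import shutil
import tempfile
from typing import Any, Callable, Dict, Iterable, Iterator


@contextlib.contextmanager
def merge_scratch_dir() -> Iterator[str]:
    """Yield a scratch directory; delete it if the merge is aborted.

    "Aborted" is modelled here as any exception (including
    KeyboardInterrupt) escaping the `with` block. On success the
    directory is left in place so the caller can finalize the results.
    """
    path = tempfile.mkdtemp(prefix="merge-")
    try:
        yield path
    except BaseException:
        shutil.rmtree(path, ignore_errors=True)
        raise


def merge_to_disk(
    names: Iterable[str],
    load_ours: Callable[[str], Any],
    load_theirs: Callable[[str], Any],
    merge_fn: Callable[[Any, Any], Any],
    scratch: str,
) -> Dict[str, str]:
    """Merge one parameter at a time, writing each result to `scratch`.

    Only one parameter's inputs and output are referenced at any time,
    so peak memory is bounded by the largest parameter rather than by
    the whole model. Returns a mapping from name to the file on disk.
    """
    merged_paths: Dict[str, str] = {}
    for i, name in enumerate(names):
        merged = merge_fn(load_ours(name), load_theirs(name))
        path = os.path.join(scratch, f"{i}.pkl")
        with open(path, "wb") as f:
            pickle.dump(merged, f)
        merged_paths[name] = path
        del merged  # drop the reference so the memory can be reclaimed
    return merged_paths
```

Usage would look like `with merge_scratch_dir() as scratch: paths = merge_to_disk(names, load_ours, load_theirs, average, scratch)`, where the loaders fetch a single parameter by name and `average` stands in for any per-parameter merge strategy.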

@blester125 blester125 added the enhancement New feature or request label Feb 8, 2024
@blester125 blester125 changed the title from "Functionality to partial model loading" to "Functionality for partial model loading" Feb 8, 2024
@craffel
Contributor

craffel commented Feb 8, 2024

This seems like it'd be checkpoint format-dependent, right? I.e. whether a given checkpoint format supports reading only a subset of parameters? Git-Theta's native format would allow this, but I'm not sure there would be a clean way to do this from the command line (since the command line interface generally assumes that we are going to operate on a whole checkpoint file). If the goal is just to do it from Python, we could use the new save_to_git/load_from_git functions for this?

@blester125
Collaborator Author

Yeah, sorry, I should have been more specific: I was imagining this as something done from Python. Basically, I'm trying to add lightweight functionality that answers "I have my model saved in Git-Theta, what does that get me?"

@craffel
Contributor

craffel commented Feb 10, 2024

I see, do you think there would be a lightweight way to add this to the new save_to_git/load_from_git functions?
