Support all types of GGUF metadata #1525
Open
This is a new PR after I had to force-push. See the previous PR for more details: #1497
@abetlen I would appreciate your input on this PR and any further work that may be required
Relevant issue: #1495: `_LlamaModel.metadata()` does not return `tokenizer.ggml.tokens`.
This PR implements support for reading array values from GGUF metadata in GGUFv2 and GGUFv3 files, following Georgi Gerganov's GGUF format spec. I've tested it with GGUFv2 and GGUFv3 models and all metadata types.
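For reference, the GGUF spec lays out an array value as a uint32 element type, a uint64 element count, and then the packed elements; strings are a uint64 length followed by UTF-8 bytes. Below is a rough sketch of a reader for that layout. The type codes and struct formats come from the spec, but the function and constant names are my own illustration, not the PR's actual code:

```python
import struct

# Scalar GGUF metadata value types (type code -> struct format, size in bytes),
# per the GGUF spec; a subset shown for illustration.
GGUF_TYPES = {
    0: ("B", 1),   # UINT8
    1: ("b", 1),   # INT8
    2: ("H", 2),   # UINT16
    3: ("h", 2),   # INT16
    4: ("I", 4),   # UINT32
    5: ("i", 5 - 1),  # INT32 (4 bytes)
    6: ("f", 4),   # FLOAT32
    7: ("?", 1),   # BOOL
    10: ("Q", 8),  # UINT64
    11: ("q", 8),  # INT64
    12: ("d", 8),  # FLOAT64
}
GGUF_STRING = 8
GGUF_ARRAY = 9

def read_value(buf: bytes, offset: int, vtype: int):
    """Read one metadata value from buf; return (value, new_offset)."""
    if vtype == GGUF_STRING:
        # uint64 byte length, then UTF-8 data
        (length,) = struct.unpack_from("<Q", buf, offset)
        offset += 8
        value = buf[offset:offset + length].decode("utf-8")
        return value, offset + length
    if vtype == GGUF_ARRAY:
        # uint32 element type, uint64 count, then the elements back-to-back
        elem_type, count = struct.unpack_from("<IQ", buf, offset)
        offset += 12
        items = []
        for _ in range(count):
            item, offset = read_value(buf, offset, elem_type)
            items.append(item)
        return items, offset
    fmt, size = GGUF_TYPES[vtype]
    (value,) = struct.unpack_from("<" + fmt, buf, offset)
    return value, offset + size
```

Nested arrays fall out of the recursion for free, since an array element can itself be an array or a string.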
The API is unchanged: just call the `_LlamaModel.metadata()` method the same as before. I also changed how metadata is displayed when loading a model with `verbose=True`, because some metadata arrays can be hundreds of thousands of items long (the vocabulary, for example). Each key and value is now printed on its own line, and any value over 60 characters is truncated with `...`.
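The truncation rule described above could look something like this sketch (the `format_value` name and the 60-character default are illustrative, not the PR's exact code):

```python
def format_value(value, max_len: int = 60) -> str:
    # Render a metadata value for verbose logging, cutting anything
    # longer than max_len characters and appending "..." so huge
    # arrays (e.g. the vocabulary) don't flood the console.
    text = str(value)
    if len(text) > max_len:
        return text[:max_len] + "..."
    return text
```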
Example output when loading a model: