
convert : refactor rope_freqs generation #9396

Open · compilade wants to merge 3 commits into master
Conversation

@compilade (Collaborator) commented Sep 10, 2024

Follow-up from #9117 (comment).

This isolates handling of generated tensors like rope_freqs for Llama3, Phi-3 and Exaone. This should also fix --vocab-only conversion for Phi-3-128k and Phi-3.5 (which previously generated invalid GGUF files because they included a non-zero tensor count while not including any tensor data).

Note that this will also be relevant for MiniCPM3 (#9322), which reuses the misbehaving Phi-3 rope tensor insertion.
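
For context, here is a minimal sketch of the isolation idea (class and method names are assumptions, not the PR's exact code):

```python
# Minimal sketch: generated tensors get their own hook, and the hook
# is simply never invoked in --vocab-only mode, so the tensor count
# written to the GGUF header matches the (empty) tensor data section.
from typing import Iterable, Tuple

import numpy as np


class Model:
    def __init__(self, hparams: dict, vocab_only: bool = False):
        self.hparams = hparams
        self.vocab_only = vocab_only
        self.tensors: list[Tuple[str, np.ndarray]] = []

    def generate_extra_tensors(self) -> Iterable[Tuple[str, np.ndarray]]:
        # Most architectures generate nothing.
        return ()

    def prepare_tensors(self) -> None:
        if self.vocab_only:
            return  # no tensors at all: count stays consistent with the data
        self.tensors.extend(self.generate_extra_tensors())
        # ... checkpoint tensors would be appended here ...


class LlamaModel(Model):
    def generate_extra_tensors(self) -> Iterable[Tuple[str, np.ndarray]]:
        # rope_freqs is computed from hyperparameters rather than read
        # from the checkpoint, so it is emitted from this hook instead
        # of the per-checkpoint tensor loop.
        dim = self.hparams["rope_dim"]
        base = self.hparams.get("rope_theta", 10000.0)
        freqs = 1.0 / (base ** (np.arange(0, dim, 2, dtype=np.float32) / dim))
        yield ("rope_freqs.weight", freqs)
```

With this shape, --vocab-only conversion never reaches the generation hook, which is what makes the resulting vocab-only GGUF files valid again.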

TODO

  • Test Phi-3-mini-128k LoRA
  • Test Phi-3-mini-128k vocab-only validity (with llama-tokenize too)
  • Test Phi-3-mini-128k conversion
  • Test MiniCPM3 conversion
  • Test MiniCPM3 vocab-only validity
  • Test Llama-3.1 LoRA
  • Test Llama-3.1 conversion
    • includes rope_freqs.weight, and has the same checksum as a previous conversion I did a while ago with the same checkout of the upstream model
  • Test Llama-3.1 vocab-only (also with llama-tokenize)

This should also fix vocab-only conversion for Phi-3.
@compilade added the refactoring, bugfix, and python labels Sep 10, 2024
@compilade added the Review Complexity : Low label Sep 10, 2024
@ngxson (Collaborator) left a comment

LGTM. Thanks for the implementation!

@ngxson (Collaborator) commented Sep 10, 2024

Btw, you can use scripts/test-lora-conversion-inference.sh. It performs an end-to-end conversion and inference test for LoRA with the gemma2, phi3, and llama architectures (note: these are toy models, so they don't have rope_freqs).

@ThiloteE commented

Related to issue #6849 (Support for Phi-3 models).

@bioinformatist mentioned this pull request Sep 12, 2024
@ggerganov (Owner) commented

Should we merge this, or wait for the rest of the tests in OP to be confirmed?

@ngxson (Collaborator) commented Sep 16, 2024

I ran the test locally and can confirm that it passes. Let's wait for final confirmation from @compilade to merge this.

@compilade (Collaborator, Author) commented Sep 16, 2024

> Should we merge this, or wait for the rest of the tests in OP to be confirmed?

Since #9322 was merged, MiniCPM3's conversion also has to be updated before merging this. I'll update it today.

MiniCPM3's tokenizer is treated as a SentencePiece tokenizer to avoid having to run its custom Python code, which mixes tokenization in the same file as tool calls.

gguf-py : add long and short RoPE factors to tensor mappings

Empty, but the key names are used to populate the mappings.
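
To illustrate the commit body above, here is a simplified sketch (assumed structures, not gguf-py's actual classes) of how an empty mapping entry still registers a key name:

```python
# Simplified sketch: each GGUF tensor key maps to a tuple of candidate
# source names from the original checkpoint. The RoPE-factor entries
# are empty tuples because these tensors are generated at conversion
# time, yet declaring them still makes their GGUF-side names known.
TENSOR_MAPPINGS: dict[str, tuple[str, ...]] = {
    "token_embd":         ("model.embed_tokens", "transformer.wte"),
    "rope_factors_long":  (),  # generated, never read from a checkpoint
    "rope_factors_short": (),  # generated, never read from a checkpoint
}


def map_source_name(source_name: str) -> str | None:
    """Map a checkpoint tensor name to its GGUF name, if one matches."""
    for gguf_name, sources in TENSOR_MAPPINGS.items():
        if source_name in sources:
            return gguf_name
    return None
```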
@bioinformatist commented Sep 18, 2024

@compilade Hey bro, when I try to convert MiniCPM3 to .gguf format, it says:

The repository for /models contains custom code which must be executed to correctly load the model. You can inspect the repository content at https://hf.co//models.
You can avoid this prompt in future by passing the argument `trust_remote_code=True`.

Will it be implemented by this PR? Thanks!

@compilade (Collaborator, Author) commented

> @compilade Hey bro, when I try to convert MiniCPM3 to .gguf format, it says:
>
> The repository for /models contains custom code which must be executed to correctly load the model. You can inspect the repository content at https://hf.co//models.
> You can avoid this prompt in future by passing the argument `trust_remote_code=True`.
>
> Will it be implemented by this PR? Thanks!

@bioinformatist

Yes, actually, in e83d270 I've changed how MiniCPM3's tokenizer is loaded precisely to avoid that custom code. It uses SentencePiece directly instead. I think it results in the same model files, but I haven't tested that yet because I can't really run the custom tokenization code: it depends on datamodel_code_generator, which is not in Nixpkgs.

That was a single-line change in set_vocab.
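
For anyone curious, a hedged sketch of that kind of change (the helper name and file layout here are illustrative, not the convert script's actual code):

```python
# Illustrative only: loading tokenizer.model with SentencePiece
# directly never imports the repository's custom tokenizer code,
# so transformers' trust_remote_code prompt is never triggered.
from pathlib import Path

from sentencepiece import SentencePieceProcessor


def load_spm_vocab(model_dir: str) -> SentencePieceProcessor:
    sp = SentencePieceProcessor()
    sp.LoadFromFile(str(Path(model_dir) / "tokenizer.model"))
    return sp
```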

@bioinformatist commented
Got it; we have been moving to NixOS (especially for production use) these days. Hope everything goes well! ❤️

Labels
bugfix · python · refactoring · Review Complexity : Low