Commit

docs: Add cuda 12.5 to README.md (updated).
In the Chat Completion section, around the fourth line, the README reads: 'The model will will format the messages into a single prompt using the following order of precedence:'...
The word 'will' appeared twice. This update fixes that.
Victoran0 authored Oct 9, 2024
1 parent 7c4aead commit 5883f5f
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion README.md
@@ -337,7 +337,7 @@ The high-level API also provides a simple interface for chat completion.
Chat completion requires that the model knows how to format the messages into a single prompt.
The `Llama` class does this using pre-registered chat formats (ie. `chatml`, `llama-2`, `gemma`, etc) or by providing a custom chat handler object.

-The model will will format the messages into a single prompt using the following order of precedence:
+The model will format the messages into a single prompt using the following order of precedence:
- Use the `chat_handler` if provided
- Use the `chat_format` if provided
- Use the `tokenizer.chat_template` from the `gguf` model's metadata (should work for most new models, older models may not have this)
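The order of precedence in the hunk above can be sketched in plain Python. This is an illustration only, not the library's implementation: `resolve_chat_source` and the `gguf_metadata` dict are hypothetical names, and because the hunk is truncated, any fallback beyond the three listed options is left unresolved here.

```python
from typing import Any, Callable, Dict, Optional, Tuple


def resolve_chat_source(
    chat_handler: Optional[Callable[..., Any]] = None,
    chat_format: Optional[str] = None,
    gguf_metadata: Optional[Dict[str, str]] = None,
) -> Tuple[str, Any]:
    """Hypothetical helper: return (source, value) for the first option available.

    Mirrors the three bullets shown in the README hunk, in order.
    """
    # 1. An explicit chat handler takes priority over everything else.
    if chat_handler is not None:
        return ("chat_handler", chat_handler)
    # 2. Otherwise use a named, pre-registered chat format if one was given.
    if chat_format is not None:
        return ("chat_format", chat_format)
    # 3. Otherwise look for a template in the model's metadata
    #    (older models may not have this key).
    template = (gguf_metadata or {}).get("tokenizer.chat_template")
    if template is not None:
        return ("tokenizer.chat_template", template)
    # The hunk is cut off here, so no further fallback is assumed.
    return ("unresolved", None)


if __name__ == "__main__":
    metadata = {"tokenizer.chat_template": "<template text>"}
    print(resolve_chat_source(chat_format="chatml", gguf_metadata=metadata))
    print(resolve_chat_source(gguf_metadata=metadata))
```

In this sketch an explicit `chat_format` wins over the metadata template, and the template is consulted only when neither a handler nor a format is supplied.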
