From 5883f5fe1064104f855b5770ddad675745611679 Mon Sep 17 00:00:00 2001
From: Victor Oluwadare <111367022+Victoran0@users.noreply.github.com>
Date: Wed, 9 Oct 2024 21:13:21 +0100
Subject: [PATCH] docs: Add cuda 12.5 to README.md (updated)

Under the Chat Completion section, about the fourth line under it, we have:
'The model will will format the messages into a single prompt using the
following order of precedence:'. 'will' appeared twice; this update fixes
that.
---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index dbaec5077..a4744564e 100644
--- a/README.md
+++ b/README.md
@@ -337,7 +337,7 @@ The high-level API also provides a simple interface for chat completion.
 Chat completion requires that the model knows how to format the messages into a single prompt.
 The `Llama` class does this using pre-registered chat formats (ie. `chatml`, `llama-2`, `gemma`, etc) or by providing a custom chat handler object.
 
-The model will will format the messages into a single prompt using the following order of precedence:
+The model will format the messages into a single prompt using the following order of precedence:
 - Use the `chat_handler` if provided
 - Use the `chat_format` if provided
 - Use the `tokenizer.chat_template` from the `gguf` model's metadata (should work for most new models, older models may not have this)
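For readers without the surrounding README open, the precedence the corrected sentence introduces is exercised through the high-level chat-completion API. Below is a minimal sketch closely following the README's own chat completion example; the model path is a placeholder:

```python
from llama_cpp import Llama

# chat_format selects a pre-registered format ("chatml", "llama-2", "gemma", ...).
# Per the precedence in the patched section, omitting both chat_handler and
# chat_format falls back to the tokenizer.chat_template in the gguf metadata.
llm = Llama(
    model_path="path/to/llama-2/llama-model.gguf",  # placeholder path
    chat_format="llama-2",
)
response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are an assistant who perfectly describes images."},
        {"role": "user", "content": "Describe this image in detail please."},
    ]
)
print(response["choices"][0]["message"]["content"])
```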