
Fix Cuda offloading in llava #3621

Merged: 1 commit into master on Oct 14, 2023
Conversation

@monatis (Collaborator) commented on Oct 14, 2023

closes #3616

I simply forgot to set n_gpu_layers when loading the model. This should fix it.
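
For reference, here is a minimal sketch of the kind of one-line fix described, using llama.cpp's public model-loading API of the time; the `load_model` helper is hypothetical, and the exact spot in the llava example may differ from the PR diff:

```cpp
// Sketch (not the PR diff verbatim): forward the value parsed from -ngl
// into the model parameters before loading, so layers actually get offloaded.
#include "common.h"   // gpt_params, which carries n_gpu_layers from the -ngl flag
#include "llama.h"

// Hypothetical helper illustrating the fix.
static llama_model * load_model(const gpt_params & params) {
    llama_model_params model_params = llama_model_default_params();

    // The bug: this assignment was missing, so n_gpu_layers kept its
    // default of 0 and the whole model stayed on the CPU.
    model_params.n_gpu_layers = params.n_gpu_layers;

    return llama_load_model_from_file(params.model.c_str(), model_params);
}
```

With that in place, invoking the example as usual (e.g. `./llava -m model.gguf --mmproj mmproj.gguf --image img.jpg -ngl 35`) should offload layers just like the other examples do.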

@KerfuffleV2 (Collaborator) left a comment

This looks pretty straightforward. Tested, and it seems to work (even with ROCm). The text generates much faster when offloading, as expected.

"The image features a white fox sitting on the ground, with its mouth wide open, possibly yawning or growling. The fox appears to be in a forest setting, surrounded by grass and trees. The scene is depicted in a black and white style, giving it a classic and timeless feel." About my profile picture. It's supposed to be a wolf cub, but still pretty impressive!

@KerfuffleV2 merged commit 11dc109 into master on Oct 14, 2023
35 of 40 checks passed
joelkuiper added a commit to vortext/llama.cpp that referenced this pull request on Oct 19, 2023:
* 'master' of github.com:ggerganov/llama.cpp:
  fix embeddings when using CUDA (ggerganov#3657)
  llama : avoid fprintf in favor of LLAMA_LOG (ggerganov#3538)
  readme : update hot-topics & models, detail windows release in usage (ggerganov#3615)
  CLBlast: Fix temporary buffer size for f16 conversion (wsize)
  train-text-from-scratch : fix assert failure in ggml-alloc (ggerganov#3618)
  editorconfig : remove trailing spaces
  server : documentation of JSON return value of /completion endpoint (ggerganov#3632)
  save-load-state : fix example + add ci test (ggerganov#3655)
  readme : add Aquila2 links (ggerganov#3610)
  tokenizer : special token handling (ggerganov#3538)
  k-quants : fix quantization ranges (ggerganov#3646)
  llava : fix tokenization to not add bos between image embeddings and user prompt (ggerganov#3645)
  MPT : support GQA for replit-code-v1.5 (ggerganov#3627)
  Honor -ngl option for Cuda offloading in llava (ggerganov#3621)
Successfully merging this pull request may close the following issue:

LLaVA does not offload layers to GPU (#3616)