
GGUF support #441

Merged
merged 16 commits into main on Sep 7, 2023

Conversation

Contributor

@BruceMacD commented Aug 29, 2023

This change adds support for running GGUF models, which are currently in beta in llama.cpp. We will continue to run GGML models, and the transition will be seamless to users.

  • Adds a llama.cpp mainline submodule which runs GGUF models
  • Dynamically select the right runner for the model type
  • Moved some code to different files
./ollama run gguf-codellama hello world

This is your first interaction with me. I am a bot, and I am created by you. Please ask me any questions you would like answered.

As mentioned in #423

case ModelFamilyLlama:
	switch mf.Name() {
	case "gguf":
		opts.NumGQA = 0 // TODO: remove this when llama.cpp runners differ enough to need separate newLlama functions
Contributor Author


This is the one difference in the interface between the current mainline llama.cpp and the ggml runner. I'm keeping them on the same running logic in the hope that the interface will stay roughly the same. We will see if that changes.
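In other words, the divergence is absorbed at the options layer rather than in a second constructor. A hedged sketch of that idea, where `RunnerOpts` and `runnerOpts` are illustrative stand-ins, not ollama's actual types:

```go
package main

// RunnerOpts is an illustrative stand-in for the runner option struct.
type RunnerOpts struct {
	NumGQA int
}

// runnerOpts patches format-specific quirks onto a shared option set,
// so both runners can go through the same newLlama-style constructor.
func runnerOpts(format string, opts RunnerOpts) RunnerOpts {
	if format == "gguf" {
		// GGUF metadata carries the attention head counts, so mainline
		// llama.cpp does not need an explicit GQA override here.
		opts.NumGQA = 0
	}
	return opts
}
```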

@BruceMacD BruceMacD mentioned this pull request Aug 30, 2023
Base automatically changed from brucemacd/server-shell to main August 30, 2023 20:35
Outdated review threads on llm/ggml.go and llm/model.go were resolved.
Contributor

@pdevine left a comment


Looks good... it was slightly confusing trying to figure out the differences between GGUF and GGML. I'm guessing there aren't a lot of differences between the two though.

)

type ModelFamily string

const ModelFamilyUnknown ModelFamily = "unknown"
Contributor


Are these (and the constants below here) generic, or specific to GGML?

Contributor Author


}

c.version = version
return nil
return nil, nil
}

type containerGGJT struct {
Contributor


I just went down the rabbit hole on the GGJT controversy. Wow.



spill the tea?

A review thread on llm/llama.go was resolved.
@BruceMacD force-pushed the brucemacd/gguf branch 2 times, most recently from beca647 to a6f17b1 on September 5, 2023 at 20:27
@f0rodo commented Sep 6, 2023

🚀

@jmorganca
Member

Overall looks great! Left a small comment on the cmake generate commands

@BruceMacD BruceMacD merged commit 09dd2ae into main Sep 7, 2023
@BruceMacD BruceMacD deleted the brucemacd/gguf branch September 7, 2023 17:55
5 participants