
GGUF support #441

Merged
merged 16 commits into main on Sep 7, 2023

Conversation

Contributor

@BruceMacD commented Aug 29, 2023

This change adds support for running GGUF models, which are currently in beta in llama.cpp. We will continue to run GGML models, and the transition will be seamless to users.

  • Adds a llama.cpp mainline submodule which runs GGUF models
  • Dynamically select the right runner for the model type
  • Moved some code to different files
./ollama run gguf-codellama hello world

This is your first interaction with me. I am a bot, and I am created by you. Please ask me any questions you would like answered.

As mentioned in #423

case ModelFamilyLlama:
	switch mf.Name() {
	case "gguf":
		opts.NumGQA = 0 // TODO: remove this when llama.cpp runners differ enough to need separate newLlama functions
Contributor Author


This is the one difference in the interface between the current mainline llama.cpp and the ggml runner. I'm keeping them on the same running logic in the hope that the interface will stay roughly the same. We will see if that changes.
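In other words, the divergence is absorbed at the options layer rather than in a second constructor. A hedged sketch of that idea, where `RunnerOpts` and `runnerOpts` are illustrative stand-ins, not ollama's actual types:

```go
package main

// RunnerOpts is an illustrative stand-in for the runner option struct.
type RunnerOpts struct {
	NumGQA int
}

// runnerOpts patches format-specific quirks onto a shared option set,
// so both runners can go through the same newLlama-style constructor.
func runnerOpts(format string, opts RunnerOpts) RunnerOpts {
	if format == "gguf" {
		// GGUF metadata carries the attention head counts, so mainline
		// llama.cpp does not need an explicit GQA override here.
		opts.NumGQA = 0
	}
	return opts
}
```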

@BruceMacD BruceMacD mentioned this pull request Aug 30, 2023
Base automatically changed from brucemacd/server-shell to main August 30, 2023 20:35
Outdated review threads on llm/ggml.go and llm/model.go were resolved.
Contributor

@pdevine left a comment


Looks good... it was slightly confusing trying to figure out the differences between GGUF and GGML. I'm guessing there aren't a lot of differences between the two though.

)

type ModelFamily string

const ModelFamilyUnknown ModelFamily = "unknown"
Contributor


Are these (and the constants below here) generic, or specific to GGML?

Contributor Author


}

c.version = version
return nil
return nil, nil
}

type containerGGJT struct {
Contributor


I just went down the rabbit hole on the GGJT controversy. Wow.



spill the tea?

A review thread on llm/llama.go was resolved.
@BruceMacD force-pushed the brucemacd/gguf branch 2 times, most recently from beca647 to a6f17b1 on September 5, 2023 at 20:27
@f0rodo commented Sep 6, 2023

🚀

@jmorganca
Member

Overall looks great! Left a small comment on the cmake generate commands

@BruceMacD BruceMacD merged commit 09dd2ae into main Sep 7, 2023
@BruceMacD BruceMacD deleted the brucemacd/gguf branch September 7, 2023 17:55
5 participants