
Is it possible to run the harness against API hosted models? #148

Open

pnewhook opened this issue Oct 16, 2023 · 3 comments

Comments

@pnewhook

I have a model that's only available through a RESTful API, and I need to get some benchmarks for it. I'd like to run the MultiPL-E benchmarks in a few languages. Has any work gone into using bigcode-evaluation-harness to perform generation through an API instead of on the local machine?

For comparison, lm-evaluation-harness can run against commercial APIs, notably OpenAI's.
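
For what it's worth, one workaround I'm considering is generating completions through the API myself and only using the harness for evaluation. A minimal sketch, assuming a generic REST completions endpoint (the URL, auth header, and payload shape are placeholders for my API) and the harness's `--load_generations_path` option for evaluating pre-generated solutions:

```python
import json
import requests

API_URL = "https://example.com/v1/completions"  # placeholder endpoint
API_KEY = "..."  # placeholder key

def generate(prompt: str, n: int = 1, max_tokens: int = 512) -> list[str]:
    """Request n completions for one prompt. The payload and response
    shapes here are assumptions; adapt them to the real API schema."""
    resp = requests.post(
        API_URL,
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={"prompt": prompt, "n": n, "max_tokens": max_tokens},
        timeout=120,
    )
    resp.raise_for_status()
    return [choice["text"] for choice in resp.json()["choices"]]

# The harness expects a JSON list of lists: one inner list of
# candidate completions per problem.
prompts = ["def add(a, b):\n"]  # real MultiPL-E prompts would go here
generations = [generate(p) for p in prompts]

with open("generations.json", "w") as f:
    json.dump(generations, f)

# Then evaluate without loading a local model (flags per the repo
# README; double-check against your version):
#   python main.py --tasks multiple-py \
#       --load_generations_path generations.json --allow_code_execution
```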

@loubnabnl
Collaborator

loubnabnl commented Oct 20, 2023

Hello, we currently don't support external APIs, only generations with transformers. Feel free to open a PR if you have something in mind.

If you're interested in HumanEvalPack benchmarks and OpenAI models, there's a task that supports them (docs here): https://github.com/bigcode-project/bigcode-evaluation-harness/blob/main/lm_eval/tasks/humanevalpack_openai.py

@krrishdholakia

Is there a way I could 'fake' a local model and have it call a hosted API endpoint? @pnewhook @loubnabnl
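
Something along these lines is what I have in mind: a class that mimics the small slice of a local model's generation interface and forwards each call to an HTTP endpoint instead. Purely an illustrative sketch (the class name, endpoint, and payload are hypothetical), not something the harness would accept today:

```python
import requests

class APIBackedModel:
    """Hypothetical shim: exposes a generate()-style surface like a
    local model, but fetches completions from a hosted REST endpoint."""

    def __init__(self, endpoint: str, api_key: str):
        self.endpoint = endpoint
        self.headers = {"Authorization": f"Bearer {api_key}"}

    def generate(self, prompt: str, max_new_tokens: int = 256) -> str:
        # Payload and response shapes are assumptions; adapt to the API.
        resp = requests.post(
            self.endpoint,
            headers=self.headers,
            json={"prompt": prompt, "max_tokens": max_new_tokens},
            timeout=120,
        )
        resp.raise_for_status()
        return resp.json()["choices"][0]["text"]
```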

@loubnabnl
Collaborator

I don't think that's possible with the current setup, which loads models through transformers and assumes you have the model checkpoint.
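
Concretely, model loading goes through the standard transformers path, which needs an actual checkpoint (a local directory or a Hub model ID), roughly like this (illustrative, not the exact harness code):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# from_pretrained() requires checkpoint weights, so there is no
# hook here for swapping in an HTTP backend.
model = AutoModelForCausalLM.from_pretrained("bigcode/starcoder")
tokenizer = AutoTokenizer.from_pretrained("bigcode/starcoder")
```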
