
Integrate OpenAI API Structured Generation #1142

Merged
2 commits merged into dottxt-ai:main on Sep 17, 2024

Conversation


@lapp0 lapp0 commented Sep 11, 2024

Rendered Docs

Improvements

  • Enable use of OpenAI's structured generation endpoint via generate.choice and generate.json
  • Enable use with arbitrary OpenAI-compatible endpoints
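For context, a fixed choice set can be expressed as a JSON schema whose string field is enum-constrained, which is one way generate.choice can ride on the json_schema endpoint. This is an illustrative sketch only, not necessarily the PR's exact internal representation:

```python
import json

# Illustrative: constrain the model's output to one of a fixed set of
# choices by wrapping them in an enum-constrained JSON schema.
choices = ["Enron", "FTX", "Silicon Valley Bank"]
choice_schema = {
    "type": "object",
    "properties": {"result": {"type": "string", "enum": choices}},
    "required": ["result"],
    "additionalProperties": False,  # required by OpenAI strict mode
}
print(json.dumps(choice_schema["properties"]["result"]["enum"]))
# ["Enron", "FTX", "Silicon Valley Bank"]
```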

The tokenizer argument was removed as a requirement for models.openai because the logit bias filter no longer needs to be constructed.

This has the side effect of fixing a number of issues users encountered when using arbitrary OpenAI-compatible endpoints.

Verified by the author of #1135 and by a brief smoke test at the bottom of this description.

Changes

  • Update models.openai to support OpenAI's standard response_format with json_schema
  • Update outlines.generate.json and outlines.generate.choice to use json_schema for models.openai generation
  • Update docs: remove references to gpt-3.5 and gpt-4
  • Remove the tiktoken requirement from models.openai, and remove dead code from the old generate.choice handling
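For reference, OpenAI's structured-outputs API expects the schema wrapped in a response_format envelope of shape {"type": "json_schema", "json_schema": {"name", "strict", "schema"}}. A minimal sketch of building that envelope from the Person schema used in the smoke tests below (the build_response_format helper is illustrative, not part of outlines):

```python
import json

# The Person JSON schema from the smoke tests, as a plain dict.
person_schema = json.loads(
    '{"additionalProperties": false, "properties": {'
    '"first_name": {"title": "First Name", "type": "string"}, '
    '"last_name": {"title": "Last Name", "type": "string"}, '
    '"age": {"title": "Age", "type": "integer"}}, '
    '"required": ["first_name", "last_name", "age"], '
    '"title": "Person", "type": "object"}'
)

def build_response_format(schema: dict) -> dict:
    # Envelope shape defined by OpenAI's structured-outputs API.
    return {
        "type": "json_schema",
        "json_schema": {
            "name": schema.get("title", "response"),
            "strict": True,
            "schema": schema,
        },
    }

response_format = build_response_format(person_schema)
print(response_format["json_schema"]["name"])  # Person
```

Note that strict mode requires "additionalProperties": false on every object in the schema, which is why the Pydantic models below set extra='forbid'.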

Fix main

Unrelated to this PR, but pre-commit was failing due to home.html and main.html.

Smoke Testing

Testing via CI with real API keys isn't feasible. Mocked-endpoint tests are implemented, but they don't provide strong guarantees.

Here are smoke tests that I ran to verify behavior:

OpenAI Structured Generation

import outlines.models as models
from outlines import generate
from pydantic import BaseModel, ConfigDict


model = models.openai("gpt-4o-mini")

# smoke test stop_at, truncating the close square bracket
generator = generate.text(model)
print(generator("Produce an array of the first 10 primes, starting with [ ", stop_at=["]"]))
# 'Here is an array of the first 10 prime numbers:\n\n```plaintext\n[2, 3, 5, 7, 11, 13, 17, 19, 23, 29'


# smoke test json schema model creation
model_for_json_schema = model.new_with_replacements(response_format={"type": "json_schema"})
print(model.config)
# OpenAIConfig(model='gpt-4o-mini', frequency_penalty=0, logit_bias={}, max_tokens=None, n=1, presence_penalty=0, response_format=None, seed=None, stop=None, temperature=1.0, top_p=1, user='')
print(model_for_json_schema.config)
# OpenAIConfig(model='gpt-4o-mini', frequency_penalty=0, logit_bias={}, max_tokens=None, n=1, presence_penalty=0, response_format={'type': 'json_schema'}, seed=None, stop=None, temperature=1.0, top_p=1, user='')


# smoke test pydantic usage
class Person(BaseModel):
    model_config = ConfigDict(extra='forbid')  # required for openai
    first_name: str
    last_name: str
    age: int

generator = generate.json(model, Person)
print(generator("dict for the current indian prime minister on january 1st 2023"))
# Person(first_name='Narendra', last_name='Modi', age=72)


# smoke test json string
person_schema = '{"additionalProperties": false, "properties": {"first_name": {"title": "First Name", "type": "string"}, "last_name": {"title": "Last Name", "type": "string"}, "age": {"title": "Age", "type": "integer"}}, "required": ["first_name", "last_name", "age"], "title": "Person", "type": "object"}'
generator = generate.json(model, person_schema)
print(generator("chairman of the IMF in 2023"))
# {'first_name': 'Kristalina', 'last_name': 'Georgieva', 'age': 71}


generator = generate.choice(model, ["Enron", "FTX", "Silicon Valley Bank"])
print(generator("What will be the biggest company in the world in 2025?"))
# FTX

generate.text() with any OpenAI-compatible endpoint

import outlines.models as models
from outlines import generate


model = models.openai(
    "meta-llama/llama-3.1-8b-instruct",
    base_url="https://openrouter.ai/api/v1",
    api_key="hunter12",
)

generator = generate.text(model)
print(generator("hi"))
# "Hello! How are you doing? It's great to meet you. Let me know if there's anything I can help with."

python3 examples/react.py

Updated this to use new interface. Works as expected.

@lapp0 lapp0 force-pushed the openai-structured-generation branch 3 times, most recently from 6fa0c87 to 1e2e389 on September 14, 2024 19:11
@lapp0 lapp0 marked this pull request as ready for review September 14, 2024 19:14
@lapp0 lapp0 force-pushed the openai-structured-generation branch 2 times, most recently from 78662ac to 3893746 on September 15, 2024 16:59
@lapp0 lapp0 marked this pull request as draft September 15, 2024 17:02
@lapp0 lapp0 force-pushed the openai-structured-generation branch 4 times, most recently from 51dd2a7 to a2879e1 on September 15, 2024 19:08
@@ -313,81 +219,6 @@ async def call_api(prompt, system_prompt, config):
return results, usage["prompt_tokens"], usage["completion_tokens"]
lapp0 (Collaborator, Author) commented:

We should consider removing the @cache decorator in

    @cache()
    async def call_api(prompt, system_prompt, config):
        responses = await client.chat.completions.create(
            messages=system_message + user_message,
            **asdict(config),  # type: ignore
        )
        return responses.model_dump()

Users might call choice multiple times to get multiple samples, and can manage their own cache if needed.
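If the decorator is removed, callers who still want deduplication can memoize the coroutine themselves. A minimal sketch, where the async_memoize helper and the stubbed call_api are illustrative and not part of outlines:

```python
import asyncio
import functools
import json

def async_memoize(fn):
    """Cache coroutine results keyed by their JSON-serialized arguments."""
    cache = {}

    @functools.wraps(fn)
    async def wrapper(*args, **kwargs):
        key = json.dumps([args, kwargs], sort_keys=True, default=str)
        if key not in cache:
            cache[key] = await fn(*args, **kwargs)
        return cache[key]

    wrapper._cache = cache  # exposed so callers can clear it between samples
    return wrapper

call_count = 0

@async_memoize
async def call_api(prompt, system_prompt, config):
    # Stub standing in for the real OpenAI client call.
    global call_count
    call_count += 1
    return {"choices": [{"message": {"content": f"echo: {prompt}"}}]}

async def main():
    a = await call_api("hi", "sys", {"temperature": 1.0})
    b = await call_api("hi", "sys", {"temperature": 1.0})  # served from cache
    return a, b

a, b = asyncio.run(main())
print(call_count)  # 1: the second call hit the cache
```

To get fresh samples instead, callers simply skip the decorator (or clear wrapper._cache), which is exactly the flexibility the comment above argues for.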

@lapp0 lapp0 marked this pull request as ready for review September 15, 2024 19:20
@lapp0 lapp0 force-pushed the openai-structured-generation branch from 7f01601 to 08010a5 on September 15, 2024 20:39
@rlouf rlouf force-pushed the openai-structured-generation branch from 08010a5 to b2d5473 on September 17, 2024 13:24
@rlouf rlouf merged commit 289ef5d into dottxt-ai:main Sep 17, 2024
6 checks passed