
Integrate OpenAI API Structured Generation #1142

Merged
2 commits merged into dottxt-ai:main on Sep 17, 2024

Conversation


@lapp0 lapp0 commented Sep 11, 2024

Rendered Docs

Improvements

  • Enable use of OpenAI's structured generation endpoint via generate.choice and generate.json
  • Enable use with arbitrary OpenAI-compatible endpoints
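For context, a fixed choice set can be expressed as a JSON schema whose string field is enum-constrained, which is one way generate.choice can ride on the json_schema endpoint. This is an illustrative sketch only, not necessarily the PR's exact internal representation:

```python
import json

# Illustrative: constrain the model's output to one of a fixed set of
# choices by wrapping them in an enum-constrained JSON schema.
choices = ["Enron", "FTX", "Silicon Valley Bank"]
choice_schema = {
    "type": "object",
    "properties": {"result": {"type": "string", "enum": choices}},
    "required": ["result"],
    "additionalProperties": False,  # required by OpenAI strict mode
}
print(json.dumps(choice_schema["properties"]["result"]["enum"]))
# ["Enron", "FTX", "Silicon Valley Bank"]
```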

The tokenizer argument was removed as a requirement for models.openai because the logit bias filter no longer needs to be constructed.

This has the side effect of fixing a number of issues users encountered when using arbitrary OpenAI-compatible endpoints.

Verified by the author of #1135 and by a brief smoke test at the bottom of this description.

Changes

  • Update models.openai to support OpenAI's standard response_format with json_schema
  • Update outlines.generate.json and outlines.generate.choice to use json_schema for models.openai generation
  • Update docs: remove references to gpt-3.5 and gpt-4
  • Remove the tiktoken requirement from models.openai, and remove dead code from the old generate.choice handling
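For reference, OpenAI's structured-outputs API expects the schema wrapped in a response_format envelope of shape {"type": "json_schema", "json_schema": {"name", "strict", "schema"}}. A minimal sketch of building that envelope from the Person schema used in the smoke tests below (the build_response_format helper is illustrative, not part of outlines):

```python
import json

# The Person JSON schema from the smoke tests, as a plain dict.
person_schema = json.loads(
    '{"additionalProperties": false, "properties": {'
    '"first_name": {"title": "First Name", "type": "string"}, '
    '"last_name": {"title": "Last Name", "type": "string"}, '
    '"age": {"title": "Age", "type": "integer"}}, '
    '"required": ["first_name", "last_name", "age"], '
    '"title": "Person", "type": "object"}'
)

def build_response_format(schema: dict) -> dict:
    # Envelope shape defined by OpenAI's structured-outputs API.
    return {
        "type": "json_schema",
        "json_schema": {
            "name": schema.get("title", "response"),
            "strict": True,
            "schema": schema,
        },
    }

response_format = build_response_format(person_schema)
print(response_format["json_schema"]["name"])  # Person
```

Note that strict mode requires "additionalProperties": false on every object in the schema, which is why the Pydantic models below set extra='forbid'.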

Fix main

Unrelated to this PR, but pre-commit was failing due to home.html and main.html.

Smoke Testing

Testing via CI with real API keys isn't feasible. Mocked-endpoint tests are implemented, but they don't provide strong guarantees.

Here are smoke tests that I ran to verify behavior:

OpenAI Structured Generation

import outlines.models as models
from outlines import generate
from pydantic import BaseModel, ConfigDict


model = models.openai("gpt-4o-mini")

# smoke test stop_at, truncating the close square bracket
generator = generate.text(model)
print(generator("Produce an array of the first 10 primes, starting with [ ", stop_at=["]"]))
# 'Here is an array of the first 10 prime numbers:\n\n```plaintext\n[2, 3, 5, 7, 11, 13, 17, 19, 23, 29'


# smoke test json schema model creation
model_for_json_schema = model.new_with_replacements(response_format={"type": "json_schema"})
print(model.config)
# OpenAIConfig(model='gpt-4o-mini', frequency_penalty=0, logit_bias={}, max_tokens=None, n=1, presence_penalty=0, response_format=None, seed=None, stop=None, temperature=1.0, top_p=1, user='')
print(model_for_json_schema.config)
# OpenAIConfig(model='gpt-4o-mini', frequency_penalty=0, logit_bias={}, max_tokens=None, n=1, presence_penalty=0, response_format={'type': 'json_schema'}, seed=None, stop=None, temperature=1.0, top_p=1, user='')


# smoke test pydantic usage
class Person(BaseModel):
    model_config = ConfigDict(extra='forbid')  # required for openai
    first_name: str
    last_name: str
    age: int

generator = generate.json(model, Person)
print(generator("dict for the current indian prime minister on january 1st 2023"))
# Person(first_name='Narendra', last_name='Modi', age=72)


# smoke test json string
person_schema = '{"additionalProperties": false, "properties": {"first_name": {"title": "First Name", "type": "string"}, "last_name": {"title": "Last Name", "type": "string"}, "age": {"title": "Age", "type": "integer"}}, "required": ["first_name", "last_name", "age"], "title": "Person", "type": "object"}'
generator = generate.json(model, person_schema)
print(generator("chairman of the IMF in 2023"))
# {'first_name': 'Kristalina', 'last_name': 'Georgieva', 'age': 71}


generator = generate.choice(model, ["Enron", "FTX", "Silicon Valley Bank"])
print(generator("What will be the biggest company in the world in 2025?"))
# FTX

generate.text() with any OpenAI-compatible endpoint

import outlines.models as models
from outlines import generate


model = models.openai(
    "meta-llama/llama-3.1-8b-instruct",
    base_url="https://openrouter.ai/api/v1",
    api_key="hunter12",
)

generator = generate.text(model)
print(generator("hi"))
# "Hello! How are you doing? It's great to meet you. Let me know if there's anything I can help with."

python3 examples/react.py

Updated this to use new interface. Works as expected.

@lapp0 lapp0 force-pushed the openai-structured-generation branch 3 times, most recently from 6fa0c87 to 1e2e389 on September 14, 2024 19:11
@lapp0 lapp0 marked this pull request as ready for review September 14, 2024 19:14
@lapp0 lapp0 force-pushed the openai-structured-generation branch 2 times, most recently from 78662ac to 3893746 on September 15, 2024 16:59
@lapp0 lapp0 marked this pull request as draft September 15, 2024 17:02
@lapp0 lapp0 force-pushed the openai-structured-generation branch 4 times, most recently from 51dd2a7 to a2879e1 on September 15, 2024 19:08
@@ -313,81 +219,6 @@ async def call_api(prompt, system_prompt, config):
return results, usage["prompt_tokens"], usage["completion_tokens"]
lapp0 (Collaborator, Author) commented:

We should consider removing the @cache decorator in

    @cache()
    async def call_api(prompt, system_prompt, config):
        responses = await client.chat.completions.create(
            messages=system_message + user_message,
            **asdict(config),  # type: ignore
        )
        return responses.model_dump()

Users might call choice multiple times to get multiple samples, and can manage their own cache if needed.
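If the decorator is removed, callers who still want deduplication can memoize the coroutine themselves. A minimal sketch, where the async_memoize helper and the stubbed call_api are illustrative and not part of outlines:

```python
import asyncio
import functools
import json

def async_memoize(fn):
    """Cache coroutine results keyed by their JSON-serialized arguments."""
    cache = {}

    @functools.wraps(fn)
    async def wrapper(*args, **kwargs):
        key = json.dumps([args, kwargs], sort_keys=True, default=str)
        if key not in cache:
            cache[key] = await fn(*args, **kwargs)
        return cache[key]

    wrapper._cache = cache  # exposed so callers can clear it between samples
    return wrapper

call_count = 0

@async_memoize
async def call_api(prompt, system_prompt, config):
    # Stub standing in for the real OpenAI client call.
    global call_count
    call_count += 1
    return {"choices": [{"message": {"content": f"echo: {prompt}"}}]}

async def main():
    a = await call_api("hi", "sys", {"temperature": 1.0})
    b = await call_api("hi", "sys", {"temperature": 1.0})  # served from cache
    return a, b

a, b = asyncio.run(main())
print(call_count)  # 1: the second call hit the cache
```

To get fresh samples instead, callers simply skip the decorator (or clear wrapper._cache), which is exactly the flexibility the comment above argues for.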

@lapp0 lapp0 marked this pull request as ready for review September 15, 2024 19:20
@lapp0 lapp0 force-pushed the openai-structured-generation branch from 7f01601 to 08010a5 on September 15, 2024 20:39
@rlouf rlouf force-pushed the openai-structured-generation branch from 08010a5 to b2d5473 on September 17, 2024 13:24
@rlouf rlouf merged commit 289ef5d into dottxt-ai:main Sep 17, 2024
6 checks passed