Adds option for JSON schema optimization #863
base: main
There seems to be another potential bug here. Given the function

    def test_add(a: int, b: int | None = None):
        if b is None:
            return a
        return a + b

the function Perhaps I should raise this in a separate issue.
@eitanturok I don't think this is a bug, because we don't use the title field when building the FSM (or when generating outputs). Can you provide an example where this breaks something?
I'm using outlines to make my models better at function calling, and the current setup causes me some issues. At a high level, I take the generated schema and use it 1) for the system prompt and 2) to create a regex. I put the schema into the system prompt so the model knows which functions it has access to. But if the JSON schema does NOT contain the function's name, the model won't know how to call it. Here is an example:

    import json

    from outlines import generate
    from outlines.fsm.json_schema import build_regex_from_schema, get_schema_from_signature

    def test_add(a: int, b: int | None = None):
        if b is None:
            return a
        return a + b

    # `model` and `whitespace_pattern` are assumed to be defined earlier.
    schema_json = get_schema_from_signature(test_add)
    schema_str = json.dumps(schema_json).strip()
    schema_regex = build_regex_from_schema(schema_str, whitespace_pattern)
    system_prompt = f"You are an expert at function calling and have access to the following tools: {schema_str}. "
    system_prompt += "Please call one of these functions."
    generator = generate.regex(model, schema_regex)

If the function name is not included in the schema generated from
@eitanturok, we should raise this as a separate issue. I'm thinking of replacing this line in

    model = create_model("Arguments", **arguments)

with

    model = create_model(fn.__name__, **arguments)

or

    try:
        fn_name = fn.__name__
    except Exception as e:
        fn_name = "Arguments"
    model = create_model(fn_name, **arguments)

just to be safer. What do you think?
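As a side note on the fallback proposed above, here is a minimal sketch of the same idea using `getattr` with a default instead of a broad `try`/`except`. The helper name `schema_title_for` and the `Adder` class are hypothetical and exist only for illustration. They are not part of outlines or of this PR.

```python
def schema_title_for(fn, default="Arguments"):
    """Hypothetical helper: use the callable's name as the schema title,
    falling back to `default` when the callable has no `__name__`."""
    return getattr(fn, "__name__", default)


def add_numbers(a, b=None):
    if b is None:
        return a
    return a + b


class Adder:
    """A callable object; its instances do not expose `__name__`."""

    def __call__(self, a, b=None):
        return add_numbers(a, b)


title_for_function = schema_title_for(add_numbers)  # plain function: has __name__
title_for_instance = schema_title_for(Adder())      # instance: falls back to default
```

The resulting string could then be passed as the first argument to `create_model`, as in the snippets above.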
I was thinking the same thing. I'll raise this as a separate issue.
Raised the issue in #878. Future discussions should take place there.
Pydantic's `.model_json_schema()` and `get_schema_from_signature` don't actually make optional fields/arguments optional in the JSON schema. This forces the model to output the keys even when the values are `null` anyway, which slows down inference more the larger the schema is and the more optional fields there are.

For example, for this Pydantic class:

`.model_json_schema()` builds this schema:

`optimize_schema` in this PR reduces this to:

Likewise, `get_schema_from_signature` converts this function:

to this schema:

`optimize_schema` reduces this to:

I decided to add a flag, `enable_schema_optimization`, and set it to `False` by default, because it further restricts the support of the output distribution and thus might break models finetuned without this setting.
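
To make the idea concrete, here is a rough sketch of the kind of transformation described above. It is not the `optimize_schema` implementation from this PR. The function name `make_nullable_fields_optional` is hypothetical, and the input schema is hand-written for illustration rather than copied from Pydantic or `get_schema_from_signature` output. The sketch assumes nullable fields are expressed as an `anyOf` containing `{"type": "null"}`.

```python
import copy


def make_nullable_fields_optional(schema):
    """Hypothetical sketch: drop nullable properties from `required`
    and strip their `null` alternative. Returns a new dict."""
    out = copy.deepcopy(schema)
    required = list(out.get("required", []))
    for name, prop in out.get("properties", {}).items():
        alternatives = prop.get("anyOf")
        if not alternatives:
            continue
        non_null = [alt for alt in alternatives if alt.get("type") != "null"]
        if not non_null or len(non_null) == len(alternatives):
            continue  # either only `null`, or not nullable at all
        # The key may now be omitted instead of being emitted as null.
        if name in required:
            required.remove(name)
        prop.pop("anyOf")
        prop.pop("default", None)
        if len(non_null) == 1:
            prop.update(non_null[0])  # inline the single remaining type
        else:
            prop["anyOf"] = non_null
    if required:
        out["required"] = required
    else:
        out.pop("required", None)
    return out


# Hand-written illustrative schema (an assumption, not real library output).
schema = {
    "title": "Arguments",
    "type": "object",
    "properties": {
        "a": {"title": "A", "type": "integer"},
        "b": {
            "title": "B",
            "anyOf": [{"type": "integer"}, {"type": "null"}],
            "default": None,
        },
    },
    "required": ["a", "b"],
}

optimized = make_nullable_fields_optional(schema)
```

With a schema like this, the model no longer has to generate `"b": null`. The trade-off is the one noted above: a model finetuned to always emit every key may behave worse under the tighter constraint, which is why an opt-in flag is a reasonable default.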