WIP: Fix Various JSON-Schema Generation Bugs #88

Open
wants to merge 1 commit into
base: main

Conversation

lapp0
Owner

@lapp0 lapp0 commented Aug 31, 2024

Overview

Language models tend to repeat themselves. Combined with patterns that allow infinite-length fields, this repetition results in broken JSON Schema outputs.

This was previously addressed for the infinite-whitespace issue by making a safe whitespace pattern the default. This PR extends the same safety to Integer and String patterns.

Behavior

json_schema.to_regex now accepts a kwarg, safe_subset, which defaults to True.

safe_subset=False

  • Whitespace: r"[\n\t ]*"
  • Integer: any number
  • String: any string

safe_subset=True (default)

  • Whitespace: r"[ ]?"
  • Integer: (-1e19, 1e19)
  • String: Any string of length (0, 256)
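
To illustrate the difference, here is a minimal sketch using only Python's `re` module. The whitespace patterns are the ones listed above. The integer and string regexes are hypothetical stand-ins, not the patterns this PR actually generates: the integer bound is assumed to mean at most 19 digits, the string bound is assumed to mean at most 256 characters, and escape sequences inside strings are ignored for brevity.

```python
import re

# Hypothetical patterns for illustration only; the PR's real regexes may differ.
UNSAFE = {
    "whitespace": r"[\n\t ]*",           # unbounded run of whitespace
    "integer": r"-?(0|[1-9][0-9]*)",     # any number of digits
    "string": r'"[^"\\\x00-\x1f]*"',     # any length (escapes omitted)
}
SAFE = {
    "whitespace": r"[ ]?",                     # at most one space
    "integer": r"-?(0|[1-9][0-9]{0,18})",      # at most 19 digits, so |n| < 1e19
    "string": r'"[^"\\\x00-\x1f]{0,256}"',     # at most 256 characters (assumed bound)
}


def accepts(pattern: str, text: str) -> bool:
    """Return True if the whole text matches the pattern."""
    return re.fullmatch(pattern, text) is not None


# Outputs resembling what a model stuck in a repetition loop might emit.
runaway_int = "1" * 500
runaway_str = '"' + "a" * 5000 + '"'

for name, text in [("integer", runaway_int), ("string", runaway_str)]:
    print(name, "unsafe:", accepts(UNSAFE[name], text), "safe:", accepts(SAFE[name], text))
```

Because the safe patterns are bounded, a repeating model is forced to close the field after a fixed number of characters instead of consuming the rest of the token budget.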

Fixes

Safe Integer

Safe String

Further Work
