Introduce Numba-based FSM utilities #272

Merged


brandonwillard
Member

@brandonwillard brandonwillard commented Sep 6, 2023

This PR introduces Numba-JITed FSM utilities with 20x speed-ups over the current pure-Python implementations.

It also introduces a more memory-efficient "end-to-end" means of producing FSM indices. I avoided implementing it this way originally because it involves multiple iterations through the vocabulary, but, in order to address some memory-related shortcomings of the CFG indexing approaches tested in #178, this might be the better approach for now. It's a clear trade-off between processing and memory, now leaning toward processing, but with the JIT speed-ups it's reasonable.
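For readers unfamiliar with the idea, here is a minimal sketch of what "end-to-end" index building with repeated vocabulary passes could look like. It is plain Python for readability and does not use Numba. The toy FSM, the vocabulary, and the names `walk_token` and `build_index` are all hypothetical and are not taken from this PR's code.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy FSM for the regex `ab*`: (state, character) -> next state.
TRANSITIONS: Dict[Tuple[int, str], int] = {
    (0, "a"): 1,
    (1, "b"): 1,
}
INITIAL_STATE = 0


def walk_token(state: int, token: str) -> int:
    """Walk the FSM over every character of `token`; return -1 on a dead end."""
    for ch in token:
        state = TRANSITIONS.get((state, ch), -1)
        if state == -1:
            return -1
    return state


def build_index(vocabulary: List[str]) -> Dict[int, Set[int]]:
    """Map each reachable state to the ids of the tokens it can consume."""
    index: Dict[int, Set[int]] = {}
    pending = [INITIAL_STATE]
    while pending:
        state = pending.pop()
        if state in index:
            continue  # this state was already processed
        index[state] = set()
        # One full pass over the vocabulary per state: more processing,
        # but only the per-state result is kept in memory.
        for token_id, token in enumerate(vocabulary):
            end_state = walk_token(state, token)
            if end_state != -1:
                index[state].add(token_id)
                pending.append(end_state)
    return index


vocabulary = ["a", "b", "ab", "bb", "c"]
index = build_index(vocabulary)
```

In this sketch the inner loops are the hot path, which is the kind of code a JIT compiler would be expected to speed up.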

Closes #226 (for now), closes #239, and should help with #192.

  • Tie this into the Regex implementation.
  • Map end states to tokens in the index, so that there's no need to re-walk the FSM after sampling.
    This was always how it was supposed to work, but our previous prototype didn't implement it. Since we're updating/replacing that prototype, it might be best to add it now.
  • Cache computed masks.
  • Investigate Numba compilation and caching.
    We need to make sure that caching works exactly as expected (i.e. only once for all the index-building code).
  • Test out some multi-threading approaches to the end-to-end indexing.
  • Consider using uints for the states, instead of int64.
  • Consider Python packaging/Numba AOT options.
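As an illustration of two of the items above (mapping end states to tokens in the index, and caching computed masks), here is a small sketch. The index layout, the vocabulary, and the names `allowed_mask` and `next_state` are assumptions made for this example only; the PR may structure things differently.

```python
from functools import lru_cache
from typing import Dict, Tuple

# Hypothetical index for the regex `ab*` over the vocabulary below:
# state -> {token id -> state reached after consuming that token}.
VOCABULARY = ("a", "b", "ab", "bb", "c")
INDEX: Dict[int, Dict[int, int]] = {
    0: {0: 1, 2: 1},
    1: {1: 1, 3: 1},
}


@lru_cache(maxsize=None)
def allowed_mask(state: int) -> Tuple[bool, ...]:
    """Boolean mask over the vocabulary, computed once per state and cached."""
    allowed = INDEX.get(state, {})
    return tuple(token_id in allowed for token_id in range(len(VOCABULARY)))


def next_state(state: int, token_id: int) -> int:
    """Read the end state from the index instead of re-walking the FSM."""
    return INDEX[state][token_id]


state = 0
mask = allowed_mask(state)      # computed and stored in the cache
state = next_state(state, 2)    # pretend token id 2 ("ab") was sampled
```

Because each entry stores the end state alongside the token id, advancing after sampling is a dictionary lookup, and repeated visits to the same state reuse the cached mask.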

@brandonwillard brandonwillard self-assigned this Sep 6, 2023
@brandonwillard brandonwillard added the enhancement, optimization (Related to performance optimizations), and structured generation (Linked to structured generation) labels Sep 6, 2023
@brandonwillard brandonwillard marked this pull request as draft September 6, 2023 22:06
@brandonwillard brandonwillard force-pushed the numba-fsa-implementation branch 5 times, most recently from 27feb9d to 7ae64b3 Compare September 9, 2023 23:56
@brandonwillard brandonwillard marked this pull request as ready for review September 10, 2023 00:00
@brandonwillard brandonwillard mentioned this pull request Sep 10, 2023
@brandonwillard brandonwillard force-pushed the numba-fsa-implementation branch 5 times, most recently from 9f416a5 to d79967e Compare September 16, 2023 03:40
@brandonwillard brandonwillard force-pushed the numba-fsa-implementation branch 4 times, most recently from 9a186fe to c7b3cc8 Compare September 16, 2023 20:26
@brandonwillard brandonwillard force-pushed the numba-fsa-implementation branch 2 times, most recently from ab84dc5 to b4d4b2b Compare September 26, 2023 17:10
@brandonwillard brandonwillard force-pushed the numba-fsa-implementation branch 2 times, most recently from 87c97cd to fb37a1c Compare September 27, 2023 18:57
@brandonwillard brandonwillard merged commit 38b0b10 into dottxt-ai:main Sep 29, 2023
5 checks passed
@brandonwillard brandonwillard deleted the numba-fsa-implementation branch September 29, 2023 00:32
@AL-377
Contributor

AL-377 commented Nov 2, 2023

Will this help speed up `self.regex_fsm = regex_pattern.to_fsm().reduce()` in outlines 0.0.8? I found that when a constrained field is long, e.g. maxLength=1000, the regex_fsm construction takes a very long time.

@rlouf
Member

rlouf commented Nov 2, 2023

Did you try with 0.0.9?

Development

Successfully merging this pull request may close these issues.

  • Cache masks computed during regex guidance
  • Slow Index Building
3 participants