Add benchmark suite #542
Conversation
```python
):
    """Benchmark converting regex to FSM"""
    regex_str = regex_samples[regex_name]
    benchmark.pedantic(
```
Is there any particular reason why you are using the pedantic mode here and in the other benchmarks?
IMO, it's cleaner than

```python
create_rfsm = lambda: RegexFSM(regex_str, tokenizer)
benchmark(create_rfsm)
```

Additionally, it allows for fine-grained control of the number of runs.
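For reference, a minimal sketch of what that fine-grained control looks like with `pedantic`; the `rounds`/`iterations` values, the sample key, and the import path are illustrative assumptions, not taken from this PR:

```python
from outlines.fsm.fsm import RegexFSM  # import path assumed


def test_benchmark_regex_to_fsm(benchmark, regex_samples, tokenizer):
    """Benchmark RegexFSM construction with explicit run counts."""
    regex_str = regex_samples["phone_number"]  # hypothetical sample key
    benchmark.pedantic(
        RegexFSM,                     # callable under test
        args=(regex_str, tokenizer),  # positional arguments passed each call
        rounds=8,                     # number of measured rounds
        iterations=1,                 # calls per round
    )
```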
Thank you for opening a PR! Don't you think it would be best to always benchmark the end-to-end index computation? This is the quantity we care about.
Could you please clarify? We are benchmarking the computation of the index.
I mean not separating the Numba compilation from the rest of the index compilation. Total time is what we care about. Does that make sense?
Numba initial compilation is a one-time occurrence and takes ~9,400 ms per the posted benchmark. Specifically, that benchmark measures the generation of the compiled Numba functions themselves.
After compilation, generating a `RegexFSM` is much faster. Otherwise, optimizations (or performance degradation) of the index computation would be masked by the one-time compilation cost.
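A sketch of how that separation can be enforced in a benchmark, assuming the one-time compilation is triggered by the first construction (the test, fixture, and sample names are illustrative):

```python
def test_benchmark_index_generation(benchmark, regex_samples, tokenizer):
    """Measure index generation only; warm-up rounds absorb the Numba JIT cost."""
    regex_str = regex_samples["phone_number"]  # hypothetical sample key
    benchmark.pedantic(
        RegexFSM,
        args=(regex_str, tokenizer),
        warmup_rounds=1,  # first call pays the one-time compilation; excluded from stats
        rounds=5,         # only these rounds are reported
    )
```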
I can get on board with that. Would you mind rebasing on main?
(branch force-pushed from ea1a0df to 33583a9, then to a0289f0)
Thank you!
Benchmark coverage:
- `outlines/fsm/regex.py`: numba function compilation
- `outlines/fsm/json_schema.py`: json to regex (`build_regex_from_object`)
- `outlines/fsm/regex.py`: regex to FSM

No benchmark coverage for generation. I have some code written to do so, but would like this merged as a baseline before continuing.
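For context, a minimal sketch of what the json-to-regex benchmark above might look like; the schema, the test name, and the assumption that `build_regex_from_object` accepts a JSON-schema string are mine, not the PR's:

```python
import json

from outlines.fsm.json_schema import build_regex_from_object

# Small illustrative schema; real benchmarks would cover a range of schemas.
SCHEMA = json.dumps({
    "type": "object",
    "properties": {"name": {"type": "string"}},
    "required": ["name"],
})


def test_benchmark_json_to_regex(benchmark):
    """Benchmark converting a JSON schema into its equivalent regex."""
    benchmark.pedantic(build_regex_from_object, args=(SCHEMA,), rounds=8)
```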
Generate benchmarks
```shell
pytest --benchmark-only --benchmark-columns=mean,max
```
Comparing two branches
Generate initial results:
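(The save commands themselves aren't shown in this PR; one way to produce the JSON files compared below, assuming pytest-benchmark's `--benchmark-json` flag and the branch names implied by the filenames, is to run on each branch:)

```shell
# On each branch, dump results to a JSON report (one file per benchmark suite):
git checkout main
pytest --benchmark-only --benchmark-json=main-profile_regex_fsm.json

git checkout trie
pytest --benchmark-only --benchmark-json=trie-profile_regex_fsm.json
```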
Generate comparisons:
```shell
py.test-benchmark compare --sort=fullname --columns=mean,max main-profile_regex_fsm.json trie-profile_regex_fsm.json
py.test-benchmark compare --sort=fullname --columns=mean,max main-profile_compile_numba.json trie-profile_compile_numba.json
```