Benchmark is not an apples to apples comparison (sse4.2) #85
Comments
I've also been experimenting with getting inlining to work across the FFI boundary, and succeeded using Rust's 'linker-plugin-lto' with clang-12 and lld-12. This improved the pico benchmarks a little more and put both of them in the lead, with the full pico benchmark hitting ~2900 MB/s vs. httparse at 1751 MB/s on my ancient laptop.
Ah yeah, good point. Originally httparse didn't have SIMD support either, so the comparison was more even.
I haven't looked all that far into it, but I'm interested in your thoughts on why Pico is faster. Is it doing some memory management tricks or something? I'm working on a pet project and am trying to figure out whether I should just write it in C, or whether there is a way to get comparable results with unsafe Rust.
How do you run the Rust benchmarks? Do you set the target CPU so it doesn't have to do runtime checks? https://rust-lang.github.io/packed_simd/perf-guide/target-feature/rustflags.html
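To make the distinction in that question concrete, here is a minimal sketch of compile-time vs. runtime SSE4.2 detection in Rust. The function names are made up for illustration, and this is not taken from httparse's source; it only uses the standard `cfg!` and `is_x86_feature_detected!` macros.

```rust
// Sketch only: the function names below are hypothetical.

/// True only if the compiler was told the target has SSE4.2,
/// e.g. via RUSTFLAGS="-C target-cpu=native" on a CPU that supports it.
/// This is resolved at compile time, so no check runs in the hot path.
fn sse42_at_compile_time() -> bool {
    cfg!(target_feature = "sse4.2")
}

/// Asks the CPU at runtime (x86/x86_64 only).
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
fn sse42_at_runtime() -> bool {
    std::arch::is_x86_feature_detected!("sse4.2")
}

/// Other architectures never have SSE4.2.
#[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
fn sse42_at_runtime() -> bool {
    false
}

fn main() {
    println!("compile-time sse4.2: {}", sse42_at_compile_time());
    println!("runtime sse4.2:      {}", sse42_at_runtime());
}
```

If the first line prints `false` while the second prints `true`, the binary was built without the feature enabled and any SIMD path has to be selected by a runtime check.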
I run these flags globally in my config.toml:
I'm going to dump some info here for reproducibility purposes. The speed improvements came primarily from two areas, both of which involved modifying the underlying Pico bindings crate.
Full cc command from Pico bindings crate:
Updated the above comment, as the steps it described were incorrect. The above steps work as expected. Here are the results of my latest test:
Alright, the adventure is coming to an end with this final update:
Hello, first, thanks for making this tool.
I wanted to point out that your benchmark is a bit unfair, as it compares httparse with sse4 against picohttpparser without sse4. The reason picohttpparser doesn't have sse4 is that your dependency 'pico-sys' does not compile picohttpparser with sse4 enabled.
Your benchmark showed a ~60% improvement in performance for 'bench_pico' once sse4 was enabled in the underlying crate.
I forked the underlying crate 'pico-sys' and made a few modifications if you want to verify my results:
https://github.com/errantmind/rust-pico-sys
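For anyone who wants the general idea without reading the fork, here is a rough sketch of how a bindings crate's build script could decide whether to pass `-msse4.2` to the C compiler. This is an illustration under assumptions, not the actual change in the fork: the helper name `sse42_flags` is hypothetical, and a real `build.rs` would hand the flags to something like the `cc` crate's `Build::flag` rather than print them. As far as I know, picohttpparser guards its SIMD path behind `__SSE4_2__`, which GCC and Clang define when `-msse4.2` is given.

```rust
use std::env;

/// Hypothetical helper: choose extra C compiler flags from the Rust target.
/// `target_features` is a comma-separated list such as "sse2,sse4.2",
/// the format Cargo uses for CARGO_CFG_TARGET_FEATURE in build scripts.
fn sse42_flags(target_arch: &str, target_features: &str) -> Vec<&'static str> {
    let is_x86 = matches!(target_arch, "x86" | "x86_64");
    let has_sse42 = target_features
        .split(',')
        .any(|feature| feature.trim() == "sse4.2");
    if is_x86 && has_sse42 {
        vec!["-msse4.2"]
    } else {
        Vec::new()
    }
}

fn main() {
    // Cargo sets these variables when running a build script; outside of
    // Cargo they are absent, so fall back to empty strings.
    let arch = env::var("CARGO_CFG_TARGET_ARCH").unwrap_or_default();
    let features = env::var("CARGO_CFG_TARGET_FEATURE").unwrap_or_default();

    // In a real build.rs these would be passed on to the C compiler.
    for flag in sse42_flags(&arch, &features) {
        println!("extra C flag: {flag}");
    }
}
```

With this approach the C code only gets SSE4.2 when the Rust side is also built with it (for example with `-C target-cpu=native`), so both parsers are compared under the same instruction set.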