Improve parser performance by reducing tracing overhead #10533
I believe it might be worthwhile to re-evaluate in the future whether we still need such detailed tracing profiles.
I insist that you leave a comment on the method headers or bodies explaining that the unsafe-looking pointer is there for perf reasons and is justified, so that no future do-gooder changes it back thinking they're doing something righteous for memory safety. Also, as we discussed offline, I am good with you compressing the CSI trace to a string for further perf, or even compiling all of this out for release builds, as the intricacies of the parser are probably not useful for in-the-wild diagnostics anyway. I'm not sold on total removal, or on asymmetric removal of only one path's tracing.
Incredible!
@miniksa I'm sorry, I must've misunderstood your message somehow...
Alright, that works for me. I just thought that yesterday you briefly asked whether we could dump the Csi ones completely and leave the rest, and that I didn't like. I do like the compression idea (behind an IsTraceloggingEnabled check on the provider, of course). I also offered "def out in release" as an option in case the structure overhead turned out to be a major player in the "is ETW on?" test portion of the logging macros: when I looked at the code behind them, the instance variable of the ETW channel is a structure of sorts, and I thought it could be subject to your "Microsoft is bad perf-wise at passing structs on x64" assertion.
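For illustration only, here is a minimal sketch of the "compress to a string, but only behind an enabled check" idea from the comment above. `FakeProvider` and `TraceCsiParams` are hypothetical names made up for this sketch; they are not the TraceLogging API and not the code in this PR.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for a tracing provider; NOT the real ETW/TraceLogging type.
struct FakeProvider {
    bool enabled = false;          // would be the "is ETW on?" test in real code
    std::size_t eventsWritten = 0; // counts emitted events, for demonstration
    std::string lastPayload;       // the most recent compressed payload
};

// Joins all CSI parameters into a single "1;2;3" style string and emits one
// event, but only if a listener is attached. When tracing is off, the cost is
// a single branch and no string is built.
inline void TraceCsiParams(FakeProvider& provider, const int* params, std::size_t count) {
    assert(params != nullptr || count == 0);
    if (!provider.enabled) {
        return; // cheap early-out: skip all formatting work
    }
    std::string payload;
    for (std::size_t i = 0; i < count; ++i) {
        if (i != 0) {
            payload += ';';
        }
        payload += std::to_string(params[i]);
    }
    provider.lastPayload = payload;
    ++provider.eventsWritten;
}
```

With the provider disabled, a call does no formatting at all; once `enabled` is set, the same call emits one event carrying the joined parameters.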
Passing structures larger than the register size is very expensive due to Microsoft's x64 calling convention. We could reduce the overhead by passing the string-view by reference, but this forces us to allocate the parameters as static string-views on the data segment of our binary. I've found that passing them as classic C-strings is more ergonomic instead and fits the need for high performance in this particular code.

This improves performance for VT-heavy output by 15-20%.

## PR Checklist
* [x] I work here
* [x] Tests added/passed

(cherry picked from commit ee32598)
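To make the calling-convention point in the description concrete, here is a small sketch contrasting the two parameter styles. The function names `TraceByView` and `TraceByCStr` are hypothetical and are not the signatures changed in this PR. The sketch relies on one documented property of the Microsoft x64 convention: an argument whose size is not 1, 2, 4, or 8 bytes is copied to memory by the caller and passed as a pointer to that copy.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <string_view>

// A wstring_view holds a pointer plus a length, so it is larger than one
// pointer-sized register on mainstream implementations.
static_assert(sizeof(std::wstring_view) > sizeof(void*),
              "string_view is expected to be wider than a single register");

// By value: under the Microsoft x64 convention the caller has to spill the
// view to a temporary in memory and pass that temporary's address.
std::size_t TraceByView(std::wstring_view name) noexcept {
    return name.size();
}

// By C-string: the argument is a single pointer and travels in one register.
// The length is only computed if the callee actually needs it.
std::size_t TraceByCStr(const wchar_t* name) noexcept {
    assert(name != nullptr);
    return std::wcslen(name);
}
```

The trade-off is that the C-string variant drops the precomputed length, which is acceptable when the callee only needs it on the rarely taken tracing path.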