Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[C++] Performance improvement in finally #3144

Merged
merged 1 commit into from
Apr 4, 2021
Merged

Conversation

xTachyon
Copy link
Contributor

@xTachyon xTachyon commented Apr 3, 2021

Before, to use finally, the compiler had to:

  • set up the lambda;
  • construct a std::function that could allocate if it wants to;
  • call finaly();
  • the dtor of the std::function would call the lambda;

Neither MSVC nor GCC could inline any of the functions of std::function, the finally function, nor the lambda, even with LTO.

With this patch both MSVC and GCC can see through the object and inline the lambda directly into the function code, making it essentially free.
MSVC would occasionally still set up the FinalAction object even if it doesn't use it, but would inline the lambda anyway.

Should we migrate to a non-movable FinalAction wrapper that wouldn't need the bool field so MSVC would get the idea to not set up the object?

Anyway, this should still be faster and generate smaller code because the simpler logic.
@mike-lischke

@mike-lischke
Copy link
Member

Do I see that right, the real optimization is the std::move call in the FinalAction c-tor? Do you have any numbers that can prove the performance improvement? I'm not against that patch, just wanna know how much it brings.

@xTachyon
Copy link
Contributor Author

xTachyon commented Apr 4, 2021

The move has little to do with the optimization. The real optimization is taking the lambda as a template instead wrapping it in a std::function. This allows the compiler to inline it directly in the calling function.

I don't have any numbers right now; what I've done is check what asm is generated by msvc and gcc.
See https://godbolt.org/z/nsvnYx3eY .

With std::function, there's multiple function calls, including virtual method calls and the posibility for dynamic allocation or thrown exceptions.
With the template version, the resulting function is 4 lines long and it includes 2 simple calls(that could be inlined further) with the functions I needed to make the example.

Given that finally() is used in a lot of autogenerated code, I'd say the optimization is significant for how easy it is.

@mike-lischke
Copy link
Member

Makes sense, thanks.

@parrt Another C++ patch to merge. As usual, some of the tests fail due to timeouts.

@parrt parrt merged commit 4dfacf6 into antlr:master Apr 4, 2021
@xTachyon xTachyon deleted the on_exit_opt branch April 4, 2021 16:06
@parrt parrt added this to the 4.9.3 milestone Oct 11, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants