Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve python3 performance by adding __slots__ #3017

Merged
merged 1 commit into from
Dec 31, 2020
Merged

Improve python3 performance by adding __slots__ #3017

merged 1 commit into from
Dec 31, 2020

Conversation

amykyta3
Copy link
Contributor

@amykyta3 amykyta3 commented Dec 29, 2020

Summary

Python isn't exactly known for its performance, but this PR is my attempt at squeezing out a few more percent of speedup from the Python3 runtime.
This change adds the __slots__ variable to most classes in the Antlr runtime in order to explicitly declare class members and avoid the use of the slower __dict__-based member storage mechanism. This results in faster Python class instance creation, as well as member access.
A benchmark I ran showed roughly 10% improvement in performance.

Compatibility considerations

For normal "sane" uses of the Antlr runtime, this change is backwards compatible with the current runtime API and should have no effect on behavior.
The side effect of __slots__ usage is that one is no longer able to dynamically assign new variables to class instances. There are also some caveats with Python weakref usage. I can't quite imagine why someone may need to stuff their own variables on-top of runtime classes, but then again, you never know what other users will do 😄.

This limitation can be removed by adding __dict__ and __weakref__ back into the __slots__ definition. (possibly at a small performance cost)
The current implementation omits this for now.

See python docs for more details: https://docs.python.org/3/reference/datamodel.html#slots

Other changes made

The following other changes were made to correct issues and allow __slots__ to be used:

  • In ATN.py, fixed capitalization of an assignment to <IntervalSet>.readOnly
    • The class IntervalSet initializes a member readOnly, but this was being assigned as readonly in ATN.py
    • Looks like a typo.
  • In LexerATNSimulator.py, removed the match_calls counter.
    • This was initialized as a class variable but incremented in an instance variable which is incompatible with __slots__.
    • Looks like this was used while debugging at one point and not actually necessary, as it is not referenced anywhere.

Benchmark

I ran some comparison benchmarks on some sample code from one of my projects. Also including stats using the speedy-antlr-tool accelerator.
Inputs are using the lexer/parser for SystemRDL as well as one of the testcase files. Measured time elapsed for 1000 iterations via timeit.

Benchmark seconds
Baseline 134.42
Baseline + slots 121.65
"speedy-antlr" 14.58
"speedy-antlr" + slots 13.47

It isn't much, but the addition of __slots__ improves performance by roughly 10% in both the baseline Python3 parser, as well as the accelerated version. I expect this also depends somewhat on the grammar used.

Other notes

I ran the unit-tests in runtime/Python3/test/run.py without issues. Not sure how exhaustive these are.
My name is already signed in contributors.txt which is why you don't see it in this PR's diff.

@@ -652,6 +652,7 @@ CaptureNextTokenType(d) ::= "<d.varName> = self._input.LA(1)"

StructDecl(struct,ctorAttrs,attrs,getters,dispatchMethods,interfaces,extensionMembers) ::= <<
class <struct.name>(<if(contextSuperClass)><contextSuperClass><else>ParserRuleContext<endif>):
__slots__ = ('parser', '__dict__')
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding __dict__ here as a catch-all for any other class members (labeled grammar rule members?)
Ideally these would be emitted here too, but I'm not familiar enough with how the parser generator templates work to implement this, so using __dict__ is sufficient.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should be as simple as:
slots = ('parser', <attrs:{a | ''}; separator=",">)
can you try ?then we can leverage your perf improvement measures

@@ -58,7 +63,7 @@ def nextTokensNoContext(self, s:ATNState):
if s.nextTokenWithinRule is not None:
return s.nextTokenWithinRule
s.nextTokenWithinRule = self.nextTokensInContext(s, None)
s.nextTokenWithinRule.readonly = True
s.nextTokenWithinRule.readOnly = True
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is fixing what appears to be a capitalization typo. The IntervalSet class initializes this using camelCase instead of lowercase. Not sure which is preferred since I see both usages in the runtime.

@@ -89,7 +92,6 @@ def copyState(self, simulator:LexerATNSimulator ):
self.startIndex = simulator.startIndex

def match(self, input:InputStream , mode:int):
self.match_calls += 1
Copy link
Contributor Author

@amykyta3 amykyta3 Dec 29, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removing a debug counter that conflicts with the ability to use __slots__ since it was initialized as a class variable but referenced as an instance variable. Nothing else references this.

@ericvergnaud
Copy link
Contributor

Hi,

looks great but it currently breaks the Travis CI build...

@ericvergnaud
Copy link
Contributor

am I correct in thinking this improvement would also work for Python 2.x?

@amykyta3
Copy link
Contributor Author

Yeah. Slots have been around since early python 2: https://docs.python.org/2.7/reference/datamodel.html#__slots__
Is it worth implementing for 2.x? Use of py2 has been plummeting since it was deprecated.

@ericvergnaud ericvergnaud merged commit 47e9734 into antlr:master Dec 31, 2020
@ericvergnaud
Copy link
Contributor

I've merged this PR to have it part of 4.9.1 being released this weekend.
Python2 runtime is still widely used, see https://pypistats.org/api/packages/antlr4-python2-runtime/recent
Also would be great if you could try out my suggestion for slots in the stg file

@amykyta3
Copy link
Contributor Author

Will do! I'll try to prep another PR today

@parrt parrt added this to the 4.9.1 milestone Jan 3, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants