Fix slow PEX boot time when there are many extras. #1929

Merged 1 commit on Oct 4, 2022

@jsirois jsirois commented Oct 4, 2022

Large extras sets lead to an exponentially scaled collection of
fingerprinted distribution objects that need to be de-duped. The hash
codes calculated in doing so are expensive when the distribution
metadata contains a large number of requirements. Cache these hash codes
to improve boot time by two orders of magnitude.

In order to enable ergonomic caching of hash codes via the built-in
attrs `cache_hash` feature, some new plumbing is added to the vendored
third-party importer.

Fixes #1928

-@attr.s(frozen=True)
+# N.B.: DistributionMetadata can have an expensive hash when a distribution has many requirements;
+# so we cache the hash. See: https://github.com/pantsbuild/pex/issues/1928
+@attr.s(frozen=True, cache_hash=True)

This is the fix. Everything else in this change just allows it to work: the attrs internals do magic when `cache_hash=True` is enabled, and that magic requires being able to import attrs.
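
To illustrate why caching the hash helps, here is a minimal, standard-library-only sketch of the idea. It is a hand-written analogue, not the attrs implementation and not Pex code: the `Metadata` and `CachedMetadata` classes, the `count_hashes` helper, and the requirement names are all hypothetical. It counts how often a hash over many requirements is recomputed when the same object is repeatedly added to a set for de-duplication.

```python
class Metadata:
    """Hypothetical stand-in for a metadata object with many requirements."""

    hash_computations = 0  # class-wide counter, for illustration only

    def __init__(self, name, requirements):
        self.name = name
        self.requirements = tuple(requirements)

    def __eq__(self, other):
        return (
            isinstance(other, Metadata)
            and (self.name, self.requirements) == (other.name, other.requirements)
        )

    def __hash__(self):
        # Recomputed on every call; the work grows with the number of requirements.
        Metadata.hash_computations += 1
        return hash((self.name, self.requirements))


class CachedMetadata(Metadata):
    """Same value semantics, but the hash is computed at most once per instance."""

    def __init__(self, name, requirements):
        super().__init__(name, requirements)
        self._cached_hash = None

    def __hash__(self):
        if self._cached_hash is None:
            self._cached_hash = super().__hash__()
        return self._cached_hash


def count_hashes(cls, repeats=1000):
    """Add one object to a set `repeats` times; return (hash computations, set size)."""
    Metadata.hash_computations = 0
    obj = cls("example", ["req-%d" % i for i in range(500)])
    seen = set()
    for _ in range(repeats):
        seen.add(obj)  # every add() asks for the object's hash again
    return Metadata.hash_computations, len(seen)


if __name__ == "__main__":
    print("uncached:", count_hashes(Metadata))
    print("cached:  ", count_hashes(CachedMetadata))
```

Caching is only safe here because the objects are treated as immutable after construction, which matches the `frozen=True` setting in the diff above.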

jsirois commented Oct 4, 2022

The results for the issue repro case:

$ pex "boto3-stubs[batch,codeartifact,codebuild,codepipeline,cognito-idp,dynamodb,kms,logs,ram,rds,route53,s3,secretsmanager,ssm,stepfunctions,sts]==1.24.78" -oslow.pex
$ python -mpex "boto3-stubs[batch,codeartifact,codebuild,codepipeline,cognito-idp,dynamodb,kms,logs,ram,rds,route53,s3,secretsmanager,ssm,stepfunctions,sts]==1.24.78" -ofast.pex
$ ls -l {slow,fast}.pex
-rwxr-xr-x 1 jsirois jsirois 1475456 Oct  4 10:40 fast.pex
-rwxr-xr-x 1 jsirois jsirois 1471547 Oct  4 10:40 slow.pex
$ time ./slow.pex -c ''

real    1m10.222s
user    1m10.160s
sys     0m0.061s
$ time ./fast.pex -c ''

real    0m0.918s
user    0m0.878s
sys     0m0.041s
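
As a quick check of the "two orders of magnitude" claim, the wall-clock (`real`) times above can be compared directly. This is just arithmetic on the two numbers shown in the transcript; the `to_seconds` helper is a hypothetical name used only for this sketch.

```python
def to_seconds(minutes, seconds):
    """Convert a `time`-style XmY.YYYs reading to seconds."""
    return minutes * 60 + seconds


slow = to_seconds(1, 10.222)  # real time for ./slow.pex
fast = to_seconds(0, 0.918)   # real time for ./fast.pex
speedup = slow / fast

if __name__ == "__main__":
    print("speedup: %.1fx" % speedup)
```

For this single run the ratio comes out between 76x and 77x, so the improvement is a little under a strict 100x for this particular repro case.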

@jsirois jsirois merged commit ff8ca0b into pex-tool:main Oct 4, 2022
@jsirois jsirois deleted the issues/1928 branch October 4, 2022 19:23

Successfully merging this pull request may close these issues.

Large numbers of extras lead to exponential growth in PEX internal dependency resolution time.