Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rework PyPy installation #1518

Merged
merged 1 commit into from
Sep 16, 2023
Merged

Rework PyPy installation #1518

merged 1 commit into from
Sep 16, 2023

Conversation

mayeut
Copy link
Member

@mayeut mayeut commented Aug 13, 2023

PyPy installation is just a tar extract.

As such, it might be interesting to install PyPy at runtime rather than at build time.
This is especially true for EOL versions of PyPy which would allow to reduce the image size, and thus, overall build time for users not using EOL PyPy versions.

This PR does not modify default installed PyPy versions yet.

With this revisited installation method, it will be much easier to add GraalPy.

The argument to manylinux-interpreters ensure shall be the most specific PEP425 {python_tag}-{abi_tag} for a given interpreter to ease cibuildwheel integration in a first step (given cibuildwheel uses /opt/python/{python_tag}-{abi_tag}, should be easy to add a call to manylinux-interpreters ensure there).

@mayeut
Copy link
Member Author

mayeut commented Aug 13, 2023

@pypa/cibuildwheel-maintainers,

Any thoughts on this manylinux-python ensure way of making sure a given interpreter is made available ?

xref pypa/cibuildwheel#1538

Copy link
Contributor

@joerick joerick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the approach generally seems good. Probably most packagers are getting manylinux only for CPython, so deferring the other interpreters to runtime makes sense, and will cut down on size and download time.

Any numbers on how long the install of an interpreter takes? The interpreters themselves look like just tarfile extracts, but there's also the pip install bit.

We'd want to get manylinux-python support into cibuildwheel before the rollout of this feature I suppose. Within cibuildwheel, if pypy was excluded, I can see a potential issue in our integration test suite, because a few tests do use pypy. The pypy install would happen on each test run, rather than being reused from previous runs. This would increase the CI time quite a bit, I think (and be wasteful on bandwidth).

(Just thinking.... another idea could be a manylinux-slim series of images, which contain cpython only, and a set of manylinux-full images, which use manylinux-slim as a BASE and install all the interpreters on top. No idea how much work it would be to make that happen, but would be quite neat. cibuildwheel could choose the slim variants if it only needs cpython, but use the full ones in the tests)

docker/build_scripts/python_versions.json Outdated Show resolved Hide resolved
docker/build_scripts/python_versions.json Show resolved Hide resolved
docker/build_scripts/manylinux-python.py Outdated Show resolved Hide resolved
@mayeut
Copy link
Member Author

mayeut commented Aug 15, 2023

Any numbers on how long the install of an interpreter takes? The interpreters themselves look like just tarfile extracts, but there's also the pip install bit.

I don't have them right now but you're right, there's also the pip install bit, it's worth figuring this out now. I'll try this out on GHA (my connection is so slow that it wouldn't be representative of CI usage).

We'd want to get manylinux-python support into cibuildwheel before the rollout of this feature I suppose. Within cibuildwheel, if pypy was excluded, I can see a potential issue in our integration test suite, because a few tests do use pypy. The pypy install would happen on each test run, rather than being reused from previous runs. This would increase the CI time quite a bit, I think (and be wasteful on bandwidth).

Indeed, we'd want to get manylinux-python support into cibuildwheel before removing any pre-installed PyPy versions.
This order would thus be:

  1. merge this PR to roll-out manylinux-python
  2. get manylinux-python support into cibuildwheel
  3. remove EOL PyPy versions from the default image

As for cibuildwheel tests, this can be worked around quite easily I think (e.g. re-commit all images with manylinux-python ensure-all called as a fixture).

Just thinking.... another idea could be a manylinux-slim series of images, which contain cpython only, and a set of manylinux-full images, which use manylinux-slim as a BASE and install all the interpreters on top. No idea how much work it would be to make that happen, but would be quite neat. cibuildwheel could choose the slim variants if it only needs cpython, but use the full ones in the tests.

I don't think it would be too difficult to add full images but I'd rather not multiply images if the "fix" is just to call manylinux-python ensure-all or manylinux-python ensure ... and the time is similar to pulling the image.

The only issue I can see would be local usage, where, not to re-install everytime, one would need to re-commit all images with manylinux-python ensure-all (as proposed for cibuildwheel tests).

@mayeut mayeut force-pushed the rework-pypy-install branch 2 times, most recently from 4b86663 to d1fc534 Compare August 15, 2023 17:25
@mayeut
Copy link
Member Author

mayeut commented Aug 15, 2023

Any numbers on how long the install of an interpreter takes? The interpreters themselves look like just tarfile extracts, but there's also the pip install bit.

Indeed the pip install bit was longer than I thought, this was mostly due to python -m ensurepip and I ended up using cpython pip to bootstrap pip installation rather than python -m ensurepip.
I reworked this as well as compute sha256 / extract while downloading.

The initial timings where somewhere around 20s for 1 PyPy install / 1m 45s for GraalPy
They're now at around 12s for 1 PyPy install / 55s for GraalPy

tests/run_tests.sh Outdated Show resolved Hide resolved
tests/run_tests.sh Outdated Show resolved Hide resolved
@mayeut
Copy link
Member Author

mayeut commented Aug 19, 2023

I'm going to split the PR given there are still some things to workout throughout the ecosystem for GraalPy.

@mayeut mayeut changed the title Rework PyPy installation & add GraalPy Rework PyPy installation Aug 19, 2023
@mayeut mayeut mentioned this pull request Aug 19, 2023
@mayeut
Copy link
Member Author

mayeut commented Aug 19, 2023

There's still some doc missing for manylinux-interpreters, otherwise, I think all comments have been addressed.

@joerick, am I missing something in my answers or do you have more comments ?

@joerick
Copy link
Contributor

joerick commented Aug 26, 2023

Sorry for the delay, all good from my perspective. I'm not fully understanding how we make this fast in the cibuildwheel CI context, but that's downstream it sounds like you have an idea :)

PyPy installation is just a tar extract.
As such, it might be interesting to install PyPy at runtime rather than at build time.
This is especially true for EOL versions of PyPy which would allow to reduce the image size,
and thus, overall build time for users not using EOL PyPy versions.

This commit does not modify default installed PyPy versions yet.

This commit will also be useful for GraalPy installation.
@mayeut mayeut marked this pull request as ready for review September 16, 2023 09:40
@mayeut mayeut merged commit 066b049 into pypa:main Sep 16, 2023
9 checks passed
@mayeut mayeut deleted the rework-pypy-install branch September 16, 2023 09:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants