
Add parallel install with fall-back to serial install when no multiprocessing available #8215

Closed
bmartinn wants to merge 20 commits

Conversation


@bmartinn commented May 9, 2020

Based on the discussion in #8187 and #8169.
The first part of pip install parallelization: installing the wheels (wheels only) using a multiprocessing Pool.

The reason for choosing processes over threads is that the limiting factor here is actually CPU cycles in the installation process. Also, the wheel unpacking and copying for each package is completely independent of the others and of the rest of the process, and lastly, order has no meaning at this stage: requirement order only matters when building packages, and here we are just unzipping the wheels.

Implementation details (a rough sketch follows below):
  • Run the wheel installs in a multiprocessing Pool; this gives roughly a 1.5x speedup when installing cached packages.
  • Packages that could not be installed (an exception was raised) will be installed serially once the Pool is done.
  • If multiprocessing.Pool is not supported by the platform, fall back to serial installation.
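
A minimal sketch of the approach described above, assuming hypothetical helpers install_one / install_one_quiet (the PR's actual function names and signatures differ):

    import functools
    import multiprocessing


    def install_one(req, install_args):
        # Placeholder for the real per-requirement install (unpack and copy a wheel).
        req.install(*install_args)


    def install_one_quiet(req, install_args):
        # Pool worker: return the exception instead of raising it, so a single
        # failure does not abort the whole pool.
        try:
            install_one(req, install_args)
            return None
        except BaseException as exc:
            return exc


    def install_wheels(wheel_reqs, install_args):
        # Install pre-built wheels in parallel, retrying any failures serially.
        try:
            pool = multiprocessing.Pool()
        except (ImportError, OSError, NotImplementedError):
            # multiprocessing is unusable on this platform: plain serial install.
            for req in wheel_reqs:
                install_one(req, install_args)
            return
        try:
            results = pool.map(
                functools.partial(install_one_quiet, install_args=install_args),
                wheel_reqs,
            )
        finally:
            pool.close()
            pool.join()
        # Whatever failed inside the pool is installed again, one by one.
        for req, result in zip(wheel_reqs, results):
            if isinstance(result, BaseException):
                install_one(req, install_args)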

@bmartinn force-pushed the issue-8187-multiprocess-install branch from 6d26d2e to cede64d on May 9, 2020 23:31
@bmartinn (Author) commented May 10, 2020

Hi guys,
CI seems to keep failing on test_install_subprocess_output_handling. Any hints on why / how?
When I run it manually with:

python -m pip install --verbose --global-option=--fail ../tests/data/src/chattymodule/ 2>&1 | grep DIE

I get this single line:
I DIE, I DIE
and the CI tests fail on count == 2.
The thing is, this is exactly what I get when running the master branch as well.

p.s.
This behavior repeats itself with a few other tests where stdout is not what is expected, but it seems to be consistent with the behavior I see on the master branch.

EDIT:
Never mind :)
I think I figured it out: since the new implementation first tries to install in the Pool, and on failure (exception) tries again in serial mode, you might end up with two exception messages.
I'll push a fix ...

@bmartinn force-pushed the issue-8187-multiprocess-install branch 3 times, most recently from 3090c27 to 68f673a on May 10, 2020 02:43
@McSinyx (Contributor) left a comment

I think it'd be nice to have the multiprocess utils in a separate module covering the following (see the sketch after this list):

  • freeze_support() called as part of the import phase
  • An import guard of Pool
  • An abstracted map_multiprocess which handles errors such as those raised during Pool creation
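
A minimal sketch of what such a module could cover, under the assumptions above (the module path, names, and error handling here are illustrative, not pip's actual code):

    # hypothetical src/pip/_internal/utils/parallel.py
    # Guard: multiprocessing (or Pool) may be missing on some platforms.
    try:
        from multiprocessing import Pool, freeze_support
    except ImportError:
        Pool = None
    else:
        # Run freeze_support() during import so frozen Windows executables
        # can spawn worker processes safely.
        freeze_support()


    def map_multiprocess(func, iterable, chunksize=1):
        # Like map(), but uses a process pool when one can be created;
        # otherwise (or if Pool creation fails) falls back to plain map().
        if Pool is None:
            return list(map(func, iterable))
        try:
            pool = Pool()
        except (OSError, NotImplementedError):
            # Pool creation itself can fail, e.g. when semaphores are unavailable.
            return list(map(func, iterable))
        try:
            return pool.map(func, iterable, chunksize)
        finally:
            pool.close()
            pool.join()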

Inline review comments on src/pip/_internal/req/__init__.py (outdated; resolved)
@bmartinn (Author)

@McSinyx in the end I reverted the freeze_support() and disabled the multiprocessing Pool on Windows Python 2.7. Since Python 2.7 is EOL anyhow, and the issue itself seems to be Windows' fork incompatibility in 2.7 only, I opted for a quick win. What do you think?

@McSinyx (Contributor) left a comment

disabled the multiprocessing Pool on Windows Python 2.7. Since Python 2.7 is EOL anyhow, and the issue itself seems to be Windows' fork incompatibility in 2.7 only, I opted for a quick win. What do you think?

I think it's OK, as long as the behavior is correct both with and without multiprocessing.

Inline review comments on src/pip/_internal/req/__init__.py (outdated; resolved)
@bmartinn force-pushed the issue-8187-multiprocess-install branch from 93093f9 to 48a2a8d on May 11, 2020 11:25
@pradyunsg (Member)

Heads up: this is on my "take a look at" list, but I'll have my hands full for at least the next few days with pip 20.1.1.

@bmartinn (Author)

Hi @pradyunsg, since 20.1.1 is out, any chance you can take a look? :)

@BrownTruck (Contributor)

Hello!

I am an automated bot and I have noticed that this pull request is not currently able to be merged. If you are able to either merge the master branch into this pull request or rebase this pull request against master then it will be eligible for code review and hopefully merging!

@BrownTruck added the "needs rebase or merge" label (PR has conflicts with current master) on May 31, 2020
@bmartinn force-pushed the issue-8187-multiprocess-install branch from 48a2a8d to 4ddd162 on May 31, 2020 14:44
@pypa-bot removed the "needs rebase or merge" label on May 31, 2020
@bmartinn force-pushed the issue-8187-multiprocess-install branch from 4ddd162 to 7e347f5 on May 31, 2020 14:55
@deveshks (Contributor)

Hi @bmartinn,

I see that the last few commits contain linting changes which are still failing. May I suggest running the linters locally?

@bmartinn (Author)

Thanks @deveshks!
I was looking for that.
(Not surprisingly, running mypy locally did not detect any of these linter errors ...)

@pradyunsg (Member)

@bmartinn Do the errors not show up with the tox -e lint run, or with a plain mypy run?

@bmartinn (Author)

@pradyunsg plain mypy passes, but tox -e lint is consistent with the CI errors, so I can finally debug it locally. Apologies for the mess here ...

@pradyunsg (Member)

@bmartinn No worries, and no need to apologize. Looks like you're hitting limitations of typeshed (the source of mypy's type annotations for Python's standard library); it's definitely possible that the issue is not your code but missing typeshed annotations. :)


I'm also curious whether this points to a broader issue with our linting setup, so I'd appreciate it if you could answer a couple of questions:

@bmartinn (Author)

Thanks @pradyunsg, but I missed the "running linters" part; I was assuming my local flake8/mypy run was enough.

While diagnosing why the mypy failures were occurring, where did you look for additional information?

Raw log :)

@bmartinn force-pushed the issue-8187-multiprocess-install branch 3 times, most recently from 260d4fe to 47d0ccf on May 31, 2020 21:25
@bmartinn (Author) commented Jun 30, 2020

@McSinyx now that your PR is merged, I think we should switch to using your implementation here, WDYT?

@bmartinn force-pushed the issue-8187-multiprocess-install branch from 03fdc55 to 5e7332e on April 7, 2021 13:27
except KeyError:
    install_result = _single_install(
        install_args, req, suppress_exception=False)
    logger.debug('Successfully installed %s serially', req.name)
@sbidoul (Member) commented on this code:

This logs a successful install but just after, it may throw an exception: isinstance(install_result, BaseException) ?

@bmartinn (Author) replied:

@sbidoul when calling _single_install with suppress_exception=False it will not suppress the exception but raise it. This means you will not get isinstance(install_result, BaseException) in this section; hence, if we get past the _single_install call, the installation was successful.
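
In other words, the control flow being described is roughly the following (a sketch only; the real _single_install in the PR takes more arguments and does more work):

    def _single_install(install_args, req, suppress_exception=True):
        try:
            req.install(*install_args)
            return None
        except BaseException as exc:
            if not suppress_exception:
                raise  # serial path: the error surfaces immediately
            return exc  # pool path: caller checks isinstance(result, BaseException)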

@sbidoul (Member) commented Apr 17, 2021

I did a quick test. On my 4-core NVMe laptop, I tried installing 230 wheels with individual sizes ranging from a few kB to 150 MB (with --no-index and --no-deps to eliminate network and resolver effects). It took 23 s with parallel install and 35 s without. This is significant (but not as much as I was hoping for, if the install process is CPU bound).

Now with my release manager hat on, I see this as part of the 21.1 milestone, but I'm uncomfortable with merging this right now: although the feature is optional, the PR significantly changes the install process even in the case where the feature is not enabled. So I feel we are too close to the release for such a big change to go in.

@bmartinn (Author)

@sbidoul

... the PR significantly changes the install process even in the case where the feature is not enabled.

The idea behind the implementation is not to change the default install process when the feature is turned off (and as far as I can see it does not); that said, I do see why it might look like it does. Basically, if the feature is turned off it just calls _single_install one package after another, equivalent to the current implementation ...
wdyt?

@sbidoul (Member) commented Apr 18, 2021

@bmartinn I understand the default behaviour is meant to be the same, but the code to achieve that changes a lot and that's what makes me uncomfortable, as the RM, so close to the release.

If I take the point of view of a reviewer (sorry to be late to the party), I'd say that this part of the code base always required me to squint to understand what was going on in the error handling. And the change arguably does not make it easier to reason about. I wonder if there could be some prior refactoring that would make parallel install easier to implement and read.

@sbidoul modified the milestones: 21.1, 21.2 on Apr 18, 2021
@bmartinn (Author)

I understand the default behavior is meant to be the same, but the code to achieve that changes a lot and that's what makes me uncomfortable, as the RM, so close to the release.

Thanks @sbidoul totally understood :)
Let's try to merge it as quickly as possible after the 21.1 release, so we do not end up in the same place.
I'm hoping that once we have this PR merged we can start working on parallelizing the resolver; I already have some work done there, and I'm hoping it could be the next major speed improvement.

@bmartinn (Author)

Hi @sbidoul, as far as I understand, milestone 21.1 is behind us; any reason not to merge?

@kashankrm

Hey @sbidoul, not to be pushy, but any chance of getting some kind of estimate for when you will have time for this?

@sbidoul (Member) commented May 30, 2021

Hi @bmartinn, I had a look at this again, thanks for your patience.

To be frank, my review comment above still applies:

If I take the point of view of a reviewer (sorry to be late to the party), I'd say that this part of the code base always required me to squint to understand what was going on in the error handling. And the change arguably does not make it easier to reason about. I wonder if there could be some prior refactoring that would make parallel install easier to implement and read.

For instance, passing suppress_exception to install_given_reqs is a bit of a code smell to me. It would probably be easier to read if there were an install_given_req function that is wrapped in another function that catches exceptions, to be used in multiprocessing mode.
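
Presumably something along these lines (a sketch of the suggested refactoring, not actual pip code; the names follow the comment above):

    def install_given_req(req, install_args):
        # Plain single-requirement install; exceptions propagate to the caller.
        req.install(*install_args)


    def _install_given_req_catching(args):
        # Thin wrapper used only by the multiprocessing pool: it turns
        # exceptions into return values so the pool keeps going.
        req, install_args = args
        try:
            install_given_req(req, install_args)
            return None
        except BaseException as exc:
            return exc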

Also, one thing I don't quite understand is why there is this intermixing of parallel and sequential installation, with this index array of results and this "except KeyError" that signals that it was a sequential install. That makes the code hard for me to understand (although I may very well have missed a good reason). Would it be leaner if we split the requirements first, then called one function for parallel installs and another one for sequential installs, and then grouped the results for display?

@bmartinn (Author)

Hi @sbidoul

Also, one thing I don't quite understand is why there is this intermixing of parallel and sequential installation

I think this is the major point to stress: not all requirements can be parallelized. Only pre-built wheels can be safely installed in parallel, as this is basically unzipping. Any local/git package install might actually run code, and hence for those we need to keep the execution order as set by the resolver. This is why we first have to install all the wheels in parallel and then go over the other required packages and install them in order. There is no easy way out of it, at least as far as I can currently see...
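
Concretely, the split described above might look roughly like this (a sketch; the attribute names follow the req.is_wheel / req.should_reinstall check quoted later in the thread):

    def split_requirements(requirements):
        # Only pre-built wheels that do not replace an existing install are
        # safe to unpack concurrently; everything else keeps resolver order.
        parallel_reqs, serial_reqs = [], []
        for req in requirements:
            if req.is_wheel and not req.should_reinstall:
                parallel_reqs.append(req)
            else:
                serial_reqs.append(req)
        return parallel_reqs, serial_reqs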

Now regarding the implementation itself, I have to admit that by now I have lost the context of your original comment.
Please feel free to point to specific code to change, I'd be more than happy to improve / fix / make it more human readable.

p.s.
I would really love for this year-old PR to finally make its way into pip, as it really accelerates the install process for large Python environments.
Let's try to get it done :)

@BrownTruck (Contributor)

Hello!

I am an automated bot and I have noticed that this pull request is not currently able to be merged. If you are able to either merge the master branch into this pull request or rebase this pull request against master then it will be eligible for code review and hopefully merging!

@BrownTruck added the "needs rebase or merge" label (PR has conflicts with current master) on Jun 7, 2021
@sbidoul (Member) commented Jun 13, 2021

Sorry for the slow turnaround here.

not all requirements can be parallelized

I understand that. Nevertheless I suspect the code could be made easier to understand and maintain by doing something like

parallel_reqs, sequential_reqs = split_reqs()
parallel_results = install_parallel(parallel_reqs)
sequential_results = install_sequential(sequential_reqs)
# merge and print results...

Also, in terms of correctness, installation may imply uninstallation. And in some cases, uninstallation may not be multiprocess-safe? I'm thinking in particular of the uninstallation of packages that were installed with setup.py develop and are replaced by a parallel wheel installation. If you have several of them there might be concurrency issues on easy-install.pth (see UninstallPthEntries). So we probably also want to exclude packages that are legacy-installed from parallel installation.

Pushing this reasoning a bit further, I wonder if we should not be even stricter and say that parallel installation can only be enabled when all InstallRequirement objects are wheels (and are not already legacy-installed). This would give us a clean code path without legacy baggage for parallel installation, and we can keep the legacy path untouched.

@pfmoore (Member) commented Jun 13, 2021

Pushing this reasoning a bit further, I wonder if we should not be even stricter and say that parallel installation can only be enabled when all InstallRequirement objects are wheels (and are not already legacy-installed).

I like this idea - it's a lot cleaner, and it emphasises that the legacy code paths are just that - legacy, and people shouldn't expect to see new features/improvements like this unless they move away from the legacy approaches. Encouraging people to use more modern approaches would be an extra benefit of this change, if we took that route.

@uranusjr (Member)

parallel installation can only be enabled when all InstallRequirement objects are wheels

Or can be built as a wheel, I presume? In other words, it is still possible to parallelise the installation if some of the requirements are sdists or source archives, as long as they can all be (serially) built into wheels first. I’m +1 if that’s the case.

@bmartinn (Author)

Also, in terms of correctness, installation may imply uninstallation. And in some cases, uninstallation may not be multiprocess-safe ? ...

@sbidoul actually for this specific reason any package that needs to be reinstalled (i.e. uninstall+install) will be installed serially (from the code: if not req.should_reinstall and req.is_wheel).

Pushing this reasoning a bit further, I wonder if we should not be even stricter and say that parallel installation can only be enabled when all InstallRequirement objects are wheels

@uranusjr I "think" that as part of pip's caching strategy, any source-code-based install (think installing directly from GitHub, for example) is first built into a wheel (before the "actual" install is called), and then the temporary/cached wheel is passed to the install function.

@sbidoul, can we agree on this approach? Basically: if all packages are wheels, install in parallel; otherwise install serially (with "reinstalled" packages installed serially, after the parallel process).

@sbidoul (Member) commented Jun 15, 2021

@sbidoul actually for this specific reason any package that needs to be reinstalled (i.e. uninstall+install) will be installed serially (from the code: if not req.should_reinstall and req.is_wheel).

Good! I had missed that bit.

@sbidoul, can we agree on this approach? Basically: if all packages are wheels, install in parallel; otherwise install serially (with "reinstalled" packages installed serially, after the parallel process).

I'd say if all packages are wheels and not already installed, install in parallel; otherwise install serially. This can later evolve to "if all packages are wheels and not legacy-installed, install in parallel".

@uranusjr (Member)

I'd say if all packages are wheels and not already installed, install in parallel; otherwise install serially.

Another wrinkle to this: if any package is already installed and needs to be upgraded to another version (or --force-reinstall is supplied; basically, if a package needs to be uninstalled first to make way for a new install), the uninstallation part of the upgrade must also happen serially, before parallel installation can happen. I’m not sure whether this can be done trivially, so it could be a good idea to leave this case out (and fall back to serially uninstall-install for each package) if it can’t.

@bmartinn (Author)

... so it could be a good idea to leave this case out (and fall back to serially uninstall-install for each package) if it can’t.

Yep, I agree.
Let me see what I can do here :)

@uranusjr removed this from the 21.2 milestone on Jul 24, 2021
@pradyunsg (Member)

Closing this out given the rebase conflicts and the lack of activity here! Thanks for filing this @bmartinn; please do feel welcome to pick this back up and file a new PR if you find the time to do so! ^>^

@pradyunsg closed this on May 8, 2023
@github-actions bot locked as resolved and limited conversation to collaborators on May 24, 2023
Labels: needs rebase or merge (PR has conflicts with current master), type: enhancement (Improvements to functionality)