Feature request: keep track of how packages are installed (manually, automatically, ...) #9812

Closed
bersbersbers opened this issue Apr 17, 2021 · 5 comments
Labels
resolution: no action When the resolution is to not do anything

Comments

@bersbersbers

bersbersbers commented Apr 17, 2021

What's the problem this feature will solve?
Large requirement files with many packages are a pain to install from. I am just migrating from Python 3.8 to 3.9 using a requirements file with 280 entries, and it's not going well. I have already relaxed or removed some requirements manually, and at one point, pip tried to downgrade itself to some version 6.x.

I think the problem is due to the new resolver and the fact that finding a working combination of 280 packages is hard. In addition, there are cases (such as when upgrading all installed packages) when it is unclear if A should be upgraded at the cost of downgrading (or not upgrading) B, or B should be upgraded at the cost of downgrading A. All of this will become more complicated once pip considers installed packages.

What I am missing in pip is some prioritization of packages. One manual way of doing this (in my case) is to include only packages at the top level of pipdeptree in the requirements file. This way, pip could focus on installing (new) versions of packages I am actually interested in using.

However, pipdeptree is only one indication - I may still be using some of the dependent packages directly. A better way would be to keep track of what the user actually installed themselves. Hence, when issuing pip install tensorflow, pip should remember tensorflow as a high-priority/high-interest package, while six is an automatically installed dependency.

How is this useful? When considering package upgrades, the following may happen:
We have A==2 and B==2 installed. When upgrading both A and B, we may have A==3 available, which requires B==2, and B==3, which requires A==2. (I notice this is a tricky example due to potential cyclic dependencies, but one can easily construct a similar example using two incompatible packages C and D on which A==3 and B==3 depend, respectively.) Installing both A==3 and B==3 is thus impossible, and the resolver needs to determine which package to upgrade. I have no idea how this is currently done, but knowing that A (tensorflow) was installed manually and B (six) is only a dependency should allow the resolver to prioritize the upgrade of A.
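
The A/B scenario above can be sketched as a toy model. This is only an illustration of the proposed tie-break, not how pip's resolver works: `INDEX`, `is_consistent`, and `resolve` are hypothetical names, and summing version numbers is a deliberately crude stand-in for "prefer newer".

```python
from itertools import product

# Hypothetical toy index: package -> version -> constraints on other packages
# (name -> set of allowed versions). These are not pip data structures.
INDEX = {
    "A": {2: {}, 3: {"B": {2}}},   # A==3 requires B==2
    "B": {2: {}, 3: {"A": {2}}},   # B==3 requires A==2
}


def is_consistent(pins):
    """Return True if every pinned version's constraints accept the other pins."""
    for name, version in pins.items():
        for dep, allowed in INDEX[name][version].items():
            if pins[dep] not in allowed:
                return False
    return True


def resolve(requested):
    """Pick a consistent pin set, favouring newer versions of requested packages.

    Newer versions of the remaining packages are only used as a secondary
    criterion.
    """
    names = sorted(INDEX)
    candidates = [
        dict(zip(names, versions))
        for versions in product(*(INDEX[n] for n in names))
    ]
    consistent = [pins for pins in candidates if is_consistent(pins)]

    def score(pins):
        return (
            sum(v for n, v in pins.items() if n in requested),
            sum(v for n, v in pins.items() if n not in requested),
        )

    return max(consistent, key=score)


prefer_a = resolve({"A"})  # A was installed manually, B is only a dependency
prefer_b = resolve({"B"})  # the reverse situation
```

With A marked as requested, the sketch upgrades A and keeps B at 2; with B marked instead, it does the opposite. Without any requested information, both outcomes score the same, which is the ambiguity described above.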

Excuse me if this has been discussed before, but I have not found any such discussion here yet.

Describe the solution you'd like
apt list --installed shows something like this:

x11-utils/focal,now 7.7+5 amd64 [installed]
xauth/focal,now 1:1.1-0ubuntu1 amd64 [installed,automatic]

While I installed x11-utils myself, xauth was installed automatically.

This information should be tracked by pip in my opinion, and used in later operations.

Users should then also have tools to (un)mark an installed package as relevant/prioritized/... (or just "automatic" if that is functionally the same thing).
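
A minimal sketch of what such a (un)mark tool could look like, assuming the marker is the empty `REQUESTED` file that PEP 376 defines inside a distribution's `.dist-info` directory (the PEP is linked later in this thread). `set_requested` is a hypothetical helper, not a pip command, and the `six-1.15.0.dist-info` directory below is a made-up example created in a temporary folder. The sketch does not update the `RECORD` file, which a real tool would probably need to do.

```python
import tempfile
from pathlib import Path


def set_requested(dist_info_dir, requested):
    """Create or remove the REQUESTED marker in a .dist-info directory.

    Returns True if the marker exists afterwards. Hypothetical helper:
    it does not touch RECORD and does not validate the directory.
    """
    marker = Path(dist_info_dir) / "REQUESTED"
    if requested:
        marker.touch()                  # an empty file is enough as a marker
    else:
        marker.unlink(missing_ok=True)  # no error if it was never marked
    return marker.exists()


# Demonstration on a throwaway directory instead of a real installation.
with tempfile.TemporaryDirectory() as tmp:
    dist_info = Path(tmp) / "six-1.15.0.dist-info"  # made-up example name
    dist_info.mkdir()
    marked = set_requested(dist_info, True)
    unmarked = set_requested(dist_info, False)
```

This mirrors the manual/automatic marking that apt offers, but it is only one possible design.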

@uranusjr
Member

pip already does keep track of this, see #8026. It turns out the idea of “manually installed” is not really that useful due to how installation works, but feel free to experiment with it and see what can be done.

@bersbersbers
Author

Thank you - https://www.python.org/dev/peps/pep-0376/#requested is a valuable resource. I'll use that for some further reading.
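
For anyone who wants to inspect this on their own installation, here is a small sketch that lists which installed distributions carry a `REQUESTED` file, printed in the style of the apt output quoted earlier. It uses only the standard library's `importlib.metadata`; `requested_status` is a name made up for this example, and the output is only as accurate as the markers themselves.

```python
from importlib.metadata import distributions


def requested_status():
    """Map each installed distribution name to whether it has a REQUESTED file."""
    status = {}
    for dist in distributions():
        name = dist.metadata.get("Name")
        if name is None:  # skip distributions with unusable metadata
            continue
        # read_text() returns None when the file does not exist; an empty
        # REQUESTED file yields "" and therefore still counts as present.
        status[name] = dist.read_text("REQUESTED") is not None
    return status


if __name__ == "__main__":
    for name, requested in sorted(requested_status().items()):
        label = "installed" if requested else "installed,automatic"
        print(f"{name} [{label}]")
```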

@pradyunsg pradyunsg added the resolution: no action When the resolution is to not do anything label Apr 17, 2021
@pradyunsg
Member

Closing since I don't think we have anything actionable here.

@bersbersbers
Author

bersbersbers commented Apr 17, 2021

@uranusjr

It turns out the idea of “manually installed” is not really that useful due to how installation works

What do you mean by "due to how installation works"? I understand that once you install from a pip freeze requirements file, the whole concept is useless since every single dependency counts as requested. (I for one have some 150 REQUESTED files in my fresh Python installation, probably for that reason.) Is that the problem you are referring to?

If so, maybe pip freeze (and/or pip list) could be amended to, optionally, output only requested packages? pip list --not-required does something similar already, but by another mechanism (similar to what I described using pipdeptree above).
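
To make the difference between the two mechanisms concrete, here is a rough sketch. It assumes "not required" means "no other installed package depends on it", which is how the pipdeptree top level was described above. The dependency edges and the `requested` set are invented for illustration, and `not_required` is a hypothetical helper.

```python
def not_required(depends_on):
    """Return packages that no other package in the graph depends on."""
    required = {dep for deps in depends_on.values() for dep in deps}
    return set(depends_on) - required


# Invented environment: package names come from this thread, edges are made up.
depends_on = {
    "tensorflow": {"six", "numpy"},
    "numpy": set(),
    "six": set(),
    "pipdeptree": set(),
}

# Assume the user ran `pip install tensorflow six pipdeptree` at some point.
requested = {"tensorflow", "six", "pipdeptree"}

top_level = not_required(depends_on)

# Packages the user asked for that the graph-based view would hide.
hidden_by_graph_view = requested - top_level
```

Here `six` is user-requested but also a dependency of `tensorflow`, so a graph-based listing drops it while a requested-based listing would keep it.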

Maybe pip freeze could even output the requested information, and pip install could use it, when generating and installing from a requirements file.

but feel free to experiment with it and see what can be done.

I will, thank you. Ultimately, I hope this can speed up the resolver, or at least give it some idea from which direction best to approach the vast space of version combinations.

For my reference, are there any features which use the requested thing already?

@uranusjr
Member

Sorry, I was probably too brief in the previous comment. What I meant to say was that the feature was implemented for something else, but that thing ended up requiring much more information than simply “did the user specify the package from the command line”, so the REQUESTED file ended up not being used anywhere, and is therefore not really well-tested in practical environments. So yeah, feel free to experiment with it and maybe show the information in pip list etc., but don’t be surprised if you run into some inaccuracies. You should anticipate users complaining about issues once the feature is released, and be ready to explain to them the subtle difference between your understanding of “manually installed” and theirs (for example, are packages installed from a requirements.txt user-requested or transitive?).

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 29, 2021