Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Split find_matches into generation and sorting #8234

Merged
merged 4 commits into from
May 15, 2020

Conversation

pfmoore
Copy link
Member

@pfmoore pfmoore commented May 13, 2020

OK, this refactors find_matches so that the factory is only responsible for generating a list of acceptable matches, and it's the provider's job to sort them. This means that the set of "root" requirements (the ones input by the user) are no longer needed in the factory and can be saved on the provider, restoring the factory to its role of acting as a stateless interface to the finder, preparer, etc.

The code is still pretty messy. I'm fairly sure that this is mainly because the behaviour we're trying to implement is messy, so I don't think it's something we can eliminate totally. But I've been working on this for a while now, and I may well be too close to it to see obvious simplifications, so I'd appreciate any suggestions on that score.

I'm pretty sure that (style aside) this is OK to go, so IMO it's ready for review.

@pfmoore pfmoore added the skip news Does not need a NEWS file entry (eg: trivial changes) label May 13, 2020
# For an ExtrasCandidate, we check the base
if isinstance(cand, ExtrasCandidate):
cand = cand.base
return isinstance(cand, AlreadyInstalledCandidate)
Copy link
Member

@uranusjr uranusjr May 13, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if it’s better to add an is_installed() on candidates… or even abstract out the keep_installed() function into candidate subclasses.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can't abstract out keep_installed() out since it depends on the upgrade_strategy.

+1 on adding is_installed() -- although, as a read-only property feels a lot cleaner.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I considered that, but wasn't sure if it was too intrusive (it felt a bit specialised). But as both of you seem OK with the idea, I'll do this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. I made is_installed an attribute on the class, and added an equivalent property to the Candidate class. That seems to work, although mypy failed to catch that I forgot to add the attribute to RequiresPythonCandidate. I understand why (there's a default implementation in Candidate) but is there really no way to say in mypy "every subclass of this class needs to have an attribute is_installed"? Or to say "every object of type Candidate must have an is_instance attribute" and have class attributes count?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that’s what Protocol is for, but it’s a pain to use Protocol on comment-style annotations. Python 2 😞

@@ -82,6 +81,8 @@ def resolve(self, root_reqs, check_supported_wheels):
else:
constraints[name] = req.specifier
else:
if req.is_direct and req.name:
roots.add(canonicalize_name(req.name))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if ther would be edge cases where an “unnamed” requirement messes up priority logic later. Would it be a good idea if we resolve all names eagerly here? We have to do it soon for resolvelib anyway.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Honestly, I'm not sure. I'm really cautious about preparing too early, because it feels so easy to end up with another mess like canonical names1, where we keep going "let's resolve names now, because we don't know if it's a problem". That's why I like the new candidate approach of having a lazy property. Hopefully when we switch over to the project model, projects will do the same.

I'd like to understand why we need to resolve the name of an unnamed requirement here, if we do need to (we're talking about pip install --upgrade /local/dir some other stuff where we need to know that /local/dir is a root so that we know to upgrade it... uh, what?) Let's wait till something breaks, then we can write a test that clarifies why we need to do this, and then do it. At the moment, I'm not even sure what pip install --upgrade /local/dir should mean...

1 I added that canonicalize_name call there as a result of a test failure, and it was another case of "just canonicalise, it's easier than trying to work out how come it's not already been done" 🙁

Copy link
Member

@uranusjr uranusjr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logic looks great! I have only one doubt regarding how we populate roots, all other comments are about the interface.

Copy link
Member

@pradyunsg pradyunsg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main refactor looks great to me!

A bunch of naming/cleanup suggestions, which I don't feel too strongly about. :)

src/pip/_internal/resolution/resolvelib/provider.py Outdated Show resolved Hide resolved
src/pip/_internal/resolution/resolvelib/resolver.py Outdated Show resolved Hide resolved
# For an ExtrasCandidate, we check the base
if isinstance(cand, ExtrasCandidate):
cand = cand.base
return isinstance(cand, AlreadyInstalledCandidate)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can't abstract out keep_installed() out since it depends on the upgrade_strategy.

+1 on adding is_installed() -- although, as a read-only property feels a lot cleaner.

src/pip/_internal/resolution/resolvelib/provider.py Outdated Show resolved Hide resolved
@pradyunsg
Copy link
Member

Please remind me to file a PR that is the result of running the following on pip:

find ./src/pip/_internal ./tests/unit -name "*.py" -type f -exec sed -i '' 's/\.is_direct/.is_user_provided/' {} \;

InstallRequirement.{is_direct -> is_user_provided}

@pradyunsg
Copy link
Member

Whoops. I accidentally restarted the Travis CI build here, when I was trying to restart the master Travis CI build. Sorry!

@uranusjr
Copy link
Member

I wonder if this is relevant to #7678.

@pfmoore pfmoore requested a review from uranusjr May 14, 2020 14:29
@pfmoore
Copy link
Member Author

pfmoore commented May 14, 2020

@uranusjr Probably. Also relevant to the discussion we had about test_upgrade_to_same_version_from_url. I think the new resolvelib behaviour (where we merge "all the stuff the resolver has seen" in find_matches) will make these easier to handle, so I propose we defer worrying about them until that resolvelib release is merged.

On that basis, is there anything more you want to see done with this PR before I merge it?

@pradyunsg
Copy link
Member

I'm happy with this as is. LGTM! :)

@pradyunsg
Copy link
Member

@pfmoore :)

I'll do the find | sed after this PR is merged.

@pfmoore
Copy link
Member Author

pfmoore commented May 15, 2020

Assuming the tests pass, I'll merge this once @uranusjr gives the OK. It's blocking a couple of things, now, so I'd like to get it in.

@pfmoore pfmoore merged commit d42c448 into pypa:master May 15, 2020
@pfmoore pfmoore deleted the refactor_find_matches branch May 15, 2020 13:03
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 15, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
skip news Does not need a NEWS file entry (eg: trivial changes) type: refactor Refactoring code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants