Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Split find_matches into generation and sorting #8234

Merged
merged 4 commits into from
May 15, 2020
Merged
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
8 changes: 8 additions & 0 deletions src/pip/_internal/resolution/resolvelib/candidates.py
Original file line number Diff line number Diff line change
Expand Up @@ -101,6 +101,14 @@ def make_install_req_from_dist(dist, parent):
return ireq


def is_already_installed(cand):
# type: (Candidate) -> bool
# For an ExtrasCandidate, we check the base
if isinstance(cand, ExtrasCandidate):
cand = cand.base
return isinstance(cand, AlreadyInstalledCandidate)
Copy link
Member

@uranusjr uranusjr May 13, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if it’s better to add an is_installed() on candidates… or even abstract out the keep_installed() function into candidate subclasses.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can't abstract out keep_installed() out since it depends on the upgrade_strategy.

+1 on adding is_installed() -- although, as a read-only property feels a lot cleaner.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I considered that, but wasn't sure if it was too intrusive (it felt a bit specialised). But as both of you seem OK with the idea, I'll do this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. I made is_installed an attribute on the class, and added an equivalent property to the Candidate class. That seems to work, although mypy failed to catch that I forgot to add the attribute to RequiresPythonCandidate. I understand why (there's a default implementation in Candidate) but is there really no way to say in mypy "every subclass of this class needs to have an attribute is_installed"? Or to say "every object of type Candidate must have an is_instance attribute" and have class attributes count?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that’s what Protocol is for, but it’s a pain to use Protocol on comment-style annotations. Python 2 😞



class _InstallRequirementBackedCandidate(Candidate):
def __init__(
self,
Expand Down
79 changes: 27 additions & 52 deletions src/pip/_internal/resolution/resolvelib/factory.py
Original file line number Diff line number Diff line change
Expand Up @@ -42,8 +42,6 @@


class Factory(object):
_allowed_strategies = {"eager", "only-if-needed", "to-satisfy-only"}

def __init__(
self,
finder, # type: PackageFinder
Expand All @@ -52,21 +50,16 @@ def __init__(
force_reinstall, # type: bool
ignore_installed, # type: bool
ignore_requires_python, # type: bool
upgrade_strategy, # type: str
py_version_info=None, # type: Optional[Tuple[int, ...]]
):
# type: (...) -> None
assert upgrade_strategy in self._allowed_strategies

self.finder = finder
self.preparer = preparer
self._python_candidate = RequiresPythonCandidate(py_version_info)
self._make_install_req_from_spec = make_install_req
self._force_reinstall = force_reinstall
self._ignore_requires_python = ignore_requires_python
self._upgrade_strategy = upgrade_strategy

self.root_reqs = set() # type: Set[str]

self._link_candidate_cache = {} # type: Cache[LinkCandidate]
self._editable_candidate_cache = {} # type: Cache[EditableCandidate]
Expand Down Expand Up @@ -118,64 +111,46 @@ def _make_candidate_from_link(
return ExtrasCandidate(base, extras)
return base

def _eligible_for_upgrade(self, dist_name):
# type: (str) -> bool
if self._upgrade_strategy == "eager":
return True
elif self._upgrade_strategy == "only-if-needed":
return (dist_name in self.root_reqs)
return False

def iter_found_candidates(self, ireq, extras):
# type: (InstallRequirement, Set[str]) -> Iterator[Candidate]
name = canonicalize_name(ireq.req.name)
if not self._force_reinstall:
installed_dist = self._installed_dists.get(name)
can_upgrade = self._eligible_for_upgrade(name)
else:
installed_dist = None
can_upgrade = False
seen_versions = set()
pfmoore marked this conversation as resolved.
Show resolved Hide resolved

# Yield the installed version, if it matches, unless the user
# specified `--force-reinstall`, when we want the version from
# the index instead.
if not self._force_reinstall and name in self._installed_dists:
installed_dist = self._installed_dists[name]
installed_version = installed_dist.parsed_version
if ireq.req.specifier.contains(
installed_version,
prereleases=True
):
seen_versions.add(installed_version)
yield self._make_candidate_from_dist(
dist=installed_dist,
extras=extras,
parent=ireq,
)

found = self.finder.find_best_candidate(
project_name=ireq.req.name,
specifier=ireq.req.specifier,
hashes=ireq.hashes(trust_internet=False),
)
for ican in found.iter_applicable():
if (installed_dist is not None and
installed_dist.parsed_version == ican.version):
if can_upgrade:
yield self._make_candidate_from_dist(
dist=installed_dist,
extras=extras,
parent=ireq,
)
continue
yield self._make_candidate_from_link(
link=ican.link,
extras=extras,
parent=ireq,
name=name,
version=ican.version,
)

# Return installed distribution if it matches the specifier. This is
# done last so the resolver will prefer it over downloading links.
if can_upgrade or installed_dist is None:
return
installed_version = installed_dist.parsed_version
if ireq.req.specifier.contains(installed_version, prereleases=True):
yield self._make_candidate_from_dist(
dist=installed_dist,
extras=extras,
parent=ireq,
)
if ican.version not in seen_versions:
seen_versions.add(ican.version)
yield self._make_candidate_from_link(
link=ican.link,
extras=extras,
parent=ireq,
name=name,
version=ican.version,
)

def make_requirement_from_install_req(self, ireq):
# type: (InstallRequirement) -> Requirement
if ireq.is_direct and ireq.name:
self.root_reqs.add(canonicalize_name(ireq.name))

if ireq.link:
# TODO: Get name and version from ireq, if possible?
# Specifically, this might be needed in "name @ URL"
Expand Down
80 changes: 78 additions & 2 deletions src/pip/_internal/resolution/resolvelib/provider.py
Original file line number Diff line number Diff line change
Expand Up @@ -3,26 +3,101 @@

from pip._internal.utils.typing import MYPY_CHECK_RUNNING

from .candidates import is_already_installed

if MYPY_CHECK_RUNNING:
from typing import Any, Dict, Optional, Sequence, Tuple, Union
from typing import Any, Dict, Optional, Sequence, Set, Tuple, Union

from pip._internal.req.req_install import InstallRequirement
from pip._vendor.packaging.version import _BaseVersion

from .base import Requirement, Candidate
from .factory import Factory

# Notes on the relationship between the provider, the factory, and the
# candidate and requirement classes.
#
# The provider is a direct implementation of the resolvelib class. Its role
# is to deliver the API that resolvelib expects.
#
# Rather than work with completely abstract "requirement" and "candidate"
# concepts as resolvelib does, pip has concrete classes implementing these two
# ideas. The API of Requirement and Candidate objects are defined in the base
# classes, but essentially map fairly directly to the equivalent provider
# methods. In particular, `find_matches` and `is_satisfied_by` are
# requirement methods, and `get_dependencies` is a candidate method.
#
# The factory is the interface to pip's internal mechanisms. It is stateless,
# and is created by the resolver and held as a property of the provider. It is
# responsible for creating Requirement and Candidate objects, and provides
# services to those objects (access to pip's finder and preparer).


class PipProvider(AbstractProvider):
def __init__(
self,
factory, # type: Factory
constraints, # type: Dict[str, SpecifierSet]
ignore_dependencies, # type: bool
upgrade_strategy, # type: str
roots, # type: Set[str]
):
# type: (...) -> None
self._factory = factory
self._constraints = constraints
self._ignore_dependencies = ignore_dependencies
self._upgrade_strategy = upgrade_strategy
self.roots = roots
uranusjr marked this conversation as resolved.
Show resolved Hide resolved

def sort_matches(self, matches):
pradyunsg marked this conversation as resolved.
Show resolved Hide resolved
# type: (Sequence[Candidate]) -> Sequence[Candidate]

# The requirement is responsible for returning a sequence of potential
# candidates, one per version. The provider handles the logic of
# deciding the order in which these candidates should be passed to
# the resolver.
uranusjr marked this conversation as resolved.
Show resolved Hide resolved

# The `matches` argument is a sequence of candidates, one per version,
# which are potential options to be installed. The requirement will
# have already sorted out whether to give us an already-installed
# candidate or a version from PyPI (i.e., it will deal with options
# like --force-reinstall and --ignore-installed).

# We now work out the correct order.
#
# 1. If no other considerations apply, later versions take priority.
# 2. An already installed distribution is preferred over any other,
# unless the user has requested an upgrade.
# Upgrades are allowed when:
# * The --upgrade flag is set, and
# - The project was specified on the command line, or
# - The project is a dependency and the "eager" upgrade strategy
# was requested.

def _eligible_for_upgrade(name):
pradyunsg marked this conversation as resolved.
Show resolved Hide resolved
# type: (str) -> bool
if self._upgrade_strategy == "eager":
return True
elif self._upgrade_strategy == "only-if-needed":
return (name in self.roots)
return False

def keep_installed(c):
# type: (Candidate) -> int
"""Give priority to an installed version?"""
if not is_already_installed(c):
return 0

if _eligible_for_upgrade(c.name):
return 0

return 1
pradyunsg marked this conversation as resolved.
Show resolved Hide resolved

def key(c):
# type: (Candidate) -> Tuple[int, _BaseVersion]
return (keep_installed(c), c.version)

return sorted(matches, key=key)

def get_install_requirement(self, c):
# type: (Candidate) -> Optional[InstallRequirement]
Expand All @@ -45,7 +120,8 @@ def get_preference(
def find_matches(self, requirement):
# type: (Requirement) -> Sequence[Candidate]
constraint = self._constraints.get(requirement.name, SpecifierSet())
return requirement.find_matches(constraint)
matches = requirement.find_matches(constraint)
return self.sort_matches(matches)

def is_satisfied_by(self, requirement, candidate):
# type: (Requirement, Candidate) -> bool
Expand Down
19 changes: 11 additions & 8 deletions src/pip/_internal/resolution/resolvelib/resolver.py
Original file line number Diff line number Diff line change
Expand Up @@ -32,6 +32,8 @@


class Resolver(BaseResolver):
_allowed_strategies = {"eager", "only-if-needed", "to-satisfy-only"}

def __init__(
self,
preparer, # type: RequirementPreparer
Expand All @@ -47,30 +49,27 @@ def __init__(
py_version_info=None, # type: Optional[Tuple[int, ...]]
):
super(Resolver, self).__init__()

assert upgrade_strategy in self._allowed_strategies

self.factory = Factory(
finder=finder,
preparer=preparer,
make_install_req=make_install_req,
force_reinstall=force_reinstall,
ignore_installed=ignore_installed,
ignore_requires_python=ignore_requires_python,
upgrade_strategy=upgrade_strategy,
py_version_info=py_version_info,
)
self.ignore_dependencies = ignore_dependencies
self.upgrade_strategy = upgrade_strategy
self._result = None # type: Optional[Result]

def resolve(self, root_reqs, check_supported_wheels):
# type: (List[InstallRequirement], bool) -> RequirementSet

# The factory should not have retained state from any previous usage.
# In theory this could only happen if self was reused to do a second
# resolve, which isn't something we do at the moment. We assert here
# in order to catch the issue if that ever changes.
# The persistent state that we care about is `root_reqs`.
assert len(self.factory.root_reqs) == 0, "Factory is being re-used"

constraints = {} # type: Dict[str, SpecifierSet]
roots = set()
pradyunsg marked this conversation as resolved.
Show resolved Hide resolved
requirements = []
for req in root_reqs:
if req.constraint:
Expand All @@ -82,6 +81,8 @@ def resolve(self, root_reqs, check_supported_wheels):
else:
constraints[name] = req.specifier
else:
if req.is_direct and req.name:
roots.add(canonicalize_name(req.name))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if ther would be edge cases where an “unnamed” requirement messes up priority logic later. Would it be a good idea if we resolve all names eagerly here? We have to do it soon for resolvelib anyway.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Honestly, I'm not sure. I'm really cautious about preparing too early, because it feels so easy to end up with another mess like canonical names1, where we keep going "let's resolve names now, because we don't know if it's a problem". That's why I like the new candidate approach of having a lazy property. Hopefully when we switch over to the project model, projects will do the same.

I'd like to understand why we need to resolve the name of an unnamed requirement here, if we do need to (we're talking about pip install --upgrade /local/dir some other stuff where we need to know that /local/dir is a root so that we know to upgrade it... uh, what?) Let's wait till something breaks, then we can write a test that clarifies why we need to do this, and then do it. At the moment, I'm not even sure what pip install --upgrade /local/dir should mean...

1 I added that canonicalize_name call there as a result of a test failure, and it was another case of "just canonicalise, it's easier than trying to work out how come it's not already been done" 🙁

requirements.append(
self.factory.make_requirement_from_install_req(req)
)
Expand All @@ -90,6 +91,8 @@ def resolve(self, root_reqs, check_supported_wheels):
factory=self.factory,
constraints=constraints,
ignore_dependencies=self.ignore_dependencies,
upgrade_strategy=self.upgrade_strategy,
roots=roots,
)
reporter = BaseReporter()
resolver = RLResolver(provider, reporter)
Expand Down
3 changes: 2 additions & 1 deletion tests/unit/resolution_resolvelib/conftest.py
Original file line number Diff line number Diff line change
Expand Up @@ -55,7 +55,6 @@ def factory(finder, preparer):
force_reinstall=False,
ignore_installed=False,
ignore_requires_python=False,
upgrade_strategy="to-satisfy-only",
py_version_info=None,
)

Expand All @@ -66,4 +65,6 @@ def provider(factory):
factory=factory,
constraints={},
ignore_dependencies=False,
upgrade_strategy="to-satisfy-only",
roots=set(),
)