Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplify PackageFinder best candidate API some more #6787

Merged
merged 7 commits into from
Aug 21, 2019

Conversation

cjerdonek
Copy link
Member

@cjerdonek cjerdonek commented Jul 24, 2019

This PR makes a few minor changes to PackageFinder's "best candidate" API to simplify things and improve the readability / make it easier to understand. Specifically, the PR does the following--

  • Rename PackageFinder.find_candidates() to PackageFinder.find_best_candidate() because that is what the method is used for. There is also already a PackageFinder.find_all_candidates() method, so the rename makes things less confusing since there are now no longer two methods find_all_candidates() and find_candidates() with similar-sounding names.

  • Change the type of the return value of PackageFinder.find_best_candidate() from FoundCandidates to BestCandidateResult (by renaming FoundCandidates to BestCandidateResult), since the purpose of this class is to expose the best candidate.

  • Compute the best_candidate and pass it to the BestCandidateResult constructor instead of computing the best candidate lazily. The best candidate is now exposed as a best_candidate attribute of the BestCandidateResult object instead of via FoundCandidates.get_best(). This is better for a couple reasons: (1) the applicable candidates and best candidate are now both handled the same way (namely by passing them in when instantiating instead of computing lazily), and (2) we no longer need to pass the whole CandidateEvaluator object into the FoundCandidates object, eliminating a bit of circularity that was of concern before. This completes the process of converting FoundCandidates to a simple data container class, so that it doesn't actually compute anything on its own.

  • Rename the current CandidateEvaluator.get_best_candidate() to CandidateEvaluator.sort_best_candidate() because this is the method that gets the best candidate by sorting, and add a new CandidateEvaluator.compute_best_candidate() method responsible for filtering applicable candidates, calling sort_best_candidate(), and constructing the BestCandidateResult object.

  • Adds to the BestCandidateResult constructor the assertion that @xavfernandez requested here: Simplify FoundCandidates #6684 (comment)

@cjerdonek cjerdonek added C: finder PackageFinder and index related code skip news Does not need a NEWS file entry (eg: trivial changes) type: refactor Refactoring code labels Jul 24, 2019
@xavfernandez
Copy link
Member

I must admit I'm getting kind of confused in the multiple refactors and the new CandidateEvaluator, CandidatePreferences, LinkEvaluator, FoundCandidates, BestCandidateResult.

It might help to write down how it currently works and where we want to go ?

@pfmoore
Copy link
Member

pfmoore commented Jul 26, 2019

Agreed - with all the various refactorings going on at the moment, I think this is a great opportunity to create additional internal documentation explaining how the various utility classes we have work.

The old classes weren't well documented, but people had over time gained a rough feeling for how things hung together. Significant refactoring without adding docs loses that implicit knowledge, so better if we can capture some details in writing while they are still fresh.

@cjerdonek
Copy link
Member Author

For now, where would you like me to document those things — in the comments to this PR or to make a separate documentation PR? These docs would only be for pip maintainers (us) and contributors, as they would document the code and not the public API. Before this refactoring started, the various responsibilities were all handled by one giant PackageFinder class. Now distinct stages of the process are handled by different classes, which makes certain things easier like testing, reasoning about and adding new features to individual stages (e.g. yanked files and preferring hash matches), and knowing which options affect which stages.

@pfmoore
Copy link
Member

pfmoore commented Jul 26, 2019

I'd go for comment blocks in the code (or if you get really verbose, .txt files alongside the source files :-)) But definitely in the actual sources, not in github comments.

I'd assume it's easier to include in this PR while the details are fresh in your mind (these are internal usage notes, so polishing the wording isn't necessary), but it's up to you how you prefer to organise the work - if you find it easier to do as a separate PR, I'm fine with that.

@pradyunsg
Copy link
Member

pradyunsg commented Jul 26, 2019

Internal Documentation. \o/

Here's how I'd planned to structure it while working on the overall architecture docs, at PyCon 2019:

  • add development/architecture/ to the html documentation
  • index is a table of contents + a 1 or 2 paragraph overview + a note that pip's internals are not for external consumers but maintainers and contributors.
  • each component within the codebase can get its own dedicated page/section within that as necessary.
    • think download, finder, vcs, build, install, configuration, uninstall, resolution etc.
    • Example url: development/architecture/download.html
  • A contributor friendly guide that connects the dots for all these parts.

@pradyunsg
Copy link
Member

pradyunsg commented Jul 26, 2019

I'll be happy to initiate the skeleton in a PR for the above -- I think I already have some this stuff written.

If no one thinks that has major issues (it's something we can add to incrementally and then require updating on major refactors), I'll file the PR.

@pfmoore
Copy link
Member

pfmoore commented Jul 26, 2019

@pradyunsg That sounds really nice - but more than I'm talking about here, which is really just "capture the insights that resulted in the current refactoring, so that people who had a feel for the old layout can find their way round the new layout (and ideally, people who didn't know the old layout have a head start understanding the current one)".

@pradyunsg
Copy link
Member

@pfmoore yep yep -- that's the kind of content I imagined the dedicated pages would contain. 🙈

@cjerdonek
Copy link
Member Author

My preference would be if it's something I can work on / write up now rather than having to wait for an architecture document for all of pip to be written, reviewed, and merged. (I already see the scope increasing quite a bit in the few hours since the initial request.)

@pradyunsg
Copy link
Member

@cjerdonek would the following work for you: add a docs/html/development/architecture/ folder and a file named whatever you want, with whatever content you deem necessary (and :orphan: at the top of the file). :)

I'll be happy to do the rest in a follow-up PR after that.

@pfmoore
Copy link
Member

pfmoore commented Jul 26, 2019

@cjerdonek +1 from me. Someone can move the information later if they want to make it more formal. And I apologise if it sounds like scope creep. I was only ever intending to support @xavfernandez request - not to ask for anything more than that.

@cjerdonek
Copy link
Member Author

@cjerdonek would the following work for you:

That would be okay, thanks.

I was only ever intending to support @xavfernandez request - not to ask for anything more than that.

And yes, understood, @pfmoore. I appreciate it.

@cjerdonek
Copy link
Member Author

Okay, I was finally able to finish drafting the architecture document / section for PackageFinder / index.py that a number of you requested. It includes sections for a number of the key classes in index.py (and in particular the ones mentioned in this PR). I'm sure that this can be fine-tuned, but I think it provides a good start.

I also addressed @chrahunt's review comment on the code portion (which was to add an additional assertion to parallel the one that @xavfernandez had requested earlier).

* Rename FoundCandidates to BestCandidateResult.

* Rename CandidateEvaluator's make_found_candidates() to
  compute_best_candidate().

* Rename CandidateEvaluator's get_best_candidate() to
  sort_best_candidate().

* Rename PackageFinder's find_candidates() to find_best_candidate().
@cjerdonek
Copy link
Member Author

PR updated. I moved the architecture section to a separate file as requested.

@atugushev
Copy link
Contributor

Thanks for the documentation! It seems we got a lot of work to do in pip-tools :)

@cjerdonek
Copy link
Member Author

Hi, @atugushev! Most of this documentation is of what was there before. The code changes in the PR are mostly just some renames that may not even affect you.

@pradyunsg
Copy link
Member

Happy to merge this.

@cjerdonek cjerdonek merged commit 9ef8116 into pypa:master Aug 21, 2019
@cjerdonek cjerdonek deleted the best-candidate-result branch August 21, 2019 20:13
@atugushev
Copy link
Contributor

@cjerdonek you were right, that was easy 👍

@cjerdonek
Copy link
Member Author

Great!

atugushev added a commit to atugushev/pip-tools that referenced this pull request Sep 17, 2019
The `get_best_candidate` has been renamed to `compute_best_candidate`,
which now returns an instance of `BestCandidateResult`.

See pypa/pip#6787
@lock lock bot added the auto-locked Outdated issues that have been locked by automation label Sep 20, 2019
@lock lock bot locked as resolved and limited conversation to collaborators Sep 20, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
auto-locked Outdated issues that have been locked by automation C: finder PackageFinder and index related code skip news Does not need a NEWS file entry (eg: trivial changes) type: refactor Refactoring code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants