-
-
Notifications
You must be signed in to change notification settings - Fork 259
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Handle locking all direct reference URL forms. #2060
Conversation
@@ -213,8 +213,9 @@ def _chmod(self, info, path): | |||
# This magic works to extract perm bits from the 32 bit external file attributes field for | |||
# unix-created zip files, for the layout, see: | |||
# https://www.forensicswiki.org/wiki/ZIP#External_file_attributes | |||
attr = info.external_attr >> 16 | |||
os.chmod(path, attr) | |||
if info.external_attr > 0xFFFF: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was a bug exposed by GitHub zips, they have no perm bits set (signs of MS I think). Their tarballs present no such problems.
components = fname.rsplit("-", 1) | ||
if len(components) == 2: | ||
project_name, version = components | ||
return cls(project_name=project_name, version=version) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will this cause issues if the URL coincidentally happens to have a single -
in it, but isn't an sdist? For instance: given https://example.com/some-package.zip
(or some other suffix that's more version-like), it seems like this'll happily return ProjectNameAndVersion(project_name="some", version="package")
and thus _extract_resolve_data
and its callers might not fall into the correct code path (and/or crash)?
If I'm understanding this correctly, one idea might be not trying to handle sdists specially, and let them fall through to this new code for arbitrary archives, since I imagine the pip logging is similar?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the wrong place to fix and, as you suggest, just not trying to handle non-wheels at all in the locker should work now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, right - no. This is complicated. I'll need to revisit the case of additional artifacts. You can't get additional artifacts for an archive URL direct reference - direct reference requirements are always singular. But for a normal sdist from PyPI, it can be just one of many archives that serve as artifacts for a universal lock of a particular pin - and this is where the current importance of identifying a Pin cheaply lies. More work on this tomorrow am...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, this was indeed complicated, but really only due to organic growth in the Locker logic. I took a clean-slate approach to that and it's now much simpler. The upshot is that parsing name and version from an artifact is now only ever done for wheels - which it's guaranteed to work for, and for links Pip identified as candidates via the file name / index structure. These can include sdists that are never built and we can rely on Pip here and avoid a build just to get name / version it has already blessed.
This last commit represents a very large change. I may close this out after vetting green across the various Pip permutations checked by CI and post a new PR with the changes split up. |
Previously, it was assumed that direct reference URLs were either VCS URLs or else wheel or sdist URLs. This neglected two remaining cases, both of which failed to lock with better or worse error messages. The two missed cases were: 1. URLs of source archives not conforming to the sdist quasi-standard naming convention of `<project name>-<version>.{.tar.gz,.zip}`. 2. Local file:// URLs pointing at project directories. A notable case of the 1st are project archives provided by GitHub. A notable need for the 2nd case comes from Pants where Pip proprietary requirement strings are not handled (e.g.: `/path/to/project`) and a direct reference URL must be used instead (e.g.: `projectname @ /path/to/project`). Fixes pex-tool#2057
Avoid parsing name and version from anything but wheels, where there is a hard standard.
Alright 83583e2 is now solid. PTAL. |
hashing.file_hash(source_archive_path, digest) | ||
source_fingerprint = Fingerprint.from_digest(digest) | ||
self._selected_path_to_pin[selected_path] = build_result.pin | ||
elif "file" == artifact_url.scheme: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hard for me to tell by eyeballing, but are we certain we will never encounter a schemeless URL (i.e., just a file path) in the parsed output, that will eventually get passed here?
In other words, should also check for a None scheme and treat it like a file: ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, certain. ITs cover that case across Pex's closed universe of Pips. Pip records all of .
, $PWD
and file://$PWD
as file://$PWD
in the logs. The only anomaly is in its Saved ...
log lines (which are used by the lock log analyzer to filter out backtracks). In those lines local file paths that are directories never appear. This one anomaly carries a fair bit of supporting code and tests of its own.
Previously, it was assumed that direct reference URLs were either VCS
URLs or else wheel or sdist URLs. This neglected two remaining cases,
both of which failed to lock with better or worse error messages. The
two missed cases were:
naming convention of
<project name>-<version>.{.tar.gz,.zip}
.file://
URLs pointing at project directories.A notable case of the 1st are project archives provided by GitHub.
A notable need for the 2nd case comes from Pants where Pip proprietary
requirement strings are not handled (e.g.:
/path/to/project
) and adirect reference URL must be used instead (e.g.:
projectname @ file:///path/to/project
).Fixes #2057