get sha256 hash from /simple (PEP503) endpoint #120

Open
graingert opened this issue Jun 25, 2020 · 9 comments

@graingert
Contributor

What's the problem this feature will solve?

When using devpi or other non-pypi.org servers, hashing falls back to downloading the asset and hashing it locally.
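For context, the local fallback amounts to something like the sketch below. It is not hashin's actual code: the function name `sha256_of_stream` and the in-memory stand-in for a downloaded file are assumptions made for illustration.

```python
import hashlib
import io


def sha256_of_stream(stream, chunk_size=65536):
    """Return the hex SHA-256 digest of a binary file-like object.

    Hypothetical helper: reads in chunks so a large artifact is never
    held in memory all at once.
    """
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


# Stand-in for a wheel or sdist that has already been downloaded.
fake_artifact = io.BytesIO(b"pretend this is a wheel")
print(sha256_of_stream(fake_artifact))
```

The cost is the download itself: every release file has to be fetched in full just to compute a digest the index may already publish.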

Describe the solution you'd like

Use the sha256 hash from the /simple endpoint. pypi.org and devpi both provide sha256 hashes as a fragment in their href.

The hash is optional and may not use the user's preferred hash function, so pip-compile should still fall back on the JSON API or on downloading assets:

The URL SHOULD include a hash in the form of a URL fragment with the following syntax: #<hashname>=<hashvalue>, where <hashname> is the lowercase name of the hash function (such as sha256) and <hashvalue> is the hex encoded digest.
Repositories SHOULD choose a hash function from one of the ones guaranteed to be available via the hashlib module in the Python standard library (currently md5, sha1, sha224, sha256, sha384, sha512). The current recommendation is to use sha256.

For example, Artifactory's PyPI implementation only puts md5 in the fragment of its simple href: https://www.jfrog.com/jira/browse/RTFACT-18495
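A minimal sketch of reading those fragments, using only the standard library. The class and function names, the sample HTML, and the `index.example` host are all made up for illustration; this is not how hashin or pip-compile currently work.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class SimpleIndexParser(HTMLParser):
    """Collect (url, hash_name, hex_digest) for every anchor on a /simple page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.files = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Resolve relative links, then split off the "#<hashname>=<hashvalue>" part.
        url, fragment = urldefrag(urljoin(self.base_url, href))
        hash_name, sep, digest = fragment.partition("=")
        if sep and digest:
            self.files.append((url, hash_name.lower(), digest))
        else:
            # The fragment is optional, so record the file without a hash.
            self.files.append((url, None, None))


def sha256_hashes(html, base_url):
    """Map file URL -> sha256 hex digest, skipping links with another or no hash."""
    parser = SimpleIndexParser(base_url)
    parser.feed(html)
    return {url: digest for url, name, digest in parser.files if name == "sha256"}


# Made-up page: one link with sha256, one md5-only link (the Artifactory case).
SAMPLE_HTML = """
<html><body>
<a href="../../packages/example-1.0.tar.gz#sha256=aaaa">example-1.0.tar.gz</a>
<a href="../../packages/example-1.0-py3-none-any.whl#md5=bbbb">example-1.0-py3-none-any.whl</a>
</body></html>
"""

print(sha256_hashes(SAMPLE_HTML, "https://index.example/simple/example/"))
```

Files missing from the returned mapping are the ones that would still need the JSON API or a local download.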

Alternative Solutions

devpi/devpi#801 (comment)

Additional context

/cc @fschulze
jazzband/pip-tools#1109
view-source on: https://m.devpi.net/root/pypi/+simple/devpi-server/
and view-source on: https://pypi.org/simple/devpi-server/

@graingert
Contributor Author

Another option that would be standardized across HTTP hosts
https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Want-Digest
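A rough sketch of what that could look like on the client side, assuming the server answers with a `Digest` response header of the form `sha-256=<base64 digest>`. A server is free to ignore `Want-Digest`, so a fallback would still be needed. The helper names are hypothetical, and nothing here contacts the network.

```python
import base64
import hashlib
import urllib.request


def build_digest_request(url):
    """Build (but do not send) a HEAD request asking for a SHA-256 digest."""
    return urllib.request.Request(
        url, method="HEAD", headers={"Want-Digest": "sha-256"}
    )


def sha256_hex_from_digest_header(value):
    """Extract a hex SHA-256 digest from a Digest header value, or None.

    Assumes comma-separated "<algorithm>=<base64 digest>" items.
    """
    for item in value.split(","):
        algorithm, sep, encoded = item.strip().partition("=")
        if sep and algorithm.lower() == "sha-256":
            return base64.b64decode(encoded).hex()
    return None


# Simulate the header a cooperating server might send for some payload.
payload = b"pretend this is a wheel"
header = "sha-256=" + base64.b64encode(hashlib.sha256(payload).digest()).decode("ascii")
print(sha256_hex_from_digest_header(header))
```

Note the encoding difference: the header carries base64, while requirements files and the /simple fragment use hex, so a conversion step is needed either way.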

@peterbe
Owner

peterbe commented Jun 25, 2020

Pardon my ignorance, but what does this mean simplified?
At the moment hashin uses $index_url/pypi/$package_name/json and the default is index_url = https://pypi.org/
So what would it be in your suggestion?

@graingert
Contributor Author

graingert commented Jun 25, 2020

@peterbe see the page source of https://m.devpi.net/root/pypi/+simple/devpi-server/
Each of the URLs has a #sha256= fragment.

@peterbe
Owner

peterbe commented Jun 25, 2020

Yeah, or https://pypi.org/simple/hashin/

But that's not JSON. That would require parsing the HTML, no?

@graingert
Contributor Author

The simple index API is an HTML subset, designed to be amenable to simple processing:

https://github.com/pypa/pip/blob/0b5ad47cbfe986335790e728b787c580b0b3c8b1/src/pip/_vendor/distlib/locators.py#L821-L822

@fschulze

Another thing that would help is if get_package_data didn't use an absolute path. Then the index could start with the path components required by devpi (/user/index/...), and devpi (or a plugin for it) could provide the JSON hashin needs. Currently this isn't possible because, regardless of the path in the index_url, it will always go to /pypi/%s/json and lose the path from index_url.
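The effect can be illustrated with `urllib.parse.urljoin`. This only demonstrates the URL-resolution behaviour described above, not hashin's actual code, which may build the URL differently; both function names and the `devpi.example` host are hypothetical.

```python
from urllib.parse import urljoin


def json_url_absolute(index_url, package):
    """Leading slash: the path of index_url is discarded entirely."""
    return urljoin(index_url, "/pypi/%s/json" % package)


def json_url_relative(index_url, package):
    """No leading slash: the path of index_url is kept as a prefix."""
    base = index_url if index_url.endswith("/") else index_url + "/"
    return urljoin(base, "pypi/%s/json" % package)


index_url = "https://devpi.example/user/index/"
print(json_url_absolute(index_url, "hashin"))
print(json_url_relative(index_url, "hashin"))
```

With the default index_url of https://pypi.org/ both variants resolve to the same URL, so the relative form would not change behaviour for pypi.org users.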

@peterbe
Owner

peterbe commented Jun 25, 2020

Another thing that would help is if get_package_data didn't use an absolute path. Then the index could start with the path components required by devpi (/user/index/...), and devpi (or a plugin for it) could provide the JSON hashin needs. Currently this isn't possible because, regardless of the path in the index_url, it will always go to /pypi/%s/json and lose the path from index_url.

True. I think that'd need to be part of the patch that "scrapes" instead of using JSON.
Or is this a worthwhile thing to have even if you're not using an index URL that requires HTML scraping?

@fschulze

It is worthwhile, because then we could add the necessary JSON support on the devpi side and you wouldn't have to change anything else. Scraping wouldn't be required anymore.
