Add distribution spec project proposal #35

Merged 19 commits on Apr 4, 2018
1bde248
Add distribution spec project proposal
caniszczyk Jan 23, 2018
c1a5479
Make updates after community feedback
caniszczyk Jan 23, 2018
6dc227c
Join Maintainers
ArangoGutierrez Jan 23, 2018
e2fc9af
Merge pull request #36 from ArangoGutierrez/patch-2
caniszczyk Jan 23, 2018
d20c56d
README: Link to the distribution proposal
wking Jan 26, 2018
a928e2b
distribution: Add in-scope and out-of-scope wording
wking Jan 26, 2018
6f1f720
Merge pull request #38 from wking/distribution-readme-link
caniszczyk Feb 7, 2018
6587fa0
Merge pull request #37 from wking/docker-bearer-token-spec
caniszczyk Feb 27, 2018
5e0175e
distribution: Copy-edits for the scope table
wking Feb 27, 2018
5dcd80a
Merge pull request #43 from wking/distribution-scope-copy-edits
caniszczyk Feb 27, 2018
75ede78
distribution: Reword scope table to avoid repository/image distinction
wking Mar 1, 2018
f92fdb3
Merge pull request #46 from wking/image-repository-wording
caniszczyk Mar 1, 2018
0c2ad77
distribution: Change from blame to rendered URI for image-index
wking Mar 6, 2018
4798414
distribution: Remove IANA auth-scheme sentence
wking Mar 7, 2018
0cd79c9
Merge pull request #48 from wking/remove-iana-auth-scheme-reference
stevvooe Mar 7, 2018
e07b90e
Merge pull request #47 from wking/distribution-image-index-blame
stevvooe Mar 14, 2018
d4b530c
Clean up language around tag listing and repository naming
dmcgowan Mar 14, 2018
2b4308e
proposals/distribution: remove non-sense about image indexes
stevvooe Mar 14, 2018
b6ec853
Merge pull request #50 from dmcgowan/add-distribution-proposal-cleanu…
crosbymichael Mar 26, 2018
1 change: 1 addition & 0 deletions README.md
@@ -21,6 +21,7 @@ https://groups.google.com/a/opencontainers.org/forum/#!forum/tob (tob@opencontai
## Project Proposals

* [Digest](https://github.com/opencontainers/tob/blob/master/proposals/digest.md)
* [Distribution Spec](https://github.com/opencontainers/tob/blob/master/proposals/distribution.md)
* [Image Format Spec](https://github.com/opencontainers/tob/tree/master/proposals/image-format)
* [SELinux](https://github.com/opencontainers/tob/blob/master/proposals/selinux.md)
* [Tools](https://github.com/opencontainers/tob/blob/master/proposals/tools.md)
154 changes: 154 additions & 0 deletions proposals/distribution.md
@@ -0,0 +1,154 @@
# Abstract

The Docker registry protocol has become the de facto standard across the container registry world.

In the OCI, having a solid, common distribution specification with conformance testing will ensure long-lasting security and interoperability throughout the container ecosystem.

Member

@mikebrow mikebrow Jan 29, 2018

See the above discussion; please add:

> This proposal also provides the container ecosystem with a means to discuss and schedule extensions to the distribution specification.

## Proposal
Member

@cyphar cyphar Jan 24, 2018

One thing I would like to clarify is whether we will actually be making improvements to the spec over time (as per how we did things in the other specifications) or whether we're just going to freeze on the current v2.0 and do nothing much afterwards -- since according to SemVer we'd need to work on v3.0 for that.

Also, it was my understanding that the submission of Docker Distribution as a specification would be something like "any future distribution spec we agree upon will have support for Docker Distribution" -- so as to unblock the distribution problem without locking us into never making significant progress on Distribution. But maybe I misunderstood?

Contributor

@cyphar I don't think there is anything in this protocol or proposal preventing "significant progress". The specification has been out and in the open since its inception and we have yet to receive any sizable improvement proposals. Most of the perceived limitations of this protocol are limitations in the client implementation that is part of docker.

There are some good changes that we can make in backwards compatible ways:

  1. The listing PRs I mentioned here.
  2. Stateless transfer. There are only a few small changes required to make this work.
  3. p2p already can work on this protocol.

In addition to this, there is a lot more that can be integrated on top of this protocol as part of client resolution. Most of the issues around naming, signing and mirroring can be taken care of with no API changes.

However, I think it should be clear that the initial specification activity should be about ensuring that the specification matches the current state of the world without breaking compatibility.

Contributor

I think what would be good to see is some clear instructions for the non-expert in terms of writing and working on specifications. Coming from this kind of boat, I have the general sentiment that I want to contribute, and I want to learn about the process to improve expertise in this sort of contribution. I would look for something akin to a CONTRIBUTING.md, or even a simple bullet list in a README that goes through logical steps. Like:

  • Goal: to resolve conflict for technical specifications
  • Procedure:
  1. receive / link to protocol for discussion
  2. board comments and asks questions as issues
  3. assignment goes to X
  4. issues discussed and resolved, something else...
  5. head maintainer(s) have final sign off on something
  6. publication via... X

I suppose this is a technical standard for reviewing technical standards, haha. For example, a nice parallel is to look at something like JOSS that has complete orchestration via a little robot integration. The workflow is clearly defined for the reviewer and reviewee, and the assigned editor. The parties involved fill in the gaps in terms of opening and resolving issues, but there is always clear definition. I'm not sure how something like this might fit for our group, but it would be great to think about to get greater community involvement and discussion.

Contributor

@wking wking Jan 24, 2018

> The workflow is clearly defined for the reviewer and reviewee, and the assigned editor.

project-template (referenced from this proposal) is what the OCI uses for this role. See especially here and here.

Contributor

Thanks @wking! The general roles for a contributor and maintainer are well defined, but I'm wondering about a more specific "hand holding" how-to-contribute sort of document, even just a bullet list. If it's something that can be observed then I can observe, learn, and write something up. I'm mostly thinking about this to help new contributors such as myself to learn the ropes of the group :)

Contributor

Sure thing! A specific question before closing the discussion here. For this PR, given that we don't have code (nothing to test or critique), and it's a general text draft, what are the criteria we are using for evaluating it, and how do we know when it's ready to go? I think this is likely something that would be obvious for someone working on a lot of these drafts over time, and I'll just catch on, and it would be helpful if someone could jot down a few notes about these questions.

Contributor

> For this PR, given that we don't have code (nothing to test or critique), and it's a general text draft, what are the criteria we are using for evaluating it, and how do we know when it's ready to go?

Ah, this repo. I thought you meant the coming distribution-spec repo. This repo could use a CONTRIBUTING.md, but the voting semantics are covered in the OCI charter (linked from here). At some point a TOB member will think it ready and put it up for a TOB vote. Until then, we're free to suggest improvements.

Member

@stevvooe

Right. Can you expand on these points (we had a short chat a while ago about them, but I don't think I fully understood what the plan was):

  • Stateless transfer. There are only a few small changes required to make this work.
  • p2p already can work on this protocol.

Also, I would still like to have some sort of .well-known identifier for registries so that you don't need to host a separate subdomain to have a registry. This would also be a good place to pin future extensions if it turns out that we do want to add something.

Just on this point:

> The specification has been out and in the open since its inception and we have yet to receive any sizable improvement proposals.

The reason for this may be unrelated to whether people want to make changes, or have worthwhile improvements. As you mentioned, a lot of the perceived issues with Distribution are actually because the Docker client doesn't expose those features -- and so people who want to improve distribution may not want to go through improving all layers of the Docker stack to do so.

I guess what I'm saying is "let's not close the door on any improvement discussions", especially since this is now going to be an OCI spec and not a Docker one anymore.

Member

@mikebrow mikebrow Jan 24, 2018

@cyphar what sort of language are you looking for to ensure the door is not closed for improvement discussions?

How about after the interoperability statement adding:

> This proposal also provides the container ecosystem with a means to discuss and schedule extensions to the distribution specification.

Member

> > The specification has been out and in the open since its inception and we have yet to receive any sizable improvement proposals.
>
> The reason for this may be unrelated to whether people want to make changes, or have worthwhile improvements.
> [...]
> I guess what I'm saying is "let's not close the door on any improvement discussions", especially since this is now going to be an OCI spec and not a Docker one anymore.

To add to what @cyphar stated, I think that the pivot from a Docker spec to an OCI spec changes both the set and focus of the contributors; since the spec is no longer being driven by the Docker product's needs it can attract use-cases that would not have made sense for the Docker product.

@mikebrow That language looks fairly reasonable to me.


TL;DR; Move [`api.md`][api.md] to a new [distribution-spec project](https://github.com/opencontainers/distribution-spec).

This proposal covers the distribution API spec, and while it does not cover the code for the docker-registry, that implementation is considered the reference implementation. There are other implementations of this protocol, not all are open-source though (Google gcr.io, Amazon ECR, CoreOS Quay, Gitlab registry, JFrog Artifactory registry, Huawei Dockyard, etc).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is authentication in the scope of the image distribution?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, shall we call out image signing (trusted image)?

Contributor

Image signing would be very useful for Singularity containers.

Member

@yuwaMSFT2 Unless I'm reading the current spec wrong, authentication is currently included.

Contributor

> Is authentication in the scope of the image distribution?

The authentication for the registry protocol is just HTTP-based authentication. There is a token authentication specification, which may be in scope, but there are a lot of ways to actually implement it that may be context-specific.

> Image signing would be very useful for Singularity containers.

This is not in scope. All resources are content-addressable and can be signed in external systems. Early versions of the specification and implementation had integrated signing, but there were a lot of problems with it. In practice, the enforcement needs to be in the hands of the client.

This doesn't mean the registry couldn't be used to store signature blobs, but that would require more thought. In practice, we've found the best approach is to decouple this.
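
As a rough illustration of the content-addressable point above, a client can recompute a blob's digest locally and compare it with the digest it asked for, and an external system can sign that digest without involving the registry. The sketch below assumes SHA-256 digests written as `sha256:<hex>`; the function names `blob_digest` and `verify_blob` are hypothetical and not part of any registry client.

```python
import hashlib


def blob_digest(data: bytes) -> str:
    """Return a digest string of the form 'sha256:<hex>' for the given bytes."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


def verify_blob(data: bytes, expected_digest: str) -> bool:
    """Check that fetched bytes match the digest they were requested by."""
    algorithm, _, hex_value = expected_digest.partition(":")
    if algorithm != "sha256":
        # Assumption: only SHA-256 is handled in this sketch.
        raise ValueError(f"unsupported digest algorithm: {algorithm}")
    return hashlib.sha256(data).hexdigest() == hex_value
```

Because the check needs only the bytes and the digest, enforcement stays with the client, which matches the decoupling argued for in this comment.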

Contributor

For the token authentication - the fact that it is challenging suggests it's even more important to figure out. I would guess that more rather than fewer would want to use token auth, and perhaps the specification can have a default and then fall back cases for a client to implement that are guaranteed to work for most. Without that, we are in a situation of needing a custom implementation for each one, and that defeats the purpose of having a standard. Maybe we could try, and see how far we get? We can always fall back to sticking with just the basic.

And agreed about the signing, given that it's likely very different! It would be good to come back to this at some future point (and as it's more commonly done and discussed).

Contributor

> There is a token authentication specification, which may be in scope, but there are a lot of ways to actually implement it that may be context-specific.

I'm in favor of moving over enough of the auth spec to allow clients to authenticate with Docker's Bearer approach without needing to leave the new project's specs. I've filed #37 against this PR with the changes I think we need for that and more detailed motivation.

Contributor

The issue with this approach is that each auth provider will then have to issue docker-style tokens to interact with the registry. With a more open approach, each registry can choose which providers are integrated. There is also the issue of the access control model: the model used in docker tokens is specific to the way docker implements the registry. Other providers may want to choose to have different token models.

From the perspective of the client, most of this is opaque. They pull an image and authorize the pull through whatever flow makes sense for the context and provide an opaque value for an http header.


In the past, when the topic of having an OCI specification around the distribution of container images was discussed, it was deferred as "let’s get the image format defined, meanwhile the industry will settle on a distribution standard". Fast forward: the OCI image format is out and adopted, and Registry v2 is the de facto standard. There are and will be use cases for alternate methods, and the future will likely hold creative ways to push, fetch and share container images, but right now this promotion serves as the OCI's acknowledgement of the current industry standard for distributing container images.
This proposal also provides the container ecosystem with a means to discuss and schedule extensions to the distribution specification.

Some polish is needed, e.g. fixing broken links to the storage-driver docs and making sections more generic regarding the OCI descriptors and media types, but on the whole this is a lateral move.

## Initial Maintainers

* Stephen Day <stephen.day@docker.com> (@stevvooe)
* Vincent Batts <vbatts@redhat.com> (@vbatts)
* Derek McGowan <derek.mcgowan@docker.com> (@dmcgowan)
Contributor

@wking wking Feb 1, 2018

With this proposal referencing project-template for the project's initial rules, we'll need a way to deal with project-template's Chief Maintainer role. Possibilities include:

My recommendation would be to remove the role (slightly more discussion here and here), but that PR has been dangling for 1.4 years, so I don't know how likely it is to land in the next month. Perhaps parallel work can land the Chief Maintainer removal, or opencontainers/project-template#41 may end up with the TOB in charge of project-template. With project-template (currently) outside of direct TOB control, the simplest approach would be to appoint a Chief Maintainer here, as that is entirely within the TOB's control.

Contributor

This happened, so nothing left to do for this specific point.


Additional Maintainers to consider:
Contributor

The TOB needs to pick an initial maintainer set. I think these suggestions are for the TOB, and should be reviewed and either added to the initial maintainer list or dropped before the vote. A three-person initial maintainer set can add other maintainers on their own, so the TOB can drop the whole consideration list and punt to the initial maintainers if it wants.

Member

Same.

How are TOB members supposed to interpret this list? Maintainers are picked based on current and past contributions to a codebase or area of expertise and usually their maintainer vote comes with justification of why they should be a maintainer in a project.

Contributor

@caniszczyk, were we just dropping this list, and leaving it up to the above-three maintainers to add additional maintainers?


* Ahmet Alp Balkan (Google)
* Matt Moore (Google)
* Yuwa (MSFT)
* Clayton Coleman (Red Hat)
* Antonio Murdaca (@runcom) (Red Hat)
* Samuel Karp (@samuelkarp) (AWS)
* Mike Brown (IBM)
* Jimmy Zelinskie jimmy@coreos.com (@jzelinskie)
* Liu Genping <[liugenping@huawei.com](mailto:liugenping@huawei.com)>
* Vanessa Sochat (@vsoch) (Stanford) <vsochat@stanford.edu>
* Eduardo Arango (@ArangoGutierrez) (Sylabs) <eduardo@sylabs.io>

Contributor

For completeness, choose any/all that is needed:

Contributor

Thanks!

## Code of Conduct

This project would incorporate (by reference) the OCI Code of Conduct ([https://github.com/opencontainers/tob/blob/master/code-of-conduct.md](https://github.com/opencontainers/tob/blob/master/code-of-conduct.md)).

## Governance and Releases

This project would incorporate the Governance and Releases processes from the OCI project template: [https://github.com/opencontainers/project-template](https://github.com/opencontainers/project-template).

## Project Communications

The proposed project would continue to use existing channels in use by the OCI developer community for communication, including:

* GitHub for issues and pull requests
* The dev@opencontainers.org email list
* The monthly OCI developer community conference call
Contributor

Does starting at v2 (as suggested below) get us out of the pre-1.0 weekly meeting recommendation? I'm not sure if the "we're starting at v2" approach was considered when writing those docs (I certainly hadn't considered it).

Member

It was discussed in the google doc

Contributor

hey @wking here you go! If you follow the link at the top, click on "Comments" in the top right, you can explore some of the discussion.

(screenshot)

Contributor

I'm fine with starting at v2. I'm just not clear on whether monthly meetings are the right target level for a new spec (regardless of the number we put on our initial release). The discussion in this PR suggests folks are going to have lots of ideas. Project-template suggests weekly meetings during pre-1.0 development (presumably because the roadmap is less obvious/established then). And there's no reason that roadmap discussions and such couldn't happen asynchronously (but then maybe we want to drop the meeting recommendation from project-template?). Anyhow, I'd just like to make sure that, if this project starts out with monthly meetings, it was a conscious decision that we didn't expect to need weekly meetings.

Contributor

> right target level for a new spec

The specification is not new. It has already been discussed in the open, with 100+ comments.

Member

@wking sounds like something to discuss on the next monthly meeting :-)

Member

While there is plenty of discussion here, no overhaul is needed that would require increasing the meeting frequency from monthly. I am sure we'll adjust the meeting schedule as needed.

* The #OpenContainers freenode IRC channel

## Versioning / Roadmap

The API spec is currently considered v2 and we will start the specification at v2.0. This leaves fewer places to change and compare, and is in keeping with this being a lateral move.
Member

@AkihiroSuda AkihiroSuda Jan 30, 2018

I'm not sure we should start with v2.0, as the latest Docker-Distribution-API-Version header is set to registry/2.0 (https://docs.docker.com/registry/spec/api/#introduction)

How about starting with V2 Release 1.0?

Member

Or just v2.1

Contributor

> Or just v2.1

Somewhat complicating this approach is the fact that Docker's v2 API has been receiving lettered minor/patch bumps. Presumably those will all be considered part of the 2.0.x series?

And if the goal is to start using SemVer (which sounds like a useful goal to me), we'll need to specify that and wait for some deprecation period before cutting v2.1, or we'll have all the 2.0 clients choke and die when the version-check endpoint sets Docker-Distribution-API-Version to registry/2.1. We want them to all be able to say “I'm 2.0 compliant, so a 2.1 registry is fine”.
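
The compatibility rule described above ("I'm 2.0 compliant, so a 2.1 registry is fine") can be sketched as follows. This is a minimal illustration that assumes the header value has the form `registry/<major>.<minor>` and that SemVer-style rules apply; the helper names are hypothetical, and existing clients are not known to behave this way.

```python
def parse_api_version(header_value: str) -> tuple[int, int]:
    """Parse a value such as 'registry/2.0' into (major, minor)."""
    name, _, version = header_value.strip().partition("/")
    if name != "registry" or not version:
        raise ValueError(f"unexpected version header: {header_value!r}")
    major, _, minor = version.partition(".")
    # A missing minor component is treated as 0 in this sketch.
    return int(major), int(minor or 0)


def is_compatible(client_version: str, registry_version: str) -> bool:
    """SemVer-style check: same major, registry minor at least the client's."""
    client_major, client_minor = parse_api_version(client_version)
    registry_major, registry_minor = parse_api_version(registry_version)
    return registry_major == client_major and registry_minor >= client_minor
```

Under these assumptions, `is_compatible("registry/2.0", "registry/2.1")` is `True`, while a client that compares the header for exact equality with `registry/2.0` would reject the same registry, which is the failure mode a deprecation period is meant to avoid.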

Contributor

@wking In general, since the subversions are backwards compatible, we do not expose the subversions. In general, we should move away from path-level versioning and favor type-level versioning.

Member

@AkihiroSuda are you saying for that header, or just tagged releases or both?

Member

@jzelinskie jzelinskie Mar 12, 2018

FWIW, Docker v2-2 is the most recent protocol -- I hope everyone here knows that and has just been using 2.1 as an example.


## Frequently Asked Questions (FAQ)

**Q: Does this include the code of the docker-registry?**

A: No. This is an API specification discussion.

**Q: Does this change the OCI Charter or Scope Table?**

A: Not the charter, but it does change the scope table.
This project is scoped to specifying per-image client ↔ registry interaction with an [HTTP][rfc7230]-based protocol.
The following scope entries should be removed from the [scope table][scope]:

* “Use of Hash as Content Addressable name for immutable containers”.
This entry is in scope for this project, and a more detailed entry will be added as described below.
* “Creating Reference spec for optional DNS based naming & distribution”.
This entry conflates naming and distribution, which will be separated by this proposal.
* “Standardizing on a particular Distribution method”.
This proposal will provide one (of possibly many) distribution specifications, so the old “There is no current agreement on how to distribute content” no longer applies.

The following entries should be added to the [scope table][scope]:

* “Specifying authentication and authorization schemes”.
Docker's current registry uses an [extension][token] of the [`Bearer`][rfc6750] [auth scheme][rfc7235-s2.1].
Work on specifying Docker's scheme will continue independently, and is orthogonal to the registry API.

* What: Specifying authentication and authorization schemes
* In/Out/Future: Out of scope
* Status: N/A
* Description: Defining protocols for authenticating and authorizing distribution access.
* Why: As an HTTP-based protocol, clients and servers can negotiate authentication via HTTP's [challenge-response authentication framework][rfc7235-s2.1].
Contributor

This should be specified in such a way that the bearer can be opaque to the registry implementation.

Contributor

> This should be specified in such a way that the bearer can be opaque to the registry implementation.

Can you file a PR against this branch or suggest alternative wording? I think making negotiated authentication out of scope (as I've tried to do with this scope entry) is even stronger than “opaque to the registry implementation”. More on this specific issue in #37, where I initially tried to make the Bearer auth part of the scope, but pivoted to the wording here based on @dmcgowan's comment.

Member

> This should be specified in such a way that the bearer can be opaque to the registry implementation.

The mentioned rfc is broad enough to cover this case. This section is just saying the registry can return a response requesting a specific authentication type (could be bearer, basic, or whatever). The comment below this could probably be more generalized or removed completely as the rfc already describes this behavior in detail.
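
To make the challenge-response point concrete, here is a minimal sketch of how a client might read the scheme a registry asks for from a `WWW-Authenticate` header. It is a simplification and an assumption-laden example: it handles a single challenge with `key=value` or `key="value"` parameters, ignores escaped quotes, and uses a hypothetical function name and example hostnames.

```python
import re

# Matches key="quoted value" or key=token parameters in a challenge.
_PARAM = re.compile(r'(\w+)=(?:"([^"]*)"|([^\s,]+))')


def parse_challenge(header_value: str) -> tuple[str, dict[str, str]]:
    """Split one WWW-Authenticate challenge into (scheme, parameters)."""
    scheme, _, rest = header_value.strip().partition(" ")
    params = {}
    for match in _PARAM.finditer(rest):
        quoted, token = match.group(2), match.group(3)
        params[match.group(1).lower()] = quoted if quoted is not None else token
    return scheme, params


scheme, params = parse_challenge(
    'Bearer realm="https://auth.example.com/token",service="registry.example.com"'
)
# The client only needs the scheme and parameters to decide how to obtain
# a credential; the credential itself can stay opaque to it.
```

The same function handles `Basic realm="Registry"`, which reflects the comment's point that the registry may request bearer, basic, or another scheme through one mechanism.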


* “Creating a reference spec for optional DNS based naming and discovery”.
Contributor

This seems completely irrelevant.

Contributor

> This seems completely irrelevant.

I'm saying that it's out of scope for this project and in scope for future OCI work. Do you disagree with either of those? If not, I think we need something in the scope table around discovery/naming to replace the naming portion of the current “Creating Reference spec for optional DNS based naming & distribution” entry (as I say in the next line).

Contributor

This section is so confusing that it's impossible to understand what is intended here. Above it says "FAQ", then it makes a bunch of statements with quotes. What does this effort have to do with naming and discovery?

Discovery and registry protocols are completely separate and do not need to be added together.
This entry replaces part of the previous “Creating Reference spec for optional DNS based naming & distribution” entry.

* What: Creating a reference spec for optional DNS based naming and discovery
* In/Out/Future: In scope for future specification
* Status: Work not yet started
* Description: Define a protocol for resolving an image name to retrieval information.
When we address this, we will also allow for alternative name-to-image discovery protocols in parallel with the OCI-specified protocol.
* Why: It is reasonable to provide a standardized way to use DNS based distribution in conjunction with OCI without requiring its use.
There are many good use cases for DNS based distribution, but not all use cases support this.
Furthermore, encoding the location of a bundle into the bundle can cause issues with downloads from alternate locations other than the origin specified in the name.

* “Specifying a distribution method”.
This entry replaces part of the previous “Creating Reference spec for optional DNS based naming & distribution” and “Standardizing on a particular Distribution method” entries.

Retrieving image indexes covers the current “tag listing” (e.g. “what named manifests are in `library/busybox`?”), because tags are entries in the image format's [`manifests` array][manifests].
Contributor

"because tags are entries in the image format's [manifests array][manifests]."

This statement is completely false.

Contributor

This is only true for image layout and has nothing to do with images distributed over this API.

Contributor

"because tags are entries in the image format…"

This statement is completely false.

Would you accept:

… because tags can be represented as entries in the image format's [manifests array][manifests].

?

Contributor

No, because these are not true statements.

Contributor

No, because these are not true statements.

Image-spec says:

A common use case of descriptors with a "org.opencontainers.image.ref.name" annotation is representing a "tag" for a container image.

Can you go into more detail about how using manifests with org.opencontainers.image.ref.name annotations is not capable of representing image tags?
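
For illustration, here is a minimal sketch of reading tag-like names from descriptors that carry the `org.opencontainers.image.ref.name` annotation quoted above. The index document, its digests, and the helper name `tags_from_index` are hypothetical placeholders; the sketch only shows how such annotations could be read and does not settle whether the distribution API should expose tags this way.

```python
import json

# Annotation key quoted from the image-spec in the comment above.
REF_NAME = "org.opencontainers.image.ref.name"

# Hypothetical index document; the digests are placeholders, not real content.
INDEX_JSON = """
{
  "schemaVersion": 2,
  "manifests": [
    {
      "mediaType": "application/vnd.oci.image.manifest.v1+json",
      "digest": "sha256:aaaa",
      "size": 100,
      "annotations": {"org.opencontainers.image.ref.name": "3.7"}
    },
    {
      "mediaType": "application/vnd.oci.image.manifest.v1+json",
      "digest": "sha256:bbbb",
      "size": 200
    }
  ]
}
"""


def tags_from_index(index):
    """Map each ref.name annotation to the digest of its descriptor."""
    tags = {}
    for descriptor in index.get("manifests", []):
        name = descriptor.get("annotations", {}).get(REF_NAME)
        if name is not None:  # descriptors without the annotation are skipped
            tags[name] = descriptor["digest"]
    return tags


print(tags_from_index(json.loads(INDEX_JSON)))  # {'3.7': 'sha256:aaaa'}
```

The second descriptor has no annotation, so it is reachable only by digest in this sketch.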

Contributor

Yes, the addition of ref.name was a poorly thought-out addition and doesn't have a relationship here. They are separate things. Yes, they could represent image tags, but the tagging in distribution spec is correctly unique (i.e., points at a manifest list), whereas ref.name tries to embed something. As I've said about a million times, ref.name is only valid on the layout index.json. If we want to use it here as a future change, the maintainers can make that decision. Either way, this statement is wholly confusing.

Contributor

As I've said about a million times, ref.name is only valid on the layout index.json.

In that case, I don't see why tags belong in the distribution spec at all. If manifests are the highest-level object that is represented in the API, then why bother with tag-listing at all? Just require consumers to ask for exactly the manifest they need with something like:

$ curl -H 'Accept: application/vnd.oci.image.manifest.v1+json' /v2/library/alpine/3.7

if, instead, you want to continue to provide an endpoint that lists multiple tags, is there really a benefit to returning a list of strings instead of a list of descriptors?
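
As a rough sketch of the request described above, the following builds (but does not send) a manifest request that selects the media type through the `Accept` header. The registry host is a made-up example, and the path follows the `/v2/<name>/manifests/<reference>` form from Docker's api.md rather than the shortened path in the curl line; both are assumptions for illustration.

```python
import urllib.request

# Media type taken from the curl example above.
OCI_MANIFEST = "application/vnd.oci.image.manifest.v1+json"


def manifest_request(registry, name, reference):
    """Build a GET request for one manifest; nothing is sent over the network."""
    url = f"{registry}/v2/{name}/manifests/{reference}"
    return urllib.request.Request(url, headers={"Accept": OCI_MANIFEST})


# "registry.example.com" is a hypothetical host.
req = manifest_request("https://registry.example.com", "library/alpine", "3.7")
print(req.full_url)
print(req.get_header("Accept"))
```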

Member

@dmcgowan dmcgowan Mar 13, 2018

In that case, I don't see why tags belong in the distribution spec at all

That is something we could consider for enhancing the protocol. Right now the API can mostly be used without tags, but manifests cannot be uploaded without a tag. It is best to think of manifests as the highest-level object and tags as part of the API for usability. Tags separated from the API are sometimes useful for security; this is what notary was created to solve.

As for this API, can you please remove language linking tags with the manifests and indexes? That is confusing, as there is no relationship within the API. You can keep language about tag listing endpoints as being in scope, although as mentioned, there are other ways to get lists of tags from external indexes.

Other tag-listing endpoints needed for backwards-compatibility are therefore in scope as well.

Grouping image indexes in repositories is considered part of distribution policy or content management, which are out of scope for this entry's per-image action.
For example, “what images are under `library/`?” is out of scope for this project.
Contributor

From the perspective of this specification, there is no relationship between library and library/foo. This allows one to arbitrarily map the registry namespace to different schemas. In this specification, they are just treated as a name.

Contributor

I like this approach as well. Having developed tools that separated some prefix / collection of a namespace (e.g., library) from the rest (e.g., "foo") led to many errors, when really we can just call the whole thing a group (name or namespace?). It's a bit like the "file hierarchy" in Google Storage: it just looks like folders / files but it's really just an index with slashes in it. So I agree: instead of having to classify that "library" is a different thing from "library/foo", I think thus it's better to have them treated as a name, and then tools that want to mimic a filesystem (or search, etc.) can do things akin to name.startswith('library') etc.
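
A small sketch of the prefix idea mentioned above, treating repository names as flat strings. The repository list and the helper name `names_with_prefix` are invented for illustration; how a tool obtains the list of names is outside this sketch.

```python
def names_with_prefix(names, prefix):
    """Return the names that start with prefix; no hierarchy is implied."""
    return sorted(name for name in names if name.startswith(prefix))


# Hypothetical flat list of repository names.
REPOSITORIES = ["library/busybox", "library/alpine", "libraryx/tool", "example/app"]

print(names_with_prefix(REPOSITORIES, "library/"))  # ['library/alpine', 'library/busybox']
print(names_with_prefix(REPOSITORIES, "library"))
```

The second call also returns `libraryx/tool`, which shows that this is plain string matching: any folder-like meaning of the slash is supplied by the tool, not by the names themselves.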

Contributor

@wking wking Mar 7, 2018

From the perspective of this specification, there is no relationship between library and library/foo. This allows one to arbitrarily map the registry namespace to different schemas. In this specification, they are just treated as a name.

That's fine with me. The point I'm trying to make around here is that if library/foo is representable as an image-index, then operations on higher-level objects like library/ are out of scope for this PR. There was a lot of wording debate on this in #44 which was eventually resolved in #46, but if you have different wording preferences for that idea I'm open to changes (and it's @caniszczyk's PR anyway ;).

Contributor

I think thus it's better to have them treated as a name, and then tools that want to mimic a filesystem (or search, etc.) can do things akin to name.startswith('library') etc.

I think this should be out of scope for the distribution API. #46 tries to summarize why I think /v2/_catalog should be out of scope, and there's much more discussion leading up to that change in #44.

Contributor

@wking These are really technical considerations that I don't think belong in the proposal. The only takeaway here is that _ start is reserved in the repo names to avoid overlap with endpoints like /v2/_catalog and to allow space for in-band backend mappings.
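
To make the reservation concrete, here is a minimal sketch under the assumption that "reserved" simply means the repository name begins with an underscore, as stated above. The function name `is_reserved` and the sample names are hypothetical; the thread does not define the exact rule.

```python
def is_reserved(name: str) -> bool:
    """Assumption: a name is reserved when it starts with an underscore."""
    return name.startswith("_")


# Sample names, invented for illustration.
for candidate in ["_catalog", "library/busybox", "_internal/cache"]:
    print(candidate, is_reserved(candidate))
```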

Contributor

Let me clarify: repository naming is completely in scope here. The API isn't very useful without it. @vsoch's understanding is correct. How repository names are mapped to image names is up to the implementation or defined by convention.

Contributor

These are really technical considerations that I don't think belong in the proposal.

I want to set clear enough scope boundaries that operations like the current /v2/_catalog or the API form of user's image-index listing are out of scope, while keeping the API form of tags within an image-index in scope. Since there's already a /v2/_catalog in Docker's api.md, I think we need to get technical enough to make that distinction.

The only takeaway here is that _ start is reserved in the repo names to avoid overlap with endpoints like /v2/_catalog and to allow space for in-band backend mappings.

I'm fine with the OCI distribution API reserving a namespace like /v2/_.* for backend extensions. I don't think that conflicts with leaving /v2/_catalog as a docker/distribution-specific extension.

Let me clarify: repository naming is completely in scope here. The API isn't very useful without it.

That's fine with me. What does it have to do with making “what images are under library/?” out of scope for the distribution API?

Contributor

The thing you linked in the hub is not, at all, an implementation of this specification. The catalog endpoint is just a raw listing of repositories. It's out of scope because it is a management function of the registry and not required for runtimes to pull an image.

Contributor

@wking Can you fix the wording here? This should only call out that hierarchies aren't a part of this specification.

Contributor

This has been fixed under #50.


* What: Specifying a distribution method
* In/Out/Future: In scope
* Status: In progress (see opencontainers/distribution-spec)
* Description: Define a protocol for creating, retrieving, updating, and deleting objects defined in the [image specification][image-spec].
Listing repositories (like [`/v2/_catalog`][catalog]) is a multi-[image-index][] action, which is out of scope for this entry.
Contributor

Again, there is no relation between "multi-image-index" and the catalog endpoint. Please avoid making incorrect statements.

Contributor

Again, there is no relation between "multi-image-index" and the catalog endpoint. Please avoid making incorrect statements.

How would you prefer to phrase it? Do you agree that /v2/_catalog is a registry-level operation? Do you agree that it may return multiple repositories? Do you agree that repositories may contain multiple images?

Contributor

I have no clue what is being said here. It literally makes no sense. The specification makes no reference to a "multi-image-index" and we don't have anything called that in the image-spec. It is just made up.

Member

Is this possibly confusion caused by a misunderstanding of ManifestLists?

Contributor

These are unrelated concepts.

Member

@jzelinskie jzelinskie Mar 13, 2018

Rereading the sentence, it sounds like there's an assumption that the registry implements a 'multi-image index' (e.g. an indexed repository table in a relational database) used to serve the catalog endpoint. This document should avoid making any assumptions about the registry implementation.

Can we rephrase this to say something along the lines of:

Listing repositories is a registry-wide operation; the implementation of which is out of scope of this document

Contributor

@jzelinskie That sounds great!

I've submitted this change as part of #49.

Managing groups of image indexes requires multi-[image-index][] actions, which are out of scope for this entry.
Listing image indexes within a group is a multi-[image-index][] action, which is out of scope for this entry.
* Why: This specification will provide one (of possibly many) distribution specifications.
Alternative distribution specifications may be developed for use cases not covered by this specification, but defining them is currently out of scope for the OCI.

* “Retrieving image content by its content-addressable hash”.
Docker's registry API already provides [endpoints for fetching manifest objects by digest][get-manifest].
Docker's registry API does not currently provide endpoints for fetching [image-index][] objects by digest, but this is the project where that will happen.
Contributor

Missing link here.

Contributor

Missing link here.

This is the “implicit link name shortcut”, and GitHub renders it fine.

Member

I think you can forgo the [] and just leave it [image-name] and it'll continue to work, as well.


* What: Retrieving image content by its content-addressable hash
* In/Out/Future: In scope
* Status: In progress (see opencontainers/distribution-spec)
* Description: Specify a protocol for retrieving an [image index][image-index], [manifest][], or other [image specification][image-spec] object from a distribution engine by its content-addressable hash.
* Why: Using a hash as a name is a way to ensure a unique image name without relying on a particular naming authority or system.
Using hashing for name is an acceptable addition as it does not encode any centralized namespace.
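
The idea of using a hash as a name can be sketched as follows. This is an illustrative example, assuming SHA-256 and the `sha256:<hex>` digest form; the helper names `digest_of` and `verify` and the sample blob are hypothetical and not part of the proposal.

```python
import hashlib


def digest_of(content: bytes) -> str:
    """Name content by its SHA-256 hash, written as 'sha256:<hex>'."""
    return "sha256:" + hashlib.sha256(content).hexdigest()


def verify(content: bytes, expected: str) -> bool:
    """Check that retrieved content matches the digest it was requested by."""
    return digest_of(content) == expected


blob = b'{"schemaVersion": 2}'  # placeholder content
name = digest_of(blob)

print(verify(blob, name))         # True
print(verify(blob + b" ", name))  # False: any change to the bytes changes the name
```

Because the name is derived from the bytes alone, a client can check what it received without trusting any naming authority or the location it fetched from.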

The following entries should remain in the [scope table][scope] but not be addressed by this project:

* “Specifying way to attach signatures”.
We don't need to address this as part of distribution, because all resources are content-addressable and can be signed in external systems.

## Related GitHub Issues


What about the proposal of new APIs to the spec? Shall we do it after the vote as future incremental improvements, or shall we raise and discuss them now?

From our experience of running our registry service (Azure Container Registry), we have collected a fair amount of common feedback and lessons learned, and would like to see whether they could be included in the future registry spec:

  1. Richer image management ability, including delete repository, purge unreferenced blobs, etc.
  2. Richer discovery ability for users with lots of images, such as listing tags by creation time, listing manifests by creation time, etc. So basically to support more query parameters in the list APIs.
  3. Richer metadata APIs. Basically, users want to attach additional metadata to their images, in addition to a simple tag.
  4. Native multi-tenant support. Something like /v2/tenants/{tenant}/repositories/{name}/manifests/{reference}. Since more and more public cloud services are adopting the registry service, native support for multi-tenancy would be great IMHO.

We would like to learn whether there is any opportunity for this feedback to be included in the distribution spec. Thanks!

Contributor

Image size and format would also be very useful, beyond sniffing with a HEAD request. There is also a distinction between a tag (e.g., latest) and a version (a hash or commit), we've had users asking for both with Singularity Hub/Registry, and then for all these requests, the default should be reasonable of course! Are we going with docker defaults like latest for tag, library for namespace, and other registries should follow suit? @yuwaMSFT2 for tenant would that be akin to the manifests list where you can request a particular OS or architecture? The user client would then always be required to make two calls https://github.com/docker/distribution/blob/master/docs/spec/manifest-v2-2.md#manifest-list (and this is what we are doing currently in Singularity)!

Member

@yuwaMSFT2 @vsoch interesting extensions. See above proposal to add a sentence about discussing and scheduling in extensions without going into specific detail.


@vsoch no, the metadata of the manifest is not what I meant by tenant.
It's more about making the repository hierarchical. For example, on dockerhub you can have an organization, and then under the organization you can have a repository. The current URI would be /v2/orgname/reponame/, and the registry treats the whole path component "/orgname/reponame" as the name parameter in the route matching. It would be good if we could separate them.
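
The route-matching behaviour described above can be sketched like this. The regular expression is an assumption for illustration, modelled on the `/v2/<name>/manifests/<reference>` path form; it is not the pattern any particular registry uses, and the helper name `split_manifest_path` is hypothetical.

```python
import re

# Assumed route shape: everything between "/v2/" and "/manifests/" is the name.
ROUTE = re.compile(r"^/v2/(?P<name>.+)/manifests/(?P<reference>[^/]+)$")


def split_manifest_path(path):
    """Return (name, reference) for a manifest path, or None if it does not match."""
    match = ROUTE.match(path)
    if match is None:
        return None
    return match.group("name"), match.group("reference")


print(split_manifest_path("/v2/orgname/reponame/manifests/latest"))
# ('orgname/reponame', 'latest')
```

The organization and repository arrive as one opaque name; splitting them would need either a convention on top of the name or a different route shape such as the tenant path suggested earlier.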

Contributor

@mikebrow this looks good to me!
@yuwaMSFT2 this is badly needed, and I strongly +1. Right now we clump the entire thing that comes before the digest (after the @) and tag (after the :) and the image name (e.g., ubuntu) as the namespace. But given registry urls and the potential for local registries to decide to use a custom namespace, parsing this string has been more than challenging. Being able to separate these two things? 🙏 !!

Contributor

@yuwaMSFT2 A lot of the "extensions" you've referenced are really about content management, rather than content distribution. This API has almost always been about content distribution. Adding these extensions to the specification provides little room for vendors to implement these kinds of functionality in a way that fits their platform.

Right now we clump the entire thing that comes before the digest (after the @) and tag (after the :) and the image name (e.g., ubuntu) as the namespace.

@vsoch None of this is a part of this specification. Even the hub behavior @yuwaMSFT2 was referencing is the hub's implementation. This specification allows arbitrary paths. Other things, like latest, are completely up to the client implementation; the registry doesn't care about them and this specification doesn't even mention them.

Let's make sure this exercise is critiquing the specification and not Docker's implementation of it.

Contributor

@yuwaMSFT2 agreed about content management vs. distribution, I've never distinguished those two before.

In that the uri is serving as an entry point into a registry, even if it doesn't conform to a specific namespace, tag, arguably I should be able to (knowing the API conforms to the registry) programmatically predict the base uri to be calling based on this alone. It could be as general as a general flow of regular expressions to follow, or as specific as a discrete set of different kinds (e.g., abstract uri vs. an arbitrary path). It would be a stronger specification to have some level of predictability here, otherwise I still need to write a custom thing (after checking!) for each registry endpoint to even interact with most endpoints.

Contributor

agreed about content management vs. distribution, I've never distinguished those two before.

The two systems have vastly different requirements. For example, even though the Hub looks like it's the same system, the registry and the hub UI are logically separate in practice. This prevents the registry deployment from getting complicated. All of the UI and management of images gets implemented in a separate system, allowing them to grow in features and functionality.

I'm still not quite understanding what you're talking about regarding URIs: the format that you're talking about parsing has nothing to do with this specification. Different systems may implement completely different behaviors and formats, depending on their opinion.

Contributor

@vsoch @yuwaMSFT2 I expanded a bit on the divide between content distribution and management in #37 (comment). I'll repost here, as there may have been some confusion.

This API had everything in it required to integrate with a container runtime (i.e., pull an image, figure out what tags are in a repo, etc.) and the APIs required to integrate with content management systems. This includes listing repositories and tags. There is also a notification API that is enabled in the registry implementation but isn't a part of this specification (there may be an argument for pulling this in).


* Simplifies tag listing: docker/distribution#2169
Contributor

Let's add a statement here about what these PRs mean. Are these additions required by the proposal? They have been vetted and implemented, so I would think they are a good candidate.

* Allows listing of manifests: docker/distribution#2199

[api.md]: https://github.com/docker/distribution/blob/5cb406d511b7b9163bff9b6439072e4892e5ae3b/docs/spec/api.md
[catalog]: https://github.com/docker/distribution/blob/5cb406d511b7b9163bff9b6439072e4892e5ae3b/docs/spec/api.md#catalog
[get-manifest]: https://github.com/docker/distribution/blob/5cb406d511b7b9163bff9b6439072e4892e5ae3b/docs/spec/api.md#pulling-an-image-manifest
[image-spec]: https://github.com/opencontainers/image-spec/
Contributor

quick question - some of these links are to versioned specs (the ones with the blob and commit) and others are to specific tags, and then others to the blame view. Is this intentional?

Contributor

@wking wking Mar 2, 2018

some of these links are to versioned specs (the ones with the blob and commit) and others are to specific tags, and then others to the blame view. Is this intentional?

Yeah. This spec link floats, so distribution can add more endpoints if the image-spec gains more object types. The Docker links have an explicit commit, because they haven't tagged a release since the last api.md commit (I think). Otherwise I prefer pinning by tag. Blames are for when I need a specific anchor that the rendered Markdown doesn't provide.

Contributor

ok cool! Just wanted to double check, makes perfect sense.

[image-index]: https://github.com/opencontainers/image-spec/blame/v1.0.1/image-index.md
Contributor

Any reason for the blame links?

Contributor

heyo! I think this was explained a bit up here --> #35 (comment) (I had the same question!)

Member

By that explanation this link does not need to be a blame. The other blame link to the same doc has an anchor, but it might be wise to consider not using blame and rather pointing to the parent section so the link is navigable to the sub sections.

Contributor

By that explanation this link does not need to be a blame.

Good point. I've filed #47 against this PR to fix this link.

The other blame link to the same doc has an anchor, but it might be wise to consider not using blame and rather pointing to the parent section so the link is navigable to the sub sections.

I prefer linking directly to manifests instead of linking to the whole section (~85 lines of Markdown), but if @caniszczyk and/or the TOB prefer section links I'll survive.

[manifest]: https://github.com/opencontainers/image-spec/blob/v1.0.1/manifest.md
[manifests]: https://github.com/opencontainers/image-spec/blame/v1.0.1/image-index.md#L23
[rfc6750]: https://tools.ietf.org/html/rfc6750
[rfc7230]: https://tools.ietf.org/html/rfc7230
[rfc7235-s2.1]: https://tools.ietf.org/html/rfc7235#section-2.1
[scope]: https://www.opencontainers.org/about/oci-scope-table
[token]: https://github.com/docker/distribution/blob/5cb406d511b7b9163bff9b6439072e4892e5ae3b/docs/spec/auth/token.md