
Docker registry allows to serve any blob user wants #9755

Closed
miminar opened this issue Jul 8, 2016 · 13 comments · Fixed by #9819


@miminar

miminar commented Jul 8, 2016

Previously, the authorization check consisted of an existence check on a layer link inside the repository.

We now skip layer links completely. The Stat() and ServeBlob() calls go directly to the global registry blob store (which stores all blobs pushed to the registry). Because of that, any user who is authorized to pull from at least one repository can pull any blob they want, thanks to our blobDescriptorService().

The only thing preventing a user from pulling arbitrary blobs is that they don't know which blobs are stored in the registry unless they can somehow obtain an image manifest or image object.

Version

Docker registry 2.4.0.

Steps To Reproduce
  1. Push any image as userA to image stream A/IS.
  2. Choose one of its layers (oc get -o yaml isimage <imagename>); let's call it layer L.
  3. Log in as userB.
  4. Create a new project P and a new image stream in it (oc create imagestream IS).
  5. Stat the blob using a simple curl:
$ curl -v -X HEAD -u "userB:$btoken" "$registry_url/v2/P/IS/blobs/L"
Current Result
* About to connect() to 172.30.241.183 port 5000 (#0)
*   Trying 172.30.241.183...
* Connected to 172.30.241.183 (172.30.241.183) port 5000 (#0)
* Server auth using Basic with user 'alice'
> HEAD /v2/target/busybox/blobs/sha256:8ddc19f16526912237dd8af81971d5e4dd0587907234be2b83e249518d5b673f HTTP/1.1
> Authorization: Basic YWxpY2U6d1RyTFN6VnQxTjJvREl5ZENrVDEzR1pTekZsMnFYWVlYYmdDQmw4QzRPWQ==
> User-Agent: curl/7.29.0
> Host: 172.30.241.183:5000
> Accept: */*
> 
< HTTP/1.1 200 OK
< Accept-Ranges: bytes
< Cache-Control: max-age=31536000
< Content-Length: 667590
< Content-Type: application/octet-stream
< Docker-Content-Digest: sha256:8ddc19f16526912237dd8af81971d5e4dd0587907234be2b83e249518d5b673f
< Docker-Distribution-Api-Version: registry/2.0
< Etag: "sha256:8ddc19f16526912237dd8af81971d5e4dd0587907234be2b83e249518d5b673f"
< Date: Fri, 08 Jul 2016 15:28:43 GMT
Expected Result

Unauthorized

Additional info

Suggested priority: P1

@miminar
Author

miminar commented Jul 8, 2016

/cc @liggitt

I started addressing this.

@miminar miminar self-assigned this Jul 8, 2016
@miminar miminar added this to the 1.3.0 milestone Jul 8, 2016
@smarterclayton
Contributor

Security regressions are always P0.

@miminar
Author

miminar commented Jul 11, 2016

Proposed solution.

Build a cache of blob digests in the image stream:

type ImageStream struct {
    // ...
    Status struct {
        // ...

        // 1st cache
        // A cache for existing images
        // blob digest -> blob size
        ImageBlobs map[string]int64

        // 2nd cache
        // A cache for images just being uploaded
        UploadedBlobs map[string]struct {
            Size     int64
            Uploaded time.Time
        }
    }
}

1st cache

is needed for proper authorization checks. We cannot use layer links stored in the registry's storage for this purpose because they don't address the use case of oc tag source targetrepository.

Without caching layer blobs, a simple blobStore.Stat() would require fetching all images of a particular image stream and searching them for the presence of the blob. Such an operation happens at least twice during an image pull and at least three times during an image push for a single blob.

This cache needs to be filled during migration. It needs to be updated during each addition/removal of an image or during image pruning.

The size information will give us a significant boost when fetching missing layer sizes (see #7706 and #9115).

It may also be useful for quota counting and usage display.

2nd cache

is needed for a proper authorization check during an image push. When a manifest is uploaded (as the last operation of a push), the registry verifies that all of its referenced blobs are present. This cache is a replacement for the layer links that used to serve this purpose.

This cache will also prevent the image pruner from removing blobs that are not yet referenced by any image.

The Uploaded field will make it possible to detect aborted/unfinished image uploads by their old-enough timestamps. Such entries can be pruned together with the corresponding blobs in the registry's storage.

The cache will be updated on each blob upload, manifest PUT, and image pruning.

Why a second cache? Compared to the 1st, it will contain far fewer items, but each of its entries needs an additional timestamp attribute. Also, the pruner needs an easy way to distinguish a blob referenced by an existing image from one that is not yet referenced.
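
For illustration, a minimal sketch of how a pruner could use the Uploaded timestamp to detect stale uploads (the names and the threshold handling are illustrative, not actual origin code):

package server

import "time"

type uploadedBlob struct {
    Size     int64
    Uploaded time.Time
}

// pruneCandidates returns the digests of uploads whose timestamp is old enough
// for the upload to be considered aborted; their cache entries and the
// corresponding blobs in the registry's storage can then be removed.
func pruneCandidates(uploaded map[string]uploadedBlob, maxAge time.Duration) []string {
    var stale []string
    for digest, blob := range uploaded {
        if time.Since(blob.Uploaded) > maxAge {
            stale = append(stale, digest)
        }
    }
    return stale
}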

Cache the above inside the registry

To reduce round-trips to etcd, let's cache the information in the registry as well. Use digestToRepositoryCache for this purpose.

Modify stat-ing

At the lowest level, when a local blob is checked for presence inside our blobDescriptorService:

  1. do the existence check for the blob in the global registry store (filesystem check)
  2. if not present, err out
  3. see if it's in digestToRepositoryCache prefixed with no registry URL (local blob)
  4. if not present, query the 1st cache in the image stream in etcd
  5. if not present, query the 2nd cache
  6. if not found in any of them, err out with "blob unknown"
  7. otherwise, return the blob descriptor
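
A rough sketch of that flow (globalStore, localCache, and fetchStatus are illustrative collaborators, not the actual origin code, and the API is modeled on a recent docker/distribution):

package server

import (
    "context"

    "github.com/docker/distribution"
    digest "github.com/opencontainers/go-digest"
)

type imageStreamStatus struct {
    ImageBlobs    map[string]int64    // 1st cache
    UploadedBlobs map[string]struct{} // 2nd cache (sizes and timestamps elided here)
}

type blobDescriptorService struct {
    globalStore distribution.BlobStatter                              // global registry blob store
    localCache  func(dgst digest.Digest) bool                         // in-registry digest cache (local blobs)
    fetchStatus func(ctx context.Context) (*imageStreamStatus, error) // image stream status from etcd
}

func (bs *blobDescriptorService) Stat(ctx context.Context, dgst digest.Digest) (distribution.Descriptor, error) {
    // 1. + 2. existence check in the global blob store (filesystem check)
    desc, err := bs.globalStore.Stat(ctx, dgst)
    if err != nil {
        return distribution.Descriptor{}, err
    }

    // 3. in-registry cache of local blobs
    if bs.localCache(dgst) {
        return desc, nil
    }

    // 4. + 5. image stream caches stored in etcd
    status, err := bs.fetchStatus(ctx)
    if err != nil {
        return distribution.Descriptor{}, err
    }
    if _, ok := status.ImageBlobs[dgst.String()]; ok {
        return desc, nil
    }
    if _, ok := status.UploadedBlobs[dgst.String()]; ok {
        return desc, nil
    }

    // 6. the blob exists in the store but is not referenced by this repository
    return distribution.Descriptor{}, distribution.ErrBlobUnknown
}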

@smarterclayton
Contributor

Could we use layer links only for the second one?

@smarterclayton
Contributor

So the alternative here for disclosure is to create an LRU cache of layer ID to an array of possible registries in the registry, and then use that to do positive caching. If it misses, we'd have to do the lookup against etcd anyway (we don't want the cache to contain out-of-date negative entries). That's a significant source of load on the API server on every cache miss (1 image stream + N image lookups).
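
A minimal sketch of such a positive-only LRU cache, using hashicorp/golang-lru (the type and method names are illustrative, not origin's actual digest cache implementation):

package server

import (
    lru "github.com/hashicorp/golang-lru"
)

// positiveDigestCache remembers only positive results: which repositories are
// known to contain a given blob digest. Misses always fall back to etcd, so
// stale negative results are never cached.
type positiveDigestCache struct {
    cache *lru.Cache // blob digest -> []string of repositories
}

func newPositiveDigestCache(size int) (*positiveDigestCache, error) {
    c, err := lru.New(size)
    if err != nil {
        return nil, err
    }
    return &positiveDigestCache{cache: c}, nil
}

// Remember records that the repository is known to reference the digest.
// Duplicate entries are tolerated for brevity.
func (c *positiveDigestCache) Remember(dgst, repo string) {
    var repos []string
    if v, ok := c.cache.Get(dgst); ok {
        repos = v.([]string)
    }
    c.cache.Add(dgst, append(repos, repo))
}

// Contains answers only positively; on a miss the caller must do the lookup
// against etcd (1 image stream + N image lookups) as described above.
func (c *positiveDigestCache) Contains(dgst, repo string) bool {
    v, ok := c.cache.Get(dgst)
    if !ok {
        return false
    }
    for _, r := range v.([]string) {
        if r == repo {
            return true
        }
    }
    return false
}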

@legionus
Contributor

@miminar We can do the last step and start storing in etcd the information about which blobs correspond to which repository. This is the last piece of information that is not stored in etcd; right now it can only be obtained using the storage driver. Putting this information into etcd will also help reduce the number of calls to the storage driver.

@miminar
Author

miminar commented Jul 12, 2016

Could we use layer links only for the second one?

We could for most regular cases. The problem case is a push with a cross-repo mount from a remote repository, where we intercept blobStore.Create(ctx, options) and handle the mount option specially:

  1. if the remote blob exists, we return immediately with distribution.ErrBlobMounted
  2. otherwise, we pass the request down to upstream's blobStore

Unfortunately, we cannot create the link ourselves because the link operation is private. We could try to negotiate with upstream, though.
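
A rough sketch of that interception (the wrapper type and the remoteStat helper are assumptions, and the option handling is modeled on docker/distribution's CreateOptions; this is not the actual origin code):

package server

import (
    "context"

    "github.com/docker/distribution"
    "github.com/docker/distribution/reference"
)

// blobStore wraps upstream's blob store; remoteStat is an assumed helper that
// stats the blob in the remote (source) repository.
type blobStore struct {
    upstream   distribution.BlobStore
    remoteStat func(ctx context.Context, from reference.Canonical) (distribution.Descriptor, error)
}

func (bs *blobStore) Create(ctx context.Context, options ...distribution.BlobCreateOption) (distribution.BlobWriter, error) {
    var opts distribution.CreateOptions
    for _, option := range options {
        if err := option.Apply(&opts); err != nil {
            return nil, err
        }
    }

    if opts.Mount.ShouldMount {
        // 1. cross-repo mount requested: if the remote blob exists, short-circuit
        //    with ErrBlobMounted so the client treats the blob as already present
        if desc, err := bs.remoteStat(ctx, opts.Mount.From); err == nil {
            return nil, distribution.ErrBlobMounted{From: opts.Mount.From, Descriptor: desc}
        }
    }

    // 2. otherwise pass the request down to upstream's blobStore, which creates
    //    the layer link itself (that operation is private to upstream)
    return bs.upstream.Create(ctx, options...)
}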

@smarterclayton I agree that using local layer links is simpler and more efficient. Shall we go that route, with the possible risk of carrying a patch exposing linkBlob? Update: Considering a rework of image pruning, I'd highly recommend marking just-uploaded blobs in etcd in the target image stream.

We can do the last step and start storing in etcd the information about which blobs correspond to which repository. This is the last piece of information that is not stored in etcd; right now it can only be obtained using the storage driver. Putting this information into etcd will also help reduce the number of calls to the storage driver.

This would have to be a top-level resource, so it's a third variant. It addresses all the authorization use cases. Moreover, it would allow us to easily fetch layer blob sizes that are missing from a v2->v1 converted manifest if the blobs already exist in any image stream. Compared to the caches in image streams, the downside is the loss of per-image-stream size, which would be nice for display and for potential limiting in the future.
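
For illustration, such a top-level resource might look roughly like this (a hypothetical shape with illustrative names, not an actual OpenShift API type):

// BlobReferences would be a cluster-scoped record of which image repositories
// are known to reference a given blob digest.
type BlobReferences struct {
    // blob digest, e.g. "sha256:..."
    Digest string
    // blob size in bytes, useful for filling in missing layer sizes
    Size int64
    // "<namespace>/<imagestream>" repositories referencing the blob
    Repositories []string
}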

@miminar
Author

miminar commented Jul 12, 2016

What I'm worried about now is keeping the cache up to date. Imagine there's an image stream A with an image I under tag T. There's another image stream B with an ImageStreamTag pointing to A:T. Now some user builds a new image and pushes it to A:T. Ideally, this operation would result in an update of both A's and B's ImageBlobs caches. The question is how the registry finds out that it should update B's ImageBlobs.

There are many other cases like this. E.g. replacing the ImageStreamTag with an ImageStreamImage would result not only in additions to B but in removals as well. Images may also be tagged and untagged. Each such change needs to be reflected in all the places.

To address this, could we also extend the TagEventList and TagEvent types with a reverse mapping to all of the image stream tags referencing the particular tag or image?

type ImageStream struct {
    // ...
    Status struct {
        // ...
        Tags map[string]TagEventList
    }
}

type TagEventList struct {
    // ...
    Items []TagEvent
    // an array of <image stream>:<tag> entries referencing this tag
    References []string
}

type TagEvent struct {
    // ...
    // an array of <image stream>:<tag> entries referencing this image
    References []string
}

Each image push, oc tag, and oc delete istag would check these entries and update the referencing ISTags and their ImageBlobs accordingly.
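
A minimal sketch of that fan-out (the helper and the map layout are illustrative, not actual origin code):

package server

import "strings"

// propagateBlobUpdate merges a newly pushed image's blob digests into the
// ImageBlobs cache of every image stream that references the updated tag.
// streams maps image stream name to its ImageBlobs cache (digest -> size);
// references carries the "<image stream>:<tag>" entries proposed above.
func propagateBlobUpdate(streams map[string]map[string]int64, references []string, newBlobs map[string]int64) {
    for _, ref := range references {
        streamName := strings.SplitN(ref, ":", 2)[0]
        blobs, ok := streams[streamName]
        if !ok {
            continue // not a stream tracked in this sketch
        }
        if blobs == nil {
            blobs = map[string]int64{}
            streams[streamName] = blobs
        }
        for digest, size := range newBlobs {
            blobs[digest] = size
        }
    }
}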

WDYT?

@miminar
Author

miminar commented Jul 12, 2016

Could we use layer links only for the second one?

I agree that using local layer links is simpler and more efficient. Shall we go that route, with the possible risk of carrying a patch exposing linkBlob?

I changed my mind. From the perspective of the image pruner, an orphaned blob resulting from an incomplete push is impossible to distinguish from a blob that was just uploaded and is waiting for a manifest PUT to be referenced for the first time.

Upstream requires the garbage collector to be run on a registry instance in read-only mode because of this problem (deleting blobs while uploads are happening at the same time). We plan to run periodic image prunes inside the registry without the possibility of switching to read-only mode.

I'd highly recommend marking just-uploaded blobs in the target image stream in etcd instead of relying on repository layer links.

@smarterclayton
Contributor

smarterclayton commented Jul 12, 2016 via email

@smarterclayton
Contributor


Don't links have timestamps? Pruning and GC already requires a window of safety based on time, so I don't see how that is being tracked for uploads.

@miminar
Author

miminar commented Jul 12, 2016

Don't links have timestamps? Pruning and GC already requires a window of safety based on time, so I don't see how that is being tracked for uploads.

They do! And actually, it's possible to stat them now that we have the storage middleware. Thanks for the suggestion. The problem of pruning uploaded blobs is solved with layer links.
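
A rough sketch of such a check through the storage driver (the path layout follows the upstream registry storage format; the function is illustrative and modeled on a recent docker/distribution driver API, not actual origin code):

package server

import (
    "context"
    "fmt"
    "time"

    storagedriver "github.com/docker/distribution/registry/storage/driver"
)

// uploadedBlobIsStale stats the repository's layer link for the given digest
// and reports whether it is older than maxAge, i.e. whether the upload can be
// considered aborted and safe to prune.
func uploadedBlobIsStale(ctx context.Context, driver storagedriver.StorageDriver, repo, algorithm, hex string, maxAge time.Duration) (bool, error) {
    linkPath := fmt.Sprintf("/docker/registry/v2/repositories/%s/_layers/%s/%s/link", repo, algorithm, hex)
    fi, err := driver.Stat(ctx, linkPath)
    if err != nil {
        return false, err
    }
    return time.Since(fi.ModTime()) > maxAge, nil
}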

@miminar
Author

miminar commented Jul 12, 2016

It may be easier for right now to pull together the prototype lookup cache in the registry and see how complex that is while we discuss this issue further.

OK, I'll start with this prototype, which will fetch the images until the blob is found.
