Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update CRI to optionally use transfer service for image pull #8515

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

fangn2
Copy link
Contributor

@fangn2 fangn2 commented May 15, 2023

This PR is to update CRI to optionally use transfer service for image pull.

Part of #8227

  1. Create a new CRI config ImagePullWithTransferService to specify whether to use transfer service for image pull
  2. Move CRI image pull related configs to server side, i.e. transfer service.

The PR is still in progress, put out to for early review and utilize CI for integration tests.

@fangn2 fangn2 marked this pull request as draft May 15, 2023 03:01
@k8s-ci-robot
Copy link

Hi @fangn2. Thanks for your PR.

I'm waiting for a containerd member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@fangn2 fangn2 changed the title [WIP]Update CRI to optional use transfer service for image pull [WIP]Update CRI to optionally use transfer service for image pull May 15, 2023
@mikebrow
Copy link
Member

/ok-to-test

pkg/cri/config/config.go Outdated Show resolved Hide resolved
pkg/cri/config/config.go Outdated Show resolved Hide resolved
@dmcgowan dmcgowan self-assigned this May 30, 2023
return nil, fmt.Errorf("failed to pull and unpack image %q: %w", ref, err)
}

return &runtime.PullImageResponse{ImageRef: ref}, nil
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

need to update the image store.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for review! Will address it in the coming changes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.
Just curious CRI currently maintain an in-memory image cache(there could be situations that the CRI image store and containerd image store are out of sync, like other clients e.g. ctr removed an image). Can we get rid of it by directly using contained image store?

@fangn2 fangn2 force-pushed the cri-image-transfer branch 2 times, most recently from fd0d0c3 to 113293b Compare August 9, 2023 14:41
@fangn2 fangn2 force-pushed the cri-image-transfer branch 6 times, most recently from 45eaa5d to 88e6abe Compare August 16, 2023 16:52
@mxpv mxpv added the status/needs-update Awaiting contributor update label Sep 7, 2023
@estesp
Copy link
Member

estesp commented Oct 3, 2023

CRI test failures seem related to the new (optional) image verifier service merged in #8493; the image verifier plugin is a required plugin for transfer plugin now and the error seems to be related to that? Seems you rebased, so I'm not sure what else has to change to make that work properly?

Error:

2023-10-03T02:31:08.9836244Z === NAME  TestCRIImagePullTimeout/HoldingContentOpenWriter
2023-10-03T02:31:08.9837268Z     build_local_containerd_helper_test.go:109: 
2023-10-03T02:31:08.9838989Z         	Error Trace:	/home/runner/work/containerd/containerd/integration/build_local_containerd_helper_test.go:109
2023-10-03T02:31:08.9840452Z         	            				/home/runner/work/containerd/containerd/integration/image_pull_timeout_test.go:78
2023-10-03T02:31:08.9841049Z         	Error:      	Received unexpected error:
2023-10-03T02:31:08.9842280Z         	            	no plugins registered for io.containerd.image-verifier.v1: not found
2023-10-03T02:31:08.9843017Z         	Test:       	TestCRIImagePullTimeout/HoldingContentOpenWriter
2023-10-03T02:31:08.9843552Z     build_local_containerd_helper_test.go:120: 
2023-10-03T02:31:08.9844427Z         	Error Trace:	/home/runner/work/containerd/containerd/integration/build_local_containerd_helper_test.go:120
2023-10-03T02:31:08.9845822Z         	            				/home/runner/work/containerd/containerd/integration/image_pull_timeout_test.go:78
2023-10-03T02:31:08.9846406Z         	Error:      	Received unexpected error:
2023-10-03T02:31:08.9847983Z         	            	failed to get "io.containerd.transfer.v1" plugin: no plugins registered for io.containerd.image-verifier.v1: not found
2023-10-03T02:31:08.9848807Z         	Test:       	TestCRIImagePullTimeout/HoldingContentOpenWriter
2023-10-03T02:31:08.9861131Z === NAME  TestCRIImagePullTimeout/NoDataTransferred
2023-10-03T02:31:08.9862011Z     build_local_containerd_helper_test.go:109: 
2023-10-03T02:31:08.9863128Z         	Error Trace:	/home/runner/work/containerd/containerd/integration/build_local_containerd_helper_test.go:109
2023-10-03T02:31:08.9864564Z         	            				/home/runner/work/containerd/containerd/integration/image_pull_timeout_test.go:216
2023-10-03T02:31:08.9865157Z         	Error:      	Received unexpected error:
2023-10-03T02:31:08.9866373Z         	            	no plugins registered for io.containerd.image-verifier.v1: not found
2023-10-03T02:31:08.9867085Z         	Test:       	TestCRIImagePullTimeout/NoDataTransferred
2023-10-03T02:31:08.9867605Z     build_local_containerd_helper_test.go:120: 
2023-10-03T02:31:08.9868487Z         	Error Trace:	/home/runner/work/containerd/containerd/integration/build_local_containerd_helper_test.go:120
2023-10-03T02:31:08.9870257Z         	            				/home/runner/work/containerd/containerd/integration/image_pull_timeout_test.go:216
2023-10-03T02:31:08.9871126Z         	Error:      	Received unexpected error:
2023-10-03T02:31:08.9872738Z         	            	failed to get "io.containerd.transfer.v1" plugin: no plugins registered for io.containerd.image-verifier.v1: not found
2023-10-03T02:31:08.9873513Z         	Test:       	TestCRIImagePullTimeout/NoDataTransferred
2023-10-03T02:31:08.9874021Z --- FAIL: TestCRIImagePullTimeout (0.01s)

@dims dims added the area/cri Container Runtime Interface (CRI) label Feb 7, 2024
@fangn2 fangn2 force-pushed the cri-image-transfer branch 2 times, most recently from cdd0772 to cc07653 Compare February 13, 2024 22:04
@@ -130,6 +130,11 @@ func init() {
return nil, fmt.Errorf("no matching diff plugins: %w", errdefs.ErrNotFound)
}

// If CheckPlatformSupported is false, we will match all platforms
if !config.CheckPlatformSupported {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dmcgowan I added this config to transfer plugin to optionally not enforcing strict platform matching in unpackConfiguration platform, snapshotter due to cri-tools pull test that pulls an image digest which is arm64 arch based.
This test failed on the CI with the default platform(amd64) and snapshotter combination.
Does it make sense to have such config here? or should we update containerd config file to pass the test in this case?

My understanding is with the defaultUnpackConfiguration, i.e. amd64,overlayfs, transfer service doesn't allow unpacking different arch(arm64) of image other than the running platform unless we explicitly add that UnpackConfiguration arm64, overlayfs to the supported platform. This is different with the client.pull approach.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We really need a better platform matcher for that case. For unpack configuration it should really be able to just match OS, the architecture does not matter too much for unpacker. Where is it enforcing this check today, if the image itself is pulled by digest.

@fuweid
Copy link
Member

fuweid commented Mar 7, 2024

ping @fangn2 ~ any update here?

@swagatbora90
Copy link
Contributor

@fangn2 Are you still working on this? If not, I can help push this forward.

@swagatbora90 swagatbora90 force-pushed the cri-image-transfer branch 3 times, most recently from bad8d2c to cbad8e6 Compare July 16, 2024 22:51
@swagatbora90
Copy link
Contributor

swagatbora90 commented Jul 16, 2024

Test are passing now.
Added two main changes:

  1. Added a WithResolver option to transfer service registry to pass custom resolver. CRI already configures it own resolver, so we can just pass that to transfer service.
  2. Had to create a client.TransferService() method to use in initLocalCRIImageService which is being used by some of the CRI integration tests. Wondering if there is a better way to do this without exposing a new api just for the integration test?

@fangn2 fangn2 changed the title [WIP]Update CRI to optionally use transfer service for image pull Update CRI to optionally use transfer service for image pull Jul 17, 2024
@fangn2 fangn2 marked this pull request as ready for review July 17, 2024 01:58
@fangn2
Copy link
Contributor Author

fangn2 commented Jul 18, 2024

/retest-required

fangn2 and others added 2 commits July 19, 2024 01:15
Signed-off-by: Tony Fang <nhfang@amazon.com>
Signed-off-by: Swagat Bora <sbora@amazon.com>
@@ -48,6 +48,7 @@ type registryOpts struct {
creds CredentialHelper
hostDir string
defaultScheme string
resolver remotes.Resolver
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Resolver is not serializable, what are the equivalent settings it needs to initialize it? It doesn't need to share resolvers

Copy link
Contributor

@swagatbora90 swagatbora90 Jul 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing this out @dmcgowan. I can modify the implementation to ensure the necessary settings are provided directly to initialize the registry.

I ran into issue with one setting where a pull request reporter was passed to the resolver in order to wrap the underlying http client and track active requests and bytes read. This is used to implement image pull progress timeout. I think I should be able to handle this using transfer.Progress. I can give that a try but let me know if you have any suggestion.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/cri Container Runtime Interface (CRI) ok-to-test size/L status/needs-update Awaiting contributor update
Projects
Status: Work In Progress
Development

Successfully merging this pull request may close these issues.

9 participants