Reduce container image pull time by supporting partial pulls #2715
See: https://www.redhat.com/sysadmin/faster-container-image-pulls.
I am still almost a newbie regarding advanced Docker usage... I'll let the experts give their opinion ^^ cc @Kurt-von-Laven @bdovaz @echoix
I read a little on this over the weekend by following the links you provided. Do you know if there has been any more recent activity in this field since 2021? Most articles available about eStargz seem to have been released in that same time frame. In any case, supporting it would only be worthwhile if it can be used by GitHub Actions (and if it works well on a runner with two cores, the most common MegaLinter setup). Before reading, I thought it was another "Docker optimizing SaaS" that dynamically removes files and uses a shallow image that pulls the rest after it has already started. But that isn't the case: it seems to be compatible with existing clients (without any advantage for them), since it is more of a metadata and client workaround. Both options seem to still work as OCI containers. Either way, if you'd like to play with it a little and look at where it could be supported, I'll be glad to come back to it, or to other usages of it in the wild.
@echoix See https://www.slideshare.net/KoheiTokunaga/starting-up-containers-super-fast-with-lazy-pulling-of-images for a (somewhat dated) overview of this topic.
I explored and did a little research in small bits of free time this week.

I learned that Docker is supposed to use pigz, if available, to decompress layers when pulling (i.e. when unpigz is on the PATH). I tried to see if it could be used in GitHub Actions (on my fork), but didn't get a conclusive result, since the runner pulls the Docker image before I can execute a step that installs pigz. When using composite actions, I couldn't specify the correct usage for the Docker image, and running it manually seemed to be missing context: I didn't feed it the full command line and all the features GitHub Actions provides, so it may not be a good idea for general usage. For reference, on GitHub Actions, updating apt and installing pigz takes 9 seconds, and the baseline for pulling the MegaLinter Python beta flavor took 43-45 seconds, so for pigz to ever help, it would have to save more than 9 seconds. I think I read in issues and discussions that some people saw around an 18-23% improvement with multiple cores, but I can't find the source anymore :(. For the Python flavor, that would be no net gain (43 × 0.23 = 9.89 s), even if it works. Also, decompression in Docker stays serial between layers; only a single layer can be decompressed in parallel.

Next, I looked at the status of zstd compression. It could be advantageous for both download size and decompression speed. Other container software implementations for OCI (other than Docker) seem to have had it figured out for a while. I saw a lot of promising tricks that could be enabled by using the containerd image store, if Docker's transition to it continues as it is going. But for now it's beta/experimental and doesn't work by default, and I don't think we can expect it from our users yet.

So overall, I'm now aware of what exists, but we might just be a little too early for mass usage.
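As a sketch of the two approaches discussed above (pigz-assisted decompression and zstd layer compression): Docker's pull path uses unpigz when it is on the PATH, and BuildKit's image exporter can emit zstd-compressed layers. The registry name and tag below are placeholders, not real MegaLinter artifacts.

```shell
# Docker uses unpigz (part of the pigz package) for parallel gzip
# decompression during pulls when it is found on the PATH.
sudo apt-get update && sudo apt-get install -y pigz
command -v unpigz   # prints the path once pigz is installed

# Build and push an image whose layers are compressed with zstd
# (BuildKit image-exporter options; registry/tag are placeholders).
docker buildx build \
  --output type=image,name=registry.example.com/megalinter-test:zstd,push=true,compression=zstd,force-compression=true \
  .
```

Whether the pull actually gets faster still depends on the client: classic Docker Engine only handles zstd OCI layers in recent versions.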
Many thanks @echoix for your great analysis :)
I think we could close this for now, and ping back in another issue if the ecosystem ever changes and we haven't heard about it yet. At least now I know to keep an eye on this optimisation in the future.
@echoix I'm mostly focused on user value in my proposal. If building and pushing a zstd:chunked compressed image is possible, the rest doesn't matter so much, does it? If needed at all, it would be easy to run Podman in CI without degrading Docker Engine related GitHub Actions, wouldn't it? See e.g. https://github.com/redhat-actions/push-to-registry
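For reference, pushing a zstd:chunked image with Podman is a one-flag affair; a minimal sketch, with placeholder image names:

```shell
# Re-push an existing local image with zstd:chunked compression, which
# enables partial pulls in clients that understand the chunked format
# (e.g. Podman with a zstd:chunked-aware storage stack).
# Image names below are placeholders.
podman push \
  --compression-format zstd:chunked \
  localhost/megalinter-test:latest \
  docker://registry.example.com/megalinter-test:zstd-chunked
```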
Building and pushing such an image isn't a problem; I think it's quite possible to do so. But it wouldn't be consumable by the main usage source, GitHub Actions. At least for now.
Oh, we use MegaLinter on GitLab CI. By the way, you can run Podman from a container. Should be possible under GitHub Actions, too, no?
This issue has been automatically marked as stale because it has not had recent activity. If you think this issue should stay open, please remove the `O: stale` label.
A slim image like super-linter has could be useful. Of course, it would mean making compromises about what is included in the image and what is not.
Not stale. |
There are already multiple flavors for that, including ci_light and cupcake (a medium-small flavor for the most commonly used languages). There are flavors for projects in many different ecosystems (e.g. Python, documentation, security, JavaScript, Go, Rust, etc.). If you use the full image at first and a flavor suits the files found in your repo, MegaLinter will suggest switching to that flavor, or will pre-fill an issue for you to suggest a new flavor with the needed linter types.
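Using a flavor is just a matter of pulling the flavored image instead of the full one; the tag below is an example, check the flavors page for the current list and versions:

```shell
# Run the Python flavor instead of the full MegaLinter image.
# The tag shown is an example; pick the current release.
# MegaLinter lints the directory mounted at /tmp/lint.
docker run --rm \
  -v "$PWD:/tmp/lint" \
  oxsecurity/megalinter-python:v7
```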
Cool. Yes, I noticed the suggestions, but didn't have time to check them yet.
@mdrocan you can pick your choice here :) https://megalinter.io/latest/flavors/
@nvuillam Yeah, I just quickly checked those; I still have to try some of them out with some software :)
I tested with a couple of images for different projects, and those seem to work nicely (they also reduce the execution time plenty). Comparing super-linter's slim version with the different images I tested, I think I know how to continue for now ;)
@mdrocan I'm glad the flavor fits your requirements :)
@sanmai-NL I'm open to building additional images in different formats and pushing them to other registries, as it would not impact the current architecture. Would you like to try a PR?
Yeah, it/they work, but I noticed an issue with Ansible. I'll most likely need to create a bug report for it once I have time to test it a couple more times.
ansible-lint behaviour should be the same in all flavors; maybe it is an ansible-lint issue?
See: https://github.com/awslabs/soci-snapshotter. A way to reduce startup time without rebuilding images.
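A rough sketch of the SOCI workflow, assuming the soci CLI and containerd are installed and the image already lives in a registry: SOCI builds a separate index artifact, so the image itself is not rebuilt. The image reference is a placeholder.

```shell
# Create a SOCI index for an image present in the local containerd
# content store, then push the index alongside the image in the
# registry. The reference below is a placeholder.
soci create registry.example.com/megalinter-test:latest
soci push registry.example.com/megalinter-test:latest
```

Runners would then need the soci-snapshotter configured in containerd to benefit from the lazy pulls.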
@echoix, was this performance in the case of a cache hit or miss?
I can't find the mention about zstd decompression failing on macOS. Were you referring to actions/runner-images#7770?
Uhm, that was running apt-get update and apt-get install to install pigz. However, at that time I didn't manage to find a combination of action definition that would let me install pigz before pulling the Docker image. Since the action definition specifies that an image will be pulled, it pulls it before starting, so I couldn't influence the action's run environment.
That's interesting; maybe it's time to take a new look at the situation, since it's quite a jump from 20.10.25 to 24.0.6. (At the end of July/beginning of August 2023, a runner image with version 23.0.6 was released in between.)
The related issue mentioning that 23.0.6 was already included might mean it's time to try again; maybe everything is there now. (I don't remember all the prerequisites/interdependencies by heart.)
That sounds like the cache-miss case then, since it sounds like the apt updates and pigz weren't being cached. It probably isn't relevant now, but the easiest way to test pigz's performance would be to run MegaLinter via mega-linter-runner or as a pre-commit hook, since neither approach pulls down the Docker image in a pre-step.
Agreed!
Also, actions/runner-images#8205 recently added pigz as a top-level dependency of the GitHub Actions hosted Ubuntu runner images; according to actions/runner-images#8161, it was previously present as a recommended package of Docker.
Oh well, that was a very specific issue/PR! Never thought of looking at that repo's issues for that. |
This issue has been automatically marked as stale because it has not had recent activity. If you think this issue should stay open, please remove the `O: stale` label.
/remove "O: stale" |
Is your feature request related to a problem? Please describe.
Pulling MegaLinter container images can take a long time, as noted in the docs.
Describe the solution you'd like
Experiment with eStargz lazy pulling.
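One way to experiment, assuming nerdctl (or ctr-remote) is available: convert an existing image to eStargz without rebuilding it. The tags below are placeholders.

```shell
# Convert an existing image to the eStargz format so that
# stargz-snapshotter-enabled runtimes can lazily pull it.
# Source/target tags are placeholders.
nerdctl image convert --estargz --oci \
  oxsecurity/megalinter:latest \
  oxsecurity/megalinter:latest-esgz
nerdctl push oxsecurity/megalinter:latest-esgz
```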
Describe alternatives you've considered
All available alternatives have been explored and somewhat implemented, like splitting out images, using stages, and caching.
Additional context
None.