Caching more build artifacts (e.g., all of V8) #2949

Open
tniessen opened this issue May 14, 2022 · 6 comments

Comments

tniessen (Member)

I wonder if we should be trying to cache more build artifacts. (And if there is a tracking issue for this somewhere already, please feel free to close this one.)

I believe we are using ccache on many Jenkins systems, but some build times are still slow. The "Test Linux" GitHub Action takes about two hours, and building alone often accounts for about 75% of those two hours. We can try not to run the job as often, but I'd still like to see build times go down.

I don't have data to back this up, but I have a feeling that we waste a lot of time and other resources by building V8 every time. Locally, ccache helps with that, but switching between branches or fetching commits that update V8 still causes extremely long build times. Could we cache entire V8 builds? Maybe even in a way that allows us to download cached V8 builds for local builds whenever V8 is updated in node core. This could be an opt-in feature as part of the build process (with some necessary security precautions) and might greatly speed up builds both locally and in CI.

This seems like something we have likely considered in the past, so it probably isn't trivial, or we would have done it already. Or maybe it won't give us the build time improvements I am hoping for.

targos (Member) commented May 15, 2022

I have thought a little bit about it, and the issue I saw with GitHub Actions was that the maximum size of the cache for the entire repository is too low. It looks like it was bumped to 10 GB at some point, but that's probably not enough (the cache depends on build flags and the Node.js branch).

> I have a feeling that we waste a lot of time and other resources by building V8 every time.

I'm pretty sure that's the case too.

> Could we cache entire V8 builds? Maybe even in a way that allows us to download cached V8 builds for local builds whenever V8 is updated in node core.

Maybe, but on which infrastructure and how? There are many variables that can invalidate the cache (OS, version of the build tools, version of V8, etc.)
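To make the invalidation problem concrete, the variables listed above could be folded into a single cache key. A minimal sketch, with all inputs illustrative (a real job would fold in more, e.g. ICU version and configure flags); `deps/v8/include/v8-version.h` is where node vendors V8's version macros:

```shell
# Combine everything that can invalidate a cached V8 build into one short key.
os="$(uname -s)-$(uname -m)"
compiler="$(cc --version 2>/dev/null | head -n1)"
v8_version="$(grep -E '#define V8_(MAJOR_VERSION|MINOR_VERSION|BUILD_NUMBER|PATCH_LEVEL)' \
  deps/v8/include/v8-version.h 2>/dev/null | tr -d ' ')"
build_flags="${BUILD_FLAGS:-default}"   # whatever ./configure was invoked with
key="$(printf '%s|%s|%s|%s' "$os" "$compiler" "$v8_version" "$build_flags" \
  | sha256sum | cut -c1-16)"
echo "cache key: $key"
```

Any change to one input yields a different key, so stale artifacts are never served; the cost is a separate cache entry per combination, which is where the storage-size concern comes from.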

tniessen (Member, Author)

> I have thought a little bit about it, and the issue I saw with GitHub Actions was that the maximum size of the cache for the entire repository is too low. It looks like it was bumped to 10 GB at some point, but that's probably not enough (the cache depends on build flags and the Node.js branch).

I think we'd have to rely on our own infrastructure (or at least external infrastructure). If that turns out to be slow, e.g. due to download speeds within GitHub Actions, we could still cache the downloaded artifacts in the GitHub Actions cache as much as possible, in a multi-tiered manner.
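The multi-tier idea could look roughly like this; the artifact name, local cache directory, and upstream URL are all placeholders, not real project endpoints:

```shell
# Tier 1: a cache local to the runner (standing in for the Actions cache).
# Tier 2: a download from project-hosted infrastructure.
# On a miss at both tiers, fall back to building from source.
ARTIFACT="v8-${CACHE_KEY:-unknown}.tar.zst"
LOCAL_CACHE_DIR="${LOCAL_CACHE_DIR:-/tmp/v8-cache}"
UPSTREAM="${UPSTREAM:-https://builds.example.org/v8}"   # placeholder URL

mkdir -p "$LOCAL_CACHE_DIR"
if [ -f "$LOCAL_CACHE_DIR/$ARTIFACT" ]; then
  echo "tier 1 hit: $LOCAL_CACHE_DIR/$ARTIFACT"
elif curl -fsSL --max-time 5 -o "$LOCAL_CACHE_DIR/$ARTIFACT.tmp" \
      "$UPSTREAM/$ARTIFACT" 2>/dev/null \
    && mv "$LOCAL_CACHE_DIR/$ARTIFACT.tmp" "$LOCAL_CACHE_DIR/$ARTIFACT"; then
  echo "tier 2 hit: downloaded and cached locally for next time"
else
  echo "cache miss: building V8 from source"
fi
```

Downloading to a `.tmp` name and renaming only on success keeps a failed transfer from masquerading as a tier-1 hit on the next run.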

> > Could we cache entire V8 builds? Maybe even in a way that allows us to download cached V8 builds for local builds whenever V8 is updated in node core.
>
> Maybe, but on which infrastructure and how? There are many variables that can invalidate the cache (OS, version of the build tools, version of V8, etc.)

I am not sure where we currently store our builds, whether on our own infrastructure or in R2 or S3. If we only keep builds for the most recent V8 version on the few active branches, and only for commonly used platforms and build configurations, I don't expect the storage requirements to be a major concern (even if that matrix might already be much larger than I anticipate). If individual files are below the 512 MB limit, they'll also be cached by Cloudflare.

For security reasons, we'd probably want dedicated processes for filling the cache (instead of letting every CI job push to it). Maybe an automated CI job that is triggered whenever V8, the Makefile, etc. are updated? I'm sure there are many dependencies to consider, but rebuilding V8 every time we change something in lib or test seems incredibly wasteful.
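The trigger for such a dedicated cache-filling job could be as simple as checking which paths a push touched. A sketch, where the path list is a guess at the relevant build inputs rather than node's actual dependency set:

```shell
# Decide whether the cache-filling job needs to run for a commit range.
# RANGE defaults to the last commit; paths follow node core's layout.
RANGE="${RANGE:-HEAD~1..HEAD}"
changed="$(git diff --name-only "$RANGE" 2>/dev/null || true)"
if printf '%s\n' "$changed" \
    | grep -qE '^(deps/v8/|Makefile$|configure$|common\.gypi$)'; then
  echo "V8 or build files changed: refresh the V8 build cache"
else
  echo "no relevant changes: reuse the cached V8 build"
fi
```

Changes confined to lib or test fall through to the second branch, which is exactly the "don't rebuild V8 for a JS-only change" case described above.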

targos (Member) commented May 16, 2022

May be interesting: https://github.com/mozilla/sccache

richardlau (Member)

In the current Jenkins CI, ccache is all local: if one build runs on one machine and the next build runs on a different machine, the second build will not see the cache from the first. Newer versions of ccache (4.4+) can point to remote caches (e.g. via Redis or HTTP), but most of our CI is on older versions of ccache (e.g. on the Linux machines we use the ccache from the package manager), so we haven't explored that.
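For reference, pointing a newer ccache at shared storage is a one-line configuration change. A sketch of what that might look like, with placeholder host names; the setting is called `secondary_storage` in ccache 4.4–4.6 and `remote_storage` from 4.7 onward:

```ini
# ccache.conf sketch for ccache >= 4.7 (use secondary_storage on 4.4-4.6).
# Hosts below are illustrative placeholders, not real infrastructure.
remote_storage = redis://ccache-redis.example.org:6379
# or an HTTP backend:
# remote_storage = http://ccache-http.example.org/nodejs-cache
```

The local cache still acts as the first tier; ccache consults the remote store only on a local miss.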

> I am not sure where we store our builds currently, whether on our own infrastructure or in R2 or S3 or so.

Our own infrastructure, all donated. AFAIK we do not use R2 or S3.

> The "Test Linux" GitHub Action takes about two hours, and building alone often accounts for about 75% of those two hours. We can try not to run the job as often, but I'd still like to see build times go down.

The GitHub Actions runs are not using ccache at all. @MylesBorins tried to add it to the workflows before, but we (surprisingly) didn't see any build-time improvements -- perhaps some quirk of how GitHub Actions implements the cache action? If someone has time, maybe they could reinvestigate.
(I'm also fairly certain the macOS GitHub Actions workflows take even longer than the Linux ones.)

As with most things build-related the main blocker is people's time. My suggestion would be to chime in/contribute towards nodejs/node#39672 which proposes devcontainers/codespaces.

> May be interesting: https://github.com/mozilla/sccache

It has been mentioned before in nodejs/node#29663. It would need someone to volunteer to set it up and (perhaps more crucially) commit to maintaining it.

tniessen (Member, Author)

I'm experimenting with incremental builds instead of (s)ccache. At least for GitHub Actions, that could bring our CI times down drastically.
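A sketch of that incremental approach, with hypothetical directory names: persist the build tree across CI runs so that make only rebuilds what actually changed.

```shell
# Restore a previous build tree before configuring, and stash it afterwards.
# The configure/make step itself is elided; all paths are illustrative.
CACHED_TREE="${CACHED_TREE:-$HOME/.cache/node-build}"
mkdir -p "$CACHED_TREE"
if [ -d "$CACHED_TREE/out" ] && [ ! -e out ]; then
  cp -a "$CACHED_TREE/out" out   # reuse previous object files
fi
# ... ./configure && make -j"$(nproc)" would run here ...
if [ -d out ]; then
  rm -rf "$CACHED_TREE/out"
  cp -a out "$CACHED_TREE/out"   # save the tree for the next run
fi
echo "build tree cached under: $CACHED_TREE"
```

Unlike a compiler-output cache, this preserves make's own dependency state, so a run that only touches lib or test can skip the V8 compile entirely.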

