Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(gatsby): lower memory pressure in SSR #30793

Merged
merged 2 commits into from
Apr 9, 2021
Merged

Conversation

vladar
Copy link
Contributor

@vladar vladar commented Apr 9, 2021

Description

1. Lower memory pressure

This PR lowers peak memory consumption of gatsby builds up to 2x.

Before this PR we put massive pressure on memory and FS while running SSR in workers. This PR restricts concurrency of html-file writes as well as page-data file reads which greatly reduces this pressure.

The improvement is proportional to the number of CPU cores and the size of your page-data files and HTML. On my test site, memory consumption per SSR worker dropped from 500MB+ to ~200MB.

For a machine with 4 physical cores, it means ~1200MB reduction of peak memory.

2. WorkerPool unification

Also, I've noticed that we create SSR worker pool twice. One worker pool was effectively used for develop and the other - for builds but they both were created for gatsby build.

This PR unifies the build and develop behavior - now both use the same Worker pool and we don't create unnecessary processes.

Note that those two changes are independant.

@gatsbot gatsbot bot added the status: triage needed Issue or pull request that need to be triaged and assigned to a reviewer label Apr 9, 2021
@vladar vladar added topic: scaling builds and removed status: triage needed Issue or pull request that need to be triaged and assigned to a reviewer labels Apr 9, 2021
}
})
},
{ concurrency: 2 }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wouldn't we want the concurrency to match the size of the jest worker pool?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is concurrency within one worker. So this effectively sets total concurrency across all workers to workerPoolSize * 2. I was hesitating if we should set { concurrency: 1 } or { concurrency: 2 } here but 2 seems to keep memory usage as low as 1, so settled on value 2.

To clarify your question - do you mean changing it to 1 here? Because setting it to workerPoolSize will effectively make it workerPoolSize * workerPoolSize which doesn't make much sense to me unless I am missing something.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh my bad. I didn't read this closely and thought this was in the main node process. Yeah 2 makes sense then. SSR is pure CPU so reading in any more page-data.json files wouldn't speed up anything and 1 would very slightly slow it down as the next SSR has to wait for the read to finish.

@KyleAMathews
Copy link
Contributor

Great find!

@vladar vladar merged commit c03e562 into master Apr 9, 2021
@vladar vladar deleted the vladar/fix-ssr-peak-memory branch April 9, 2021 20:04
vladar added a commit that referenced this pull request Apr 13, 2021
* fix(gatsby): lower memory pressure in SSR

* Remove redundant worker pool

(cherry picked from commit c03e562)
vladar added a commit that referenced this pull request Apr 13, 2021
* fix(gatsby): lower memory pressure in SSR

* Remove redundant worker pool

(cherry picked from commit c03e562)

Co-authored-by: Vladimir Razuvaev <vladimir.razuvaev@gmail.com>
This was referenced Apr 13, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants