fix: increased memory in finalization first appearing in v6.16.0 #3445

Merged
merged 3 commits into nodejs:main
Aug 9, 2024

Conversation

snyamathi
Contributor

@snyamathi snyamathi commented Aug 8, 2024

Holding a reference to the stream in the finalization registry causes a memory leak, even when consuming the body.

This relates to...

https://github.com/orgs/nodejs/discussions/54248

and

vercel/next.js#68636

Rationale

I believe we're maintaining a strong reference to innerResponse.body.stream even when the body is consumed.

Please see the above issues, where many people are seeing a memory leak after updating Node from 20.15.1 to 20.16.0.

I was able to reproduce the issue (vercel/next.js#68636 (comment)) and used that reproduction to find the commit where things started breaking.

I'm not an expert on memory leaks, but this change seems to make sense and fixes the issue.

It seems like this was introduced in 63b7794.

Changes

Use a WeakRef instead: if the stream has already been GC'd then there is nothing to worry about; otherwise we can do the cleanup as usual.
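
To make the approach concrete, here is a minimal sketch of the pattern. Only the registration line mirrors the actual diff further down; the cleanup callback is an assumption about how the registry might dereference the WeakRef, not undici's exact code.

// Sketch of the WeakRef pattern, not undici's exact implementation.
const registry = new FinalizationRegistry((streamRef) => {
  // The Response was collected; check whether its body stream still exists.
  const stream = streamRef.deref()
  if (stream === undefined) {
    // The stream was already GC'd on its own: nothing to clean up.
    return
  }
  if (!stream.locked) {
    // The stream still exists and was never consumed: cancel it so the
    // underlying resources can be released.
    stream.cancel().catch(() => {})
  }
})

// Registration side (this is the line changed in this PR):
// registry.register(response, new WeakRef(innerResponse.body.stream))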

I'm testing this by building Node + Undici locally and then running load testing on https://github.com/snyamathi/68636/tree/simple

When load testing using a build of the prior commit (c0dc3dd), everything is fine. When using commit 63b7794, the memory leak can be observed.

Adding this fix to the main branch fixes the issue.

[chart: memory usage before and after the fix]

The chart above shows the memory usage before (main) and after the fix (my branch)

Features

N/A

Bug Fixes

https://github.com/orgs/nodejs/discussions/54248

Breaking Changes and Deprecations

N/A

Status

Contributor

@Uzlopak Uzlopak left a comment

LGTM

@@ -526,7 +527,7 @@ function fromInnerResponse (innerResponse, guard) {
   setHeadersGuard(response[kHeaders], guard)

   if (hasFinalizationRegistry && innerResponse.body?.stream) {
-    registry.register(response, innerResponse.body.stream)
+    registry.register(response, new WeakRef(innerResponse.body.stream))
Contributor Author

registry.register(target, "some value");

If target is reclaimed, your cleanup callback may be called at some point with the held value you provided for it ("some value" in the above). The held value can be any value you like: a primitive or an object, even undefined. If the held value is an object, the registry keeps a strong reference to it (so it can pass it to your cleanup callback later).

https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/FinalizationRegistry
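
A small standalone illustration of that MDN point (all names here are hypothetical demo code, not undici code; run it with node --expose-gc):

const registry = new FinalizationRegistry((held) => {
  const value = held instanceof WeakRef ? held.deref() : held
  console.log('cleanup ran, held value resolves to:', value)
})

function makeResponseLike () {
  const stream = { name: 'body stream stand-in' }
  const response = { name: 'response stand-in' }

  // Strong variant: the registry would keep `stream` alive until `response`
  // is collected, because object held values are strongly referenced.
  // registry.register(response, stream)

  // Weak variant (what this PR switches to): only the WeakRef wrapper is kept
  // alive, so `stream` can be collected as soon as nothing else holds it.
  registry.register(response, new WeakRef(stream))

  return response
}

let r = makeResponseLike()
r = null          // drop the last strong reference to the response
globalThis.gc()   // request a collection; finalization timing is still engine-defined
setTimeout(() => {}, 100) // keep the process alive briefly so the callback can fire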

Member

If no one merges this before you can, adding this as a comment in the code itself would be appreciated.

@snyamathi snyamathi changed the title from "fix: memory leak" to "fix: memory leak in finalization first appearing in v6.16.1" on Aug 9, 2024
@snyamathi snyamathi changed the title from "fix: memory leak in finalization first appearing in v6.16.1" to "fix: memory leak in finalization first appearing in v6.16.0" on Aug 9, 2024
@Uzlopak Uzlopak merged commit 00552bb into nodejs:main Aug 9, 2024
32 checks passed
@snyamathi snyamathi changed the title from "fix: memory leak in finalization first appearing in v6.16.0" to "fix: increased memory in finalization first appearing in v6.16.0" on Aug 9, 2024
@mcollina
Member

mcollina commented Aug 9, 2024

Thanks for sending a fix!

Were you able to reproduce without Next.js? I have been using this quite a bit and this is the first time I've seen this problem.
I have a feeling that Next's caching logic is interfering somehow.

I really think we need a test to prevent this from regressing. You can find a test to copy in #3199.

Member

@mcollina mcollina left a comment

lgtm

@mcollina
Member

mcollina commented Aug 9, 2024

In the future, please don't land PRs on undici without tests or a clear reproduction.

If we do, let's give those a bit more scrutiny.

@mcollina
Member

mcollina commented Aug 9, 2024

A few notes.

I've verified that:

  1. this change is essentially harmless, so we can keep it in.
  2. using "plain" undici with Next.js does not trigger the problem, so I have absolutely no idea if this is fixing the problem or not.

@mcollina
Member

mcollina commented Aug 9, 2024

@snyamathi how were you able to verify that this fixes the Next.js issue? How can I tell Next to use the patched undici instead of the global one?

mcollina pushed a commit that referenced this pull request Aug 9, 2024
* fix: memory leak

Holding a reference to the stream in the finalization registry causes a memory leak, even when consuming the body.

* docs: add comment explaining the strong reference

* typo
@snyamathi
Contributor Author

snyamathi commented Aug 9, 2024

@mcollina I built everything from source (node+undici) when testing, but you could also

node --max-old-space-size=96 -r ./foo.js ./node_modules/.bin/next start

where foo.js is

globalThis.fetch = require('undici').fetch;

It's difficult to tell what is different with the RSC setup since both Next and React patch the fetch function.

It's hard to separate React/RSC from Next and test them independently (see notes here).

However, I'm leaning toward this issue having more to do with React's patch than Next's patch.

Maybe someone from the React team (cc @sebmarkbage) would be able to chime in? There is some code in React's fetch patch which appears to keep a handle to the response, and that may be what was causing the strong reference to the stream to stick around. I don't know enough about the React caching mechanism to say, though.


It's very hard to disable React's patch since Next bundles its own copy of React within its own dist package:

https://unpkg.com/next@14.2.5/dist/compiled/next-server/app-page.runtime.prod.js

I was able to edit the above file locally to disable React's patch (basically bail here), and that also seems to alleviate the memory issue. So while I think the fix here is still prudent, since it reduces memory pressure with the only downside being the creation and dereferencing of the WeakRef, the React patch is likely what's causing the memory retention.

@KhafraDev
Member

KhafraDev commented Aug 9, 2024

@snyamathi based on your comment, I found that cloned responses can prevent a Response from being collected:

diff --git a/test/fetch/fire-and-forget.js b/test/fetch/fire-and-forget.js
index d8090885..0a5925db 100644
--- a/test/fetch/fire-and-forget.js
+++ b/test/fetch/fire-and-forget.js
@@ -38,7 +38,7 @@ test('does not need the body to be consumed to continue', { timeout: 180_000, sk
     gc(true)
     const array = new Array(batch)
     for (let i = 0; i < batch; i++) {
-      array[i] = fetch(url).catch(() => {})
+      array[i] = fetch(url).then(r => r.clone()).catch(() => {})
     }
     await Promise.all(array)
     await sleep(delay)

Running with node --expose-gc --test test/fetch/fire-and-forget.js, I get:

RSS 63 MB after 50 fetch() requests
...
...
...
RSS 800 MB after 4800 fetch() requests
RSS 807 MB after 4850 fetch() requests
RSS 815 MB after 4900 fetch() requests
RSS 823 MB after 4950 fetch() requests
RSS 831 MB after 5000 fetch() requests

@snyamathi
Contributor Author

snyamathi commented Aug 9, 2024

That sort of makes sense, since cloning the response will create a second copy of the body. If you were to consume (or cancel) the body of the clone, I wonder if the memory usage would not increase in the same way.

But then again, the clone should have been garbage collected at some point. And this cloning seems like exactly what React is doing.
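
A hedged sketch of that experiment, written as a variation of the fire-and-forget test diff shown above (url, batch, and array are reused from that test): cancel the clone's body instead of leaving it dangling, then watch whether RSS still climbs.

// Hypothetical tweak to the loop in the test above, not a verified fix.
for (let i = 0; i < batch; i++) {
  array[i] = fetch(url)
    .then(r => {
      const clone = r.clone()
      // Fully consuming the clone would work too; cancel() is the cheapest way
      // to let its teed branch finish and release buffered data.
      return clone.body ? clone.body.cancel() : undefined
    })
    .catch(() => {})
}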

@Uzlopak
Contributor

Uzlopak commented Aug 9, 2024

Commenting out the following line prevents the memory leak.

const [out1, out2] = body.stream.tee()

@KhafraDev
Member

Actually, that's still an issue, considering I tested on the latest commit to undici. I have no idea what the RSC/Next patch is doing to cause the issue fixed here.

Commenting out the following line prevents the memory leak.

Sure, but it'll break cloning.

var response = await fetch('...')
var clone = response.clone()

assert.deepStrictEqual(await response.text(), await clone.text())

Note that the response we cloned must still have a readable body. We need tee to make the original body accessible (otherwise using a body mixin on a cloned response would lock the original response's body).

I'm pretty sure the clone issue would be resolved by handling it in the same way we currently do for 'normal' responses.
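
To make the tee() point concrete, here is a small standalone sketch using plain WHATWG streams (nothing undici-specific; names are illustrative): each branch is readable independently, but chunks are buffered for a branch until it reads or cancels them, which is the retention an unconsumed clone keeps alive.

// Plain ReadableStream demonstration of tee() semantics.
const source = new ReadableStream({
  start (controller) {
    controller.enqueue('chunk')
    controller.close()
  }
})

const [branchA, branchB] = source.tee()

// Reading branchA does not lock or disturb branchB.
const readerA = branchA.getReader()
readerA.read().then(({ value }) => console.log('A read:', value))

// But the chunk stays queued for branchB until it is read or cancelled.
branchB.cancel().then(() => console.log('B cancelled, its queued data is dropped'))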

@KhafraDev
Member

var A = await fetch('...') // -> A is readable
var B = A.clone() // -> A, B are readable
await B.text() // -> A is readable; B is locked (and vice-versa)

@Uzlopak
Contributor

Uzlopak commented Aug 9, 2024

Of course it will break by commenting it out. I was just investigating whether it is something trivial. But the .tee() call absolutely makes sense, and if you clone, you are responsible for it.

@snyamathi
Contributor Author

snyamathi commented Aug 12, 2024

More context and a test added in #3450 showing a minimal reproduction (https://github.com/snyamathi/undici-mem-leak)


@pepijn-vanvlaanderen

Will this be backported to Node v20?

@mrtimp

mrtimp commented Aug 28, 2024

@pepijn-vanvlaanderen looks like it will: nodejs/node#54274 (comment)
