
Potential Memory Leak Due to Spans #815

Closed
dcurletti opened this issue Jan 4, 2020 · 13 comments
Labels: bug (Something isn't working), community

Comments

@dcurletti

dcurletti commented Jan 4, 2020

Describe the bug
[chart: used memory growing in line with the count of finished dd-trace spans kept in memory]

We've been having a memory leak for some time now and have started looking for a solution. We enabled the Datadog runtime metrics, and our growing used memory seems to be in line with the number of finished dd-trace spans being kept in memory (chart above).

[chart: a more detailed breakdown of the finished spans by name]

This previous thread seems to be relevant to us, but upgrading to the latest tracer hasn't fixed it.

Garbage collection seems to be doing its job, since our asynchronous resource counts are relatively flat, but it's something we could still test (e.g. manually running GC every X interval).
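
If we do run that test, it would look roughly like this (a minimal sketch, assuming the process is started with node --expose-gc so that global.gc exists; the 60-second interval is arbitrary):

// Run with: node --expose-gc app.js
// Periodically force a full collection so heap growth can't be blamed on lazy GC.
if (typeof global.gc === 'function') {
  setInterval(() => {
    global.gc();
    console.log('manual GC; heapUsed =', process.memoryUsage().heapUsed);
  }, 60 * 1000);
}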

Haven't seen anything suspicious in the heap dumps, but we'll give them a second pass soon. Any help would be appreciated, and we'll report back ASAP if we find anything on our end.

Environment

  • Operating system: GNU/Linux
  • Node version: 12.13.0
  • Tracer version: 0.16.3
  • Agent version: 6.13.0
dcurletti added the bug label on Jan 4, 2020
@rochdev
Member

rochdev commented Jan 6, 2020

One thing I'm noticing in the charts above is that there are many finished spans in memory but almost no unfinished spans. That almost certainly rules out an unfinished trace holding all of its finished spans in memory, unless you have traces with a very large number of spans, in which case just a few traces could hold thousands of finished spans. That doesn't seem to be the case here, but correct me if I'm wrong.

Since the only place that holds spans in memory other than the trace itself is the scope manager, I wouldn't necessarily rule out the asynchronous resources even if it looks unlikely from the graphs. There still seems to be a small increase over time, which could cause this kind of issue, although usually at a much lower scale given the small number of resources.
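
For context, the scope manager is the part that keeps the active span associated with the code (and any async work) running inside it — roughly this, using only the public API (sketch only; doWork() is a placeholder):

// Rough sketch using the public dd-trace API; not the internal bookkeeping itself.
const tracer = require('dd-trace').init();

const span = tracer.startSpan('example.operation');
tracer.scope().activate(span, async () => {
  // While this callback and anything async it starts are running, the scope
  // manager keeps a reference to `span` so child spans can find their parent.
  await doWork();
  span.finish();
});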

I think this is a case where we'll need to look into your account. Can you open a support issue and share the ticket ID, or alternatively DM me in our public Slack?

@dcurletti
Author

yup, slacking you now. thanks!

@victorboissiere

We also have the same issue with the same tracer version, 0.16.3. Did you find a fix in the meantime?

@victorboissiere

@dcurletti which Datadog plugins do you use? We seem to have a problem with the MongoDB plugin in some cases. Still investigating.

@rochdev
Member

rochdev commented Jan 20, 2020

@victorboissiere Do you have runtime metrics enabled? I'd like to see if it's exactly the same behavior.

Also, why do you suspect the Mongo plugin specifically? Are you able to fix the issue by disabling it?
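
For reference, runtime metrics are turned on at initialization, roughly like this (minimal sketch; the agent also needs to accept DogStatsD traffic):

// Enables runtime metrics (heap, event loop, span counts) reported to the agent.
const tracer = require('dd-trace').init({
  runtimeMetrics: true
});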

@JunkyDeLuxe

Hello, there is a very large memory leak with the mongodb plugin.
It's very easy to reproduce via Node.js streams (Node v12+):

// This runs inside an Express-style handler, so `res`, `brand`, `ProductsService`,
// `CsvTransformer` and `log` come from the surrounding application code.
import * as fastCsv from 'fast-csv';
import { pipeline } from 'stream';
import { promisify } from 'util';

const asyncPipeline = promisify(pipeline); // assumed: a promisified stream.pipeline

const filename: string = 'test.csv';

// Backticks are needed here so ${filename} is actually interpolated.
res.setHeader('Content-disposition', `attachment; filename=${filename}`);
res.writeHead(200, { 'Content-Type': 'text/csv' });

const csvStream = fastCsv.createWriteStream({ headers: true, delimiter: ';' });

await asyncPipeline(
	ProductsService.getOesByBrand(brand), // the main mongo stream
	new CsvTransformer(), // a stream transformer, nothing unusual here
	csvStream, // csv lib in order to transform a mongo object to a csv line
	res // response piped
).catch((err) => {
	log.error(err);
});

Where ProductsService.getOesByBrand(brand) is just a Mongo stream, like this one:

db.collection('collectionName').find({});

@rochdev
Member

rochdev commented Jan 22, 2020

@JunkyDeLuxe Thanks for the reproduction snippet. We'll look into this as soon as possible. Is there a reason why you explicitly used a transform? Does the problem still occur without it? Just trying to see how we can get a reproduction that is as minimal as possible.

@victorboissiere

@rochdev I am trying to find the right time to run the tests on our staging environment. Yes, disabling the mongodb plugin fixed the issue. However, I will try enabling plugin auto-discovery; my configuration was:

tracer.init({ plugins: false });
tracer.use('http', {
  blacklist: ['/ping']
});
tracer.use('mongodb-core', {});
tracer.use('amqplib', {});

so I will try using just:

tracer.init({ plugins: true });

and check whether that fixes the issue; if not, I'll run another test with runtime metrics enabled.

@benjamine
Contributor

benjamine commented Feb 1, 2020

Got bit by this too, apparently when iterating over a long Mongo cursor.
ddTracer.use('mongodb-core', false); solves the issue.

  • node@12.13.1
  • dd-trace@0.17.0
  • mongodb-core@2.1.20
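
For anyone applying the same workaround, it's roughly this shape (sketch only; all other integrations stay enabled):

// Keep the tracer, but turn off only the mongodb-core integration.
const ddTracer = require('dd-trace').init();
ddTracer.use('mongodb-core', false);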

@rochdev
Member

rochdev commented Feb 18, 2020

I'm unable to reproduce. At this point I'd need a reproduction snippet that I can run directly for both the stream and the cursor issues (which may or may not be related).

@rochdev
Member

rochdev commented Feb 18, 2020

More information about how big the stream/cursor is would help as well. For example, 1 query with 1 million documents is very different from 1000 queries of 1000 documents each.
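
Something self-contained along these lines is what I mean — a sketch only, where the connection string, database/collection names and data volume are all placeholders (assumes the mongodb 3.x driver):

// Minimal reproduction sketch: stream one large find() cursor with the
// mongodb-core integration enabled and watch the finished-span count in
// runtime metrics.
const tracer = require('dd-trace').init({ runtimeMetrics: true });
const { MongoClient } = require('mongodb');
const { Writable } = require('stream');

async function main() {
  const client = await MongoClient.connect('mongodb://localhost:27017', {
    useUnifiedTopology: true
  });
  const cursor = client.db('test').collection('products').find({});

  // With the 3.x driver the cursor is itself a Readable stream.
  await new Promise((resolve, reject) => {
    cursor
      .pipe(new Writable({
        objectMode: true,
        write(doc, enc, cb) { cb(); } // discard documents; only memory matters here
      }))
      .on('finish', resolve)
      .on('error', reject);
  });

  await client.close();
}

main().catch(console.error);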

@rochdev
Member

rochdev commented Apr 20, 2020

Is this still an issue? If that's the case, can you provide a reproduction snippet?

@andrewsouthard1

Hey all, if you're continuing to see this issue please reach out to us at support@datadoghq.com
