
Potential Memory Leak Due to Spans #815

Closed
dcurletti opened this issue Jan 4, 2020 · 13 comments
Labels: bug (Something isn't working), community

Comments

@dcurletti

dcurletti commented Jan 4, 2020

Describe the bug
[chart: used memory growing in line with the count of finished dd-trace spans kept in memory]

We've been having a memory leak for some time now and have started looking for a solution. We enabled the Datadog runtime metrics, and our growing used memory seems to be in line with the number of finished dd-trace spans being kept in memory (chart above).

[chart: a more detailed breakdown of the finished spans by name]

This previous thread seems to be relevant to us, but upgrading to the latest tracer hasn't fixed it.

Garbage collection seems to be doing its job, since our asynchronous resource counts are relatively flat, but it's something we could still test (e.g. manually running GC every X interval).
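
If we do run that test, it would look roughly like this (a minimal sketch, assuming the process is started with node --expose-gc so that global.gc exists; the 60-second interval is arbitrary):

// Run with: node --expose-gc app.js
// Periodically force a full collection so heap growth can't be blamed on lazy GC.
if (typeof global.gc === 'function') {
  setInterval(() => {
    global.gc();
    console.log('manual GC; heapUsed =', process.memoryUsage().heapUsed);
  }, 60 * 1000);
}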

Haven't seen anything suspicious in the heap dumps, but we'll give them a second pass soon. Any help would be appreciated, and we'll report back ASAP if we find anything on our end.

Environment

  • Operating system: GNU/Linux
  • Node version: 12.13.0
  • Tracer version: 0.16.3
  • Agent version: 6.13.0
dcurletti added the bug label on Jan 4, 2020
@rochdev
Member

rochdev commented Jan 6, 2020

One thing I'm noticing in the charts above is that there are many finished spans in memory but almost no unfinished spans. That almost certainly rules out an unfinished trace holding all of its finished spans in memory, unless you have traces with a very large number of spans, in which case just a few traces could hold thousands of finished spans. That doesn't seem to be the case here, but correct me if I'm wrong.

Since the only place that holds spans in memory other than the trace itself is the scope manager, I wouldn't necessarily rule out the asynchronous resources even if it looks unlikely from the graphs. There still seems to be a small increase over time, which could cause this kind of issue, although usually at a much lower scale given the small number of resources.
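
For context, the scope manager is the part that keeps the active span associated with the code (and any async work) running inside it — roughly this, using only the public API (sketch only; doWork() is a placeholder):

// Rough sketch using the public dd-trace API; not the internal bookkeeping itself.
const tracer = require('dd-trace').init();

const span = tracer.startSpan('example.operation');
tracer.scope().activate(span, async () => {
  // While this callback and anything async it starts are running, the scope
  // manager keeps a reference to `span` so child spans can find their parent.
  await doWork();
  span.finish();
});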

I think this is a case where we'll need to look into your account. Can you open a support issue and share the ticket ID, or alternatively DM me in our public Slack?

@dcurletti
Author

yup, slacking you now. thanks!

@victorboissiere

We also have the same issue with the same tracer version, 0.16.3. Did you find a fix in the meantime?

@victorboissiere

@dcurletti which Datadog plugins do you use? We seem to have a problem with the MongoDB plugin in some cases. Still investigating.

@rochdev
Member

rochdev commented Jan 20, 2020

@victorboissiere Do you have runtime metrics enabled? I'd like to see if it's exactly the same behavior.

Also, why do you suspect the Mongo plugin specifically? Are you able to fix the issue by disabling it?
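
For reference, runtime metrics are turned on at initialization, roughly like this (minimal sketch; the agent also needs to accept DogStatsD traffic):

// Enables runtime metrics (heap, event loop, span counts) reported to the agent.
const tracer = require('dd-trace').init({
  runtimeMetrics: true
});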

@JunkyDeLuxe

Hello, there is a very large memory leak with the mongodb plugin.
It's very easy to reproduce via Node.js streams (Node v12+):

// This runs inside an Express-style handler, so `res`, `brand`, `ProductsService`,
// `CsvTransformer` and `log` come from the surrounding application code.
import * as fastCsv from 'fast-csv';
import { pipeline } from 'stream';
import { promisify } from 'util';

const asyncPipeline = promisify(pipeline); // assumed: a promisified stream.pipeline

const filename: string = 'test.csv';

// Backticks are needed here so ${filename} is actually interpolated.
res.setHeader('Content-disposition', `attachment; filename=${filename}`);
res.writeHead(200, { 'Content-Type': 'text/csv' });

const csvStream = fastCsv.createWriteStream({ headers: true, delimiter: ';' });

await asyncPipeline(
	ProductsService.getOesByBrand(brand), // the main mongo stream
	new CsvTransformer(), // a stream transformer, nothing unusual here
	csvStream, // csv lib in order to transform a mongo object to a csv line
	res // response piped
).catch((err) => {
	log.error(err);
});

Where ProductsService.getOesByBrand(brand) is just a Mongo stream, like this one:

db.collection('collectionName').find({});

@rochdev
Member

rochdev commented Jan 22, 2020

@JunkyDeLuxe Thanks for the reproduction snippet. We'll look into this as soon as possible. Is there a reason why you explicitly used a transform? Does the problem still occur without it? Just trying to see how we can get a reproduction that is as minimal as possible.

@victorboissiere

@rochdev I am trying to find the right time to run the tests on our staging environment. Yes, disabling the mongodb plugin fixed the issue. However, I will try enabling plugin auto-discovery; my configuration was:

tracer.init({ plugins: false });
tracer.use('http', {
  blacklist: ['/ping']
});
tracer.use('mongodb-core', {});
tracer.use('amqplib', {});

so I will try using just:

tracer.init({ plugins: true });

and check whether that fixes the issue; if not, I'll run another test with runtime metrics enabled.

@benjamine
Contributor

benjamine commented Feb 1, 2020

Got bit by this too, apparently when iterating over a long Mongo cursor.
ddTracer.use('mongodb-core', false); solves the issue.

  • node@12.13.1
  • dd-trace@0.17.0
  • mongodb-core@2.1.20
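
For anyone applying the same workaround, it's roughly this shape (sketch only; all other integrations stay enabled):

// Keep the tracer, but turn off only the mongodb-core integration.
const ddTracer = require('dd-trace').init();
ddTracer.use('mongodb-core', false);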

@rochdev
Member

rochdev commented Feb 18, 2020

I'm unable to reproduce. At this point I'd need a reproduction snippet that I can run directly for both the stream and the cursor issues (which may or may not be related).

@rochdev
Member

rochdev commented Feb 18, 2020

More information about how big the stream/cursor is would help as well. For example, 1 query with 1 million documents is very different from 1000 queries of 1000 documents each.
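
Something self-contained along these lines is what I mean — a sketch only, where the connection string, database/collection names and data volume are all placeholders (assumes the mongodb 3.x driver):

// Minimal reproduction sketch: stream one large find() cursor with the
// mongodb-core integration enabled and watch the finished-span count in
// runtime metrics.
const tracer = require('dd-trace').init({ runtimeMetrics: true });
const { MongoClient } = require('mongodb');
const { Writable } = require('stream');

async function main() {
  const client = await MongoClient.connect('mongodb://localhost:27017', {
    useUnifiedTopology: true
  });
  const cursor = client.db('test').collection('products').find({});

  // With the 3.x driver the cursor is itself a Readable stream.
  await new Promise((resolve, reject) => {
    cursor
      .pipe(new Writable({
        objectMode: true,
        write(doc, enc, cb) { cb(); } // discard documents; only memory matters here
      }))
      .on('finish', resolve)
      .on('error', reject);
  });

  await client.close();
}

main().catch(console.error);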

@rochdev
Member

rochdev commented Apr 20, 2020

Is this still an issue? If that's the case, can you provide a reproduction snippet?

@andrewsouthard1

Hey all, if you're continuing to see this issue please reach out to us at support@datadoghq.com
