Potential Memory Leak Due to Spans #815
One thing I'm noticing in the above charts is that there are many finished spans in memory but almost no unfinished spans. This almost certainly rules out an unfinished trace holding all of its finished spans in memory, unless you have traces with a very large number of spans, where just a few traces could hold thousands of spans; that doesn't seem to be the case here, but correct me if I'm wrong. Since the only place that holds spans in memory other than the trace itself is the scope manager, I wouldn't necessarily rule out the asynchronous resources even if the graphs make that look unlikely. There still seems to be a small increase over time, which could cause this kind of issue, although usually at a much lower scale given the small number of resources. I think this is a case where we'll need to look into your account. Can you open a support issue and share the ticket ID, or alternatively DM me in our public Slack?
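For reference, the retention pattern described above would look something like this (a contrived sketch, not your actual workload):

```js
// Contrived sketch: a root span that is never finished keeps all of its
// finished children buffered, because the tracer can only flush a trace
// once it is complete.
const tracer = require('dd-trace').init()

const root = tracer.startSpan('web.request')

for (let i = 0; i < 10000; i++) {
  const child = tracer.startSpan('mongodb.query', { childOf: root })
  child.finish() // finished, but retained until the whole trace flushes
}

// root.finish() is intentionally never called, so the 10,000 finished
// child spans above stay reachable in memory.
```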
Yup, slacking you now. Thanks!
We also have the same issue with the same tracer agent |
@dcurletti which Datadog plugins do you use? We seem to have a problem with the MongoDB plugin in some cases. Still investigating.
@victorboissiere Do you have runtime metrics enabled? I'd like to see if it's exactly the same behavior. Also, why do you suspect the Mongo plugin specifically? Are you able to fix the issue by disabling it?
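For reference, enabling runtime metrics and disabling just the MongoDB plugin looks roughly like this (the plugin name may differ by driver version):

```js
const tracer = require('dd-trace').init({
  runtimeMetrics: true // report heap, GC and event loop metrics
})

// Disable only the suspected plugin while keeping everything else on.
tracer.use('mongodb-core', { enabled: false })
```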
Hello, there is a very large memory leak with the mongodb plugin, where ProductsService.getOesByBrand(brand) is just a Mongo stream.
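Roughly, the code looks like this (a simplified sketch; the connection string, collection, and query are placeholders for our real ones):

```js
const { MongoClient } = require('mongodb')
const { Transform, pipeline } = require('stream')

async function main () {
  const client = await MongoClient.connect('mongodb://localhost:27017')

  // Stand-in for ProductsService.getOesByBrand(brand): a plain cursor stream.
  const source = client.db('shop').collection('products')
    .find({ brand: 'acme' })
    .stream()

  // Each document passes through a Transform before being written out.
  const toLine = new Transform({
    objectMode: true,
    transform (doc, _enc, callback) {
      callback(null, JSON.stringify(doc) + '\n')
    }
  })

  pipeline(source, toLine, process.stdout, (err) => {
    if (err) console.error('stream failed:', err)
    client.close()
  })
}

main()
```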
@JunkyDeLuxe Thanks for the reproduction snippet. We'll look into this as soon as possible. Is there a reason why you explicitly used a transform? Does the problem still occur without it? Just trying to get a reproduction that is as minimal as possible.
@rochdev I am trying to find the right time to run the tests on our staging environment. Yes, disabling the mongodb plugin fixed the issue. However, I will try to enable plugin auto-discovery. My configuration was:
so I will try using just:
and check if it fixes the issue and, if not, run another test with runtime metrics enabled.
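Roughly, the explicit configuration looked like the first block below, and auto-discovery like the second (the plugin names here are illustrative, not my exact list):

```js
// Explicit configuration: auto-discovery off, plugins enabled by hand.
const tracer = require('dd-trace').init({ plugins: false })
tracer.use('http')
tracer.use('express')
tracer.use('mongodb-core')

// Auto-discovery instead: let the tracer instrument every supported
// library it finds.
// const tracer = require('dd-trace').init()
```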
Got bit by this too, apparently when running over a long Mongo cursor.
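Roughly what we were doing (a minimal sketch; the connection string and collection are placeholders):

```js
const { MongoClient } = require('mongodb')

async function main () {
  const client = await MongoClient.connect('mongodb://localhost:27017')
  const cursor = client.db('app').collection('events').find({})

  // Drain a very large result set one document at a time; memory grows
  // while the cursor is open.
  let count = 0
  while (await cursor.hasNext()) {
    await cursor.next()
    count++
  }
  console.log('documents read:', count)
  await client.close()
}

main()
```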
I'm unable to reproduce. At this point I'd need a reproduction snippet that I can run directly, for both the stream and the cursor issues (which may or may not be related).
More information about how big the stream/cursor is would help as well. For example, one query returning 1 million documents is very different from 1,000 queries of 1,000 documents each.
Is this still an issue? If so, can you provide a reproduction snippet?
Hey all, if you're continuing to see this issue, please reach out to us at support@datadoghq.com.
Describe the bug
We've been having a memory leak for some time now and we've started looking for a solution. We enabled the Datadog runtime metrics, and our growing used memory seems to be in line with the number of finished dd-trace spans being kept in memory (picture above). A second chart gives a more detailed breakdown of the spans by name.
This previous thread seems to be relevant to us, but upgrading to the latest tracer hasn't fixed it.
Garbage collection seems to be doing its job, since our asynchronous resource counts are relatively flat, but manually running GC at a fixed interval is something we could still test; see the sketch below.
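A minimal sketch of that experiment, assuming the process is started with the --expose-gc flag (the 60-second interval is arbitrary):

```js
// Run with: node --expose-gc app.js
if (global.gc) {
  setInterval(() => {
    global.gc() // force a full collection
    const { heapUsed } = process.memoryUsage()
    console.log('heap used after forced GC (bytes):', heapUsed)
  }, 60 * 1000)
}
```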
Haven't seen anything suspicious in the heap dumps, but will give them a second pass soon. Any help would be appreciated and we'll report back ASAP if we find anything on our end.
Environment