Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

subscriber: poll ops appear to basically be a memory leak #256

Closed
hawkw opened this issue Jan 7, 2022 · 4 comments · Fixed by #311
Closed

subscriber: poll ops appear to basically be a memory leak #256

hawkw opened this issue Jan 7, 2022 · 4 comments · Fixed by #311
Assignees
Labels
C-subscriber Crate: console-subscriber. S-bug Severity: bug

Comments

@hawkw
Copy link
Member

hawkw commented Jan 7, 2022

AFAICT, when poll ops are recorded, they appear to never go away:

// TODO: drop the poll ops for async ops that have been dropped
all_poll_ops: ShrinkVec<proto::resources::PollOp>,

we should probably fix this.

@hawkw hawkw added S-bug Severity: bug C-subscriber Crate: console-subscriber. labels Jan 7, 2022
@zaharidichev
Copy link
Collaborator

True, will fix that.

@Milo123459
Copy link
Contributor

#184 might be related?

@hawkw
Copy link
Member Author

hawkw commented Feb 23, 2022

#184 might be related?

It's certainly possible!

@zaharidichev are you planning to work on a fix for this, or should I start on it at some point?

@ttys3
Copy link

ttys3 commented Mar 14, 2022

Oh, this confused me for days. thanks for confirm the leak problem.

From my Prometheus graph, I see the leak is very slow, but took a day can see it very obviously

below is just a service only has a /healthz api response ok string and with console subscriber enabled:

2022-03-15_00-21

hawkw added a commit that referenced this issue Mar 14, 2022
## Motivation

Currently, `console-subscriber` retains all `PollOp`s that have been
recorded since the application started, and sends all of them in the
first resource update for a new subscription.

I don't think this is correct, for a few reasons. It's necessary to send
all currently live tasks and resources to new subscriptions, because
future events may reference those tasks or resources. This isn't the
case for `PollOp`s --- unlike tasks, resources, and even async ops, a
`PollOp` doesn't represent an *object*, it represents an *event*, a
single time an object was polled. There's no reason to send all previous
poll ops in the first update for a new subscription.

Storing all the poll ops that have ever occurred results in a memory
leak (see #256); unlike other tracked entities, we never clear anything
out of the vector of all poll ops that have ever occurred in the
program.

## Solution

This branch just removes the `all_poll_ops` `Vec` entirely, and changes
the subscriber to only send new poll ops in the first update.

We *could* change this so that poll ops are associated with a timestamp,
and older ones are cleared out of a vec of all poll ops when they time
out. However, the `tokio-console` CLI doesn't currently have a way of
showing historical poll ops _anyway_, so there's no reason to keep this
data. If that changes, we can put them back, but for now, the simplest
solution is to just remove this.

Fixes #256
hawkw added a commit that referenced this issue Mar 14, 2022
## Motivation

Currently, `console-subscriber` retains all `PollOp`s that have been
recorded since the application started, and sends all of them in the
first resource update for a new subscription.

I don't think this is correct, for a few reasons. It's necessary to send
all currently live tasks and resources to new subscriptions, because
future events may reference those tasks or resources. This isn't the
case for `PollOp`s --- unlike tasks, resources, and even async ops, a
`PollOp` doesn't represent an *object*, it represents an *event*, a
single time an object was polled. There's no reason to send all previous
poll ops in the first update for a new subscription.

Storing all the poll ops that have ever occurred results in a memory
leak (see #256); unlike other tracked entities, we never clear anything
out of the vector of all poll ops that have ever occurred in the
program.

## Solution

This branch just removes the `all_poll_ops` `Vec` entirely, and changes
the subscriber to only send new poll ops in the first update.

We *could* change this so that poll ops are associated with a timestamp,
and older ones are cleared out of a vec of all poll ops when they time
out. However, the `tokio-console` CLI doesn't currently have a way of
showing historical poll ops _anyway_, so there's no reason to keep this
data. If that changes, we can put them back, but for now, the simplest
solution is to just remove this.

Fixes #256
4t145 referenced this issue in 4t145/dng-server Mar 31, 2022
it seems like tokio-console lead memory leakage,
since it's running on the unstable version of tokio.
hawkw added a commit that referenced this issue Apr 11, 2022
<a name="0.1.4"></a>
## 0.1.4  (2022-04-11)

#### Bug Fixes

*  fix memory leak from historical `PollOp`s (#311)
   ([9178ecf](9178ecf), closes [#256](256))

#### Features

* **console-api:**  Update `tonic` to `0.7` (#318) ([83d8a87](83d8a87))
*  don't trace tasks spawned through the console server (#314)
   ([0045e9b](0045e9b))
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-subscriber Crate: console-subscriber. S-bug Severity: bug
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants