
Provide detailed memory metrics via prometheus plugin #11743

Closed
der-eismann opened this issue Jul 18, 2024 · 6 comments · Fixed by #11746
@der-eismann

Is your feature request related to a problem? Please describe.

Hey everyone, we are currently working on replacing the soon-to-be-EOL https://github.com/kbudde/rabbitmq_exporter with the built-in prometheus plugin. With that exporter it was possible to get detailed memory statistics from the management plugin, which have helped us debug issues:
https://github.com/rabbitmq/rabbitmq-server/blob/main/deps/rabbitmq_management/priv/www/js/tmpl/memory.ejs#L9-L31

Unfortunately I was unable to get these metrics from the prometheus plugin; the only thing that came close was process_resident_memory_bytes (https://github.com/rabbitmq/rabbitmq-server/blob/main/deps/rabbitmq_prometheus/src/collectors/prometheus_rabbitmq_core_metrics_collector.erl#L72).

Describe the solution you'd like

Provide all memory metrics from the management UI via prometheus plugin

Describe alternatives you've considered

No response

Additional context

No response

@michaelklishin
Member

This is open source software; you are welcome to contribute what you find missing.

The data comes from rabbit_vm:memory/0 and the metrics belong to this group.
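For context, rabbit_vm:memory/0 returns a proplist of per-category byte counts (the same categories the management UI memory page breaks down). A minimal sketch for inspecting it on a running node, assuming you have shell access and can use rabbitmqctl eval; the exact categories vary by version and workload:

```erlang
%% Print the per-category memory breakdown returned by rabbit_vm:memory/0.
%% Run it against a live node, for example with:
%%   rabbitmqctl eval 'lists:foreach(fun({K, V}) -> io:format("~p: ~p~n", [K, V]) end, rabbit_vm:memory()).'
%% Most values are integers (bytes); a few entries may themselves be proplists.
lists:foreach(
  fun({Category, Value}) -> io:format("~p: ~p~n", [Category, Value]) end,
  rabbit_vm:memory()).
```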

michaelklishin added commits that referenced this issue Jul 18, 2024

@michaelklishin michaelklishin added this to the 3.13.5 milestone Jul 18, 2024

mergify bot pushed commits that referenced this issue Jul 18, 2024:

    (cherry picked from commit c361edd)

    Closes #11743.
    (cherry picked from commit 5dad0f8)
    (cherry picked from commit d1a7167)
    # Conflicts:
    #	deps/rabbitmq_prometheus/src/collectors/prometheus_rabbitmq_core_metrics_collector.erl

    (cherry picked from commit c361edd)
    (cherry picked from commit 5396599)

michaelklishin added commits that referenced this issue Jul 19, 2024
@der-eismann
Author

Wow, I wasn't even able to finish my Erlang introductory course in that short time. Thanks for adding these metrics so quickly!

@michaelklishin michaelklishin modified the milestones: 3.13.5, 4.0.0 Jul 19, 2024
@michaelklishin
Member

These metrics are fairly expensive with many queues and streams, so we will limit this to 4.0 and look for ways to optimize this or make this opt-in.

@mkuratczyk
Contributor

@der-eismann We have now merged this into main/4.0 (but not 3.13). There's a dedicated endpoint for these metrics: https://www.rabbitmq.com/docs/next/prometheus#memory-breakdown-endpoint
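If you just want to eyeball the data before building panels, here is a minimal sketch for pulling it straight from the plugin. 15692 is the plugin's default port, and the endpoint path below is only a guess based on the linked docs page, so verify both against the docs for your version:

```erlang
#!/usr/bin/env escript
%% Fetch the memory breakdown metrics over HTTP and print the raw
%% Prometheus text exposition output.
%% Assumptions: the plugin listens on its default port (15692) and the
%% endpoint path matches the docs linked above; adjust both as needed.
main(_Args) ->
    {ok, _} = application:ensure_all_started(inets),
    Url = "http://localhost:15692/metrics/memory-breakdown",
    {ok, {{_Version, 200, _Reason}, _Headers, Body}} =
        httpc:request(get, {Url, []}, [], []),
    io:format("~s~n", [Body]).
```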

However, I struggle to find a nice Grafana visualization for these metrics. There are quite a few of them and, multiplied by the number of nodes in the cluster, that's a lot of data points. Are you currently visualizing these metrics from the exporter?
Can you share what that looks like? Ideally, if you could contribute a panel for them, that'd be great.

The RabbitMQ Overview dashboard JSON source file is here if you want to give it a try: https://github.com/rabbitmq/rabbitmq-server/blob/main/deps/rabbitmq_prometheus/docker/grafana/dashboards/RabbitMQ-Overview.json

@der-eismann
Author

Hey @mkuratczyk, we used these metrics to figure out why our memory consumption was so high, and with them we noticed that a huge chunk was allocated but unused. The visualization is more of a quick-and-dirty kind, but I can try to polish it and contribute it.
These are still from the old exporter, though; we don't have the 4.0 beta running yet. I need to invest some time in that, and I'm not sure when I can find it in the next two weeks.

[Screenshot screenshot-20240723-153120: memory visualization from the old exporter]

@mkuratczyk
Contributor

That's ok, no rush. It seems like the external exporter provided fewer metrics, and you still presented them separately for each node (which totally makes sense). As usual, the problem for us is that when we provide something, users expect it to "just work everywhere", and some users have 9 or more nodes in the cluster, so that's suddenly quite a few new panels. Perhaps a separate dashboard would be useful. Then we can just do it per node and use Grafana's repeat option.
