
Limit policy maximum age didn't cleanup resulting storage fill [v2.10.18] #5795

Open
b2broker-yperfilov opened this issue Aug 16, 2024 · 6 comments
Labels
defect Suspected defect such as a bug or regression

Comments

@b2broker-yperfilov

Observed behavior

We are using the Limits retention policy with a maximum age of 15 minutes. However, 1 of 3 nodes did not clean up its storage in time, so the storage filled up and the node crashed.

In the screenshot below, you can see the storage usage stats of the 3 nodes. Notice that the blue one has much larger storage usage compared to the red and yellow nodes.
[Screenshot: CleanShot 2024-08-16 at 13 09 03@2x — storage usage of the 3 nodes]

The screenshot below is from the NATS dashboard; you can see that the stream message count also rose significantly.
[Screenshot: CleanShot 2024-08-16 at 13 14 27@2x — stream message count]

The configuration of the stream is shown in the screenshot below. The stream was recreated during an attempt to fix the issue, but it has exactly the same settings. Note the Max Age of 15 minutes, as well as the typical byte size and message count.
[Screenshot: CleanShot 2024-08-16 at 13 10 59@2x — stream configuration]
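
For reference, settings like the ones described above would roughly correspond to a stream created along these lines with the nats CLI (a sketch only; the stream name and subjects here are placeholders, not the actual values from the screenshot):

    # Recreate a file-backed, R3 stream whose only limit is a 15-minute max age
    nats stream add EVENTS \
      --subjects 'events.>' \
      --storage file \
      --replicas 3 \
      --retention limits \
      --max-age 15m \
      --max-bytes=-1 \
      --max-msgs=-1 \
      --discard old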

In the logs, there were errors (repeated several times):

2024-08-15 19:50:32.572	{"time":"2024-08-15T16:50:32.57225584Z","_p":"F","log":"[181] 2024/08/15 16:50:32.572168 [ERR] JetStream resource limits exceeded for server"}

Please let me know if you need any additional details.

Expected behavior

The Limits policy cleans up expired messages as expected on every node.

Server and client version

Server 2.10.18

Host environment

K8s

      resources:
        limits:
          cpu: 400m
          memory: 768Mi
        requests:
          cpu: 400m
          memory: 768Mi

Steps to reproduce

Not clear.

@b2broker-yperfilov b2broker-yperfilov added the defect Suspected defect such as a bug or regression label Aug 16, 2024
@derekcollison
Member

When something like this happens, we ask the developer to capture some profiles for us, specifically CPU, memory (heap), and stacksz / goroutines.
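
For anyone hitting the same thing: one way to capture those is to enable the server's profiling port (prof_port in the server configuration) and pull the standard Go pprof endpoints from it. A sketch, assuming prof_port is set to 65432 and the pod is reachable on that port:

    # 30-second CPU profile, allocation/heap profile, and full goroutine stack dump
    curl -o cpu.prof   'http://localhost:65432/debug/pprof/profile?seconds=30'
    curl -o mem.prof   'http://localhost:65432/debug/pprof/allocs'
    curl -o stacks.txt 'http://localhost:65432/debug/pprof/goroutine?debug=2'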

@b2broker-yperfilov
Author

@derekcollison here are screenshots of some metrics. I went through many memory metrics, and all of them look quite stable.

[Screenshots: six memory and resource metric panels, CleanShot 2024-08-19, 08:46–08:49]

@derekcollison
Member

The stream info shows the only limit you have in place, which is age, and it appears to be working correctly. What do you think is not working correctly?

Also, do you properly set GOMEMLIMIT?
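
(If it helps others reading along: GOMEMLIMIT is usually set as a container environment variable somewhat below the memory limit; a sketch against the 768Mi limit shown above, with the exact value being an assumption rather than a recommendation from this thread:)

      env:
        - name: GOMEMLIMIT
          value: "640MiB"   # a bit below the 768Mi container limit
      resources:
        limits:
          memory: 768Mi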

@b2broker-yperfilov
Author

@derekcollison we do not have GOMEMLIMIT set. That said, the issue is not with the pod's memory; the issue is with disk storage.

We have replication across 3 nodes for this stream, which means each message should be copied to 3 nodes, so at any time the same amount of space should be occupied on each node (assuming all other streams also have a replication factor of 3). However, one of the nodes did not follow this rule, as can be seen in the initial message, resulting in a disk-space leak.
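
(One way to compare the replicas directly, as a sketch assuming the nats CLI is pointed at this cluster and using a placeholder stream name:)

    # Per-server JetStream usage, then per-replica state (including lag) for the stream
    nats server report jetstream
    nats stream info EVENTS --json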

@derekcollison
Member

Can you share a du -sh from the store directory for the one that has increased disk usage?
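
(For example, something along these lines from inside each pod; the path is an assumption based on a typical Helm-chart layout, not taken from this deployment:)

    du -sh /data/jetstream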

@b2broker-yperfilov
Author

@derekcollison
It is now 1.3G. Another node is at 102.0M, and the third is at 97.4M.

@wallyqs wallyqs changed the title Limit policy maximum age didn't cleanup resulting storage fill Limit policy maximum age didn't cleanup resulting storage fill [v2.10.18] Sep 5, 2024