
Metrics not being persisted in single binary mode #6119

Open
balajisa09 opened this issue Jul 26, 2024 · 1 comment

Comments


balajisa09 commented Jul 26, 2024

Describe the bug
I am running Cortex in single binary mode in Kubernetes with a PVC, and I have noticed that metrics are not being persisted for more than 5 hours. I have attached the config. A Prometheus instance is sending metrics to Cortex via remote write. There is enough space on the disk too.

To Reproduce
Steps to reproduce the behavior:

  1. Start Cortex v1.17.1 and Prometheus v2.52.0.
  2. Visualize the metrics via Grafana or any other tool.

Expected behavior
The metrics should stay for the retention period given in the Cortex config.

Environment:

  • Infrastructure: Kubernetes
  • Deployment tool: Helm

Additional Context

Cortex config:

config.yaml: |
  auth_enabled: false

  server:
    http_listen_port: 9009
  
    # Configure the server to allow messages up to 100MB.
    grpc_server_max_recv_msg_size: 104857600
    grpc_server_max_send_msg_size: 104857600
    grpc_server_max_concurrent_streams: 1000

    http_tls_config:
      client_auth_type: RequireAndVerifyClientCert

    grpc_tls_config:
      client_auth_type: RequireAndVerifyClientCert
    log_level: debug
      
  distributor:
    shard_by_all_labels: true
    pool:
      health_check_ingesters: true

  ingester_client:
    grpc_client_config:
      # Configure the client to allow messages up to 100MB.
      max_recv_msg_size: 104857600
      max_send_msg_size: 104857600
      grpc_compression: gzip
  
  ingester:
    lifecycler:
      # The address to advertise for this ingester.  Will be autodiscovered by
      # looking up address on eth0 or en0; can be specified if this fails.
      # address: 127.0.0.1
  
      # We want to start immediately and flush on shutdown.
      min_ready_duration: 0s
      final_sleep: 0s
      num_tokens: 512
  
      # Use an in memory ring store, so we don't need to launch a Consul.
      ring:
        kvstore:
          store: inmemory
        replication_factor: 1
  
  blocks_storage:
    tsdb:
      dir: /data
      retention_period: 168h
  
    bucket_store:
      sync_dir: /data

    backend: filesystem
    filesystem:
      dir: /data/fake

  compactor:
    data_dir: /tmp/cortex/compactor
    sharding_ring:
      kvstore:
        store: inmemory
  
  frontend_worker:
    match_max_concurrent: true

Prometheus remote write config:

additionalRemoteWrite: 
- url: http://ingest.abc.com/metrics/v1/push
  writeRelabelConfigs:
  - sourceLabels: [__name__]
    regex: '.*'
    action: 'replace'
    targetLabel: 'captain_domain'
    replacement: {{ .Values.captain_domain }}
  - sourceLabels: [__name__]
    regex: '.*'
    action: 'replace'
    targetLabel: 'abc_platform_version'
    replacement: {{ .Chart.Version }}

@danielblando
Contributor

Do you know if the data is being deleted or just not being queried? Can you see blocks older than 5h on disk if you check /data?

Also, we should have some logs when deleting blocks:

msg="Deleting obsolete block" block=blockId

Can you see those logs? Is it possible to check how old the blockIDs being deleted are? You can try to look for other logs with the blockID or, if you're lucky, still get info from the blocks on disk.
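
For reference, a minimal sketch for listing block IDs and their time ranges on disk, assuming the blocks live under the configured /data directory and follow the standard TSDB block layout (one meta.json per block); check_blocks.py is a hypothetical helper, not part of Cortex:

# check_blocks.py - hypothetical diagnostic script, not shipped with Cortex.
# Walks the TSDB directory and prints each block's ID and time range from its
# meta.json (minTime/maxTime are milliseconds since the Unix epoch).
import json
import pathlib
from datetime import datetime, timezone

data_dir = pathlib.Path("/data")  # adjust to match blocks_storage.tsdb.dir

for meta_path in sorted(data_dir.glob("**/meta.json")):
    try:
        meta = json.loads(meta_path.read_text())
    except (OSError, json.JSONDecodeError):
        continue  # skip partially written or unrelated meta.json files
    min_t = datetime.fromtimestamp(meta["minTime"] / 1000, tz=timezone.utc)
    max_t = datetime.fromtimestamp(meta["maxTime"] / 1000, tz=timezone.utc)
    print(f"{meta_path.parent.name}  {min_t.isoformat()} -> {max_t.isoformat()}")

Running something like this inside the pod (against the mounted PVC) should show whether blocks older than ~5h are still present on disk or have already been deleted.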
