[YSQL] Prometheus metrics endpoint :13000/prometheus-metrics missing HELP and TYPE metadata #23578

Closed
1 task done
alencar opened this issue Aug 21, 2024 · 2 comments
Labels
area/ysql Yugabyte SQL (YSQL) kind/bug This issue is a bug priority/medium Medium priority issue

alencar commented Aug 21, 2024

Jira Link: DB-12496

Description

The Prometheus metrics endpoint is not consistent with other endpoints, which provide both HELP and TYPE metadata, unless ?show_help=true is added to the request. Without HELP and TYPE metadata, integrations with tools such as Datadog Agent 7.x break.
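
For scripted scrapes, the `show_help` query parameter mentioned above can be set programmatically. The following is a minimal Python sketch; the helper name `with_show_help` is hypothetical, and the sketch only builds the URL. It does not check whether a given endpoint or version honors the parameter.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_show_help(url, show_help):
    """Return `url` with its show_help query parameter set to true or false.

    Hypothetical helper: it only rewrites the URL and makes no request.
    """
    parts = urlsplit(url)
    # Keep every existing query parameter except a previous show_help value.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "show_help"
    ]
    query.append(("show_help", "true" if show_help else "false"))
    return urlunsplit(parts._replace(query=urlencode(query)))


print(with_show_help("http://10.27.82.136:13000/prometheus-metrics", True))
# prints http://10.27.82.136:13000/prometheus-metrics?show_help=true
```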

Environment

  • YugabyteDB 2024.1.0.0
  • Ubuntu Linux 22.04.4 LTS

Expected response

% curl http://$(hostname -i):13000/prometheus-metrics
# HELP handler_latency_yb_ysqlserver_SQLProcessor_SelectStmt_count Time spent processing a SELECT statement
# TYPE handler_latency_yb_ysqlserver_SQLProcessor_SelectStmt_count counter
handler_latency_yb_ysqlserver_SQLProcessor_SelectStmt_count{metric_id="yb.ysqlserver",metric_type="server",exported_instance="ip-10-27-82-136:9000"} 3543927249 1724259743522
...
%

Response

% curl http://$(hostname -i):13000/prometheus-metrics
handler_latency_yb_ysqlserver_SQLProcessor_SelectStmt_count{metric_id="yb.ysqlserver",metric_type="server",exported_instance="ip-10-27-82-136:9000"} 3543927249 1724259743522
handler_latency_yb_ysqlserver_SQLProcessor_SelectStmt_sum{metric_id="yb.ysqlserver",metric_type="server",exported_instance="ip-10-27-82-136:9000"} 16555250877107 1724259743522
handler_latency_yb_ysqlserver_SQLProcessor_InsertStmt_count{metric_id="yb.ysqlserver",metric_type="server",exported_instance="ip-10-27-82-136:9000"} 8215197 1724259743522
handler_latency_yb_ysqlserver_SQLProcessor_InsertStmt_sum{metric_id="yb.ysqlserver",metric_type="server",exported_instance="ip-10-27-82-136:9000"} 235785792 1724259743522
handler_latency_yb_ysqlserver_SQLProcessor_DeleteStmt_count{metric_id="yb.ysqlserver",metric_type="server",exported_instance="ip-10-27-82-136:9000"} 67701419 1724259743522
handler_latency_yb_ysqlserver_SQLProcessor_DeleteStmt_sum{metric_id="yb.ysqlserver",metric_type="server",exported_instance="ip-10-27-82-136:9000"} 70727904583 1724259743522
handler_latency_yb_ysqlserver_SQLProcessor_UpdateStmt_count{metric_id="yb.ysqlserver",metric_type="server",exported_instance="ip-10-27-82-136:9000"} 67701419 1724259743522
handler_latency_yb_ysqlserver_SQLProcessor_UpdateStmt_sum{metric_id="yb.ysqlserver",metric_type="server",exported_instance="ip-10-27-82-136:9000"} 149636790644 1724259743522
handler_latency_yb_ysqlserver_SQLProcessor_BeginStmt_count{metric_id="yb.ysqlserver",metric_type="server",exported_instance="ip-10-27-82-136:9000"} 314116 1724259743522
handler_latency_yb_ysqlserver_SQLProcessor_BeginStmt_sum{metric_id="yb.ysqlserver",metric_type="server",exported_instance="ip-10-27-82-136:9000"} 12608950 1724259743522
handler_latency_yb_ysqlserver_SQLProcessor_CommitStmt_count{metric_id="yb.ysqlserver",metric_type="server",exported_instance="ip-10-27-82-136:9000"} 314061 1724259743522
handler_latency_yb_ysqlserver_SQLProcessor_CommitStmt_sum{metric_id="yb.ysqlserver",metric_type="server",exported_instance="ip-10-27-82-136:9000"} 1215738 1724259743522
handler_latency_yb_ysqlserver_SQLProcessor_RollbackStmt_count{metric_id="yb.ysqlserver",metric_type="server",exported_instance="ip-10-27-82-136:9000"} 0 1724259743522
handler_latency_yb_ysqlserver_SQLProcessor_RollbackStmt_sum{metric_id="yb.ysqlserver",metric_type="server",exported_instance="ip-10-27-82-136:9000"} 0 1724259743522
handler_latency_yb_ysqlserver_SQLProcessor_OtherStmts_count{metric_id="yb.ysqlserver",metric_type="server",exported_instance="ip-10-27-82-136:9000"} 76540957 1724259743522
handler_latency_yb_ysqlserver_SQLProcessor_OtherStmts_sum{metric_id="yb.ysqlserver",metric_type="server",exported_instance="ip-10-27-82-136:9000"} 771848860260 1724259743522
handler_latency_yb_ysqlserver_SQLProcessor_Single_Shard_Transactions_count{metric_id="yb.ysqlserver",metric_type="server",exported_instance="ip-10-27-82-136:9000"} 0 1724259743522
handler_latency_yb_ysqlserver_SQLProcessor_Single_Shard_Transactions_sum{metric_id="yb.ysqlserver",metric_type="server",exported_instance="ip-10-27-82-136:9000"} 0 1724259743522
handler_latency_yb_ysqlserver_SQLProcessor_SingleShardTransactions_count{metric_id="yb.ysqlserver",metric_type="server",exported_instance="ip-10-27-82-136:9000"} 0 1724259743522
handler_latency_yb_ysqlserver_SQLProcessor_SingleShardTransactions_sum{metric_id="yb.ysqlserver",metric_type="server",exported_instance="ip-10-27-82-136:9000"} 0 1724259743522
handler_latency_yb_ysqlserver_SQLProcessor_Transactions_count{metric_id="yb.ysqlserver",metric_type="server",exported_instance="ip-10-27-82-136:9000"} 3687859065 1724259743522
handler_latency_yb_ysqlserver_SQLProcessor_Transactions_sum{metric_id="yb.ysqlserver",metric_type="server",exported_instance="ip-10-27-82-136:9000"} 16765996310822 1724259743522
handler_latency_yb_ysqlserver_SQLProcessor_AggregatePushdowns_count{metric_id="yb.ysqlserver",metric_type="server",exported_instance="ip-10-27-82-136:9000"} 70 1724259743522
handler_latency_yb_ysqlserver_SQLProcessor_AggregatePushdowns_sum{metric_id="yb.ysqlserver",metric_type="server",exported_instance="ip-10-27-82-136:9000"} 999999045 1724259743522
handler_latency_yb_ysqlserver_SQLProcessor_CatalogCacheMisses_count{metric_id="yb.ysqlserver",metric_type="server",exported_instance="ip-10-27-82-136:9000"} 33128930 1724259743522
handler_latency_yb_ysqlserver_SQLProcessor_CatalogCacheMisses_sum{metric_id="yb.ysqlserver",metric_type="server",exported_instance="ip-10-27-82-136:9000"} 0 1724259743522
yb_ysqlserver_active_connection_total{metric_id="yb.ysqlserver",metric_type="server",exported_instance="ip-10-27-82-136:9000"} 6 1724259743522
yb_ysqlserver_connection_total{metric_id="yb.ysqlserver",metric_type="server",exported_instance="ip-10-27-82-136:9000"} 251 1724259743522
yb_ysqlserver_max_connection_total{metric_id="yb.ysqlserver",metric_type="server",exported_instance="ip-10-27-82-136:9000"} 15000 1724259743522
yb_ysqlserver_connection_over_limit_total{metric_id="yb.ysqlserver",metric_type="server",exported_instance="ip-10-27-82-136:9000"} 0 1724259743522
yb_ysqlserver_new_connection_total{metric_id="yb.ysqlserver",metric_type="server",exported_instance="ip-10-27-82-136:9000"} 313944 1724259743522
ysql_conn_mgr_num_pools{exported_instance="ip-10-27-82-136:9000",metric_type="server",metric_id="yb.ysqlserver"} 0 1724259743522
ysql_conn_mgr_last_updated_timestamp{exported_instance="ip-10-27-82-136:9000",metric_type="server",metric_id="yb.ysqlserver"} 0 1724259743522
%
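
To confirm the problem on a given node, the response body can be scanned for samples that have no matching `# TYPE` line. This is a rough Python sketch, not a full exposition-format parser: the function name `untyped_metrics` is hypothetical, it matches TYPE names exactly (so it does not map `_sum`/`_count`/`_bucket` suffixes back to a histogram or summary family), and the sample lines are taken from this issue with the label sets trimmed for brevity.

```python
import re

# Metric name at the start of a sample line (everything before '{' or a space).
_SAMPLE_NAME = re.compile(r"^([A-Za-z_:][A-Za-z0-9_:]*)")


def untyped_metrics(exposition_text):
    """Return sample names that have no preceding '# TYPE <name> <type>' line."""
    typed = set()
    missing = []
    for raw in exposition_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#"):
            parts = line.split()
            # A TYPE comment has the shape: # TYPE <metric_name> <type>
            if len(parts) >= 4 and parts[1] == "TYPE":
                typed.add(parts[2])
            continue
        match = _SAMPLE_NAME.match(line)
        if match:
            name = match.group(1)
            if name not in typed and name not in missing:
                missing.append(name)
    return missing


sample = "\n".join([
    "# TYPE handler_latency_yb_ysqlserver_SQLProcessor_SelectStmt_count counter",
    'handler_latency_yb_ysqlserver_SQLProcessor_SelectStmt_count{metric_id="yb.ysqlserver"} 3543927249 1724259743522',
    'yb_ysqlserver_connection_total{metric_id="yb.ysqlserver"} 251 1724259743522',
])
print(untyped_metrics(sample))
# prints ['yb_ysqlserver_connection_total']
```

An empty list means every sample in the response was preceded by a TYPE declaration for its exact name.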

Relates to #17297

Issue Type

kind/bug

Warning: Please confirm that this issue does not contain any sensitive information

  • I confirm this issue does not contain any sensitive information.
@alencar alencar added area/ysql Yugabyte SQL (YSQL) status/awaiting-triage Issue awaiting triage labels Aug 21, 2024
@yugabyte-ci yugabyte-ci added kind/bug This issue is a bug priority/medium Medium priority issue labels Aug 21, 2024

alencar commented Aug 21, 2024

Here is an example of what Datadog Agent 7.x logs when it finds a metric without a TYPE:

  Metadata
  ========
    config.hash: openmetrics:ybdb:b5c41279b0529ca7
    config.provider: file
2024-08-20 22:43:02 UTC | CORE | DEBUG | (pkg/collector/python/check.go:99 in runCheckImpl) | Running python check openmetrics (version: '4.2.2', id: 'openmetrics:ybdb:c92edf462ce2c325')
2024-08-20 22:43:02 UTC | CORE | DEBUG | (pkg/collector/python/datadog_agent.go:133 in LogMessage) | openmetrics:ybdb:c92edf462ce2c325 | (base.py:71) | Scraping OpenMetrics endpoint: http://10.27.82.136:13000/prometheus-metrics
2024-08-20 22:43:02 UTC | CORE | DEBUG | (pkg/collector/python/datadog_agent.go:133 in LogMessage) | - | (connectionpool.py:246) | Starting new HTTP connection (1): 10.27.82.136:13000
2024-08-20 22:43:02 UTC | CORE | DEBUG | (pkg/collector/python/datadog_agent.go:133 in LogMessage) | - | (connectionpool.py:474) | http://10.27.82.136:13000 "GET /prometheus-metrics HTTP/1.1" 200 616
2024-08-20 22:43:02 UTC | CORE | DEBUG | (pkg/collector/python/datadog_agent.go:133 in LogMessage) | openmetrics:ybdb:c92edf462ce2c325 | (transform.py:108) | Metric `handler_latency_yb_ysqlserver_SQLProcessor_SelectStmt_count` has no type, so you must define one in the `metrics` setting
2024-08-20 22:43:02 UTC | CORE | DEBUG | (pkg/collector/python/datadog_agent.go:133 in LogMessage) | openmetrics:ybdb:c92edf462ce2c325 | (transform.py:108) | Metric `handler_latency_yb_ysqlserver_SQLProcessor_SelectStmt_sum` has no type, so you must define one in the `metrics` setting

@yugabyte-ci yugabyte-ci added kind/enhancement This is an enhancement of an existing feature and removed kind/bug This issue is a bug status/awaiting-triage Issue awaiting triage labels Aug 26, 2024

m-iancu commented Aug 26, 2024

See also #17016 -- Most likely need to do the same thing for the 13000 endpoint. cc @kai-franz

@yugabyte-ci yugabyte-ci added kind/bug This issue is a bug and removed kind/enhancement This is an enhancement of an existing feature labels Aug 26, 2024
kai-franz added a commit that referenced this issue Sep 11, 2024
Summary:
Similar to D25254, adds the following metadata to the YSQL Prometheus metrics endpoint for each metric:
  - `# HELP`: A brief description of the metric.
  - `# TYPE`: The type of the metric, in this case either gauge or counter.

Gauge metrics can increase or decrease: for example, the number of YSQL connections
is a gauge metric.

Counters can only increase: for example, the number of select statements executed is
a counter metric.

HELP and TYPE metadata are shown by default. To get the metrics without the metadata, use the `?show_help=false` URL parameter.
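
As an illustration of the behavior described above, here is a small Python sketch of a renderer that emits `# HELP` and `# TYPE` lines unless `show_help` is false. It is not the actual implementation (which lives in the C++ webserver code); the function name `render_metric` and its parameters are hypothetical, and label-value escaping is omitted.

```python
def render_metric(name, value, metric_type, help_text, labels=None, show_help=True):
    """Render one metric in Prometheus text format (illustrative sketch only)."""
    if metric_type not in ("gauge", "counter"):
        raise ValueError(f"unsupported metric type: {metric_type}")
    lines = []
    if show_help:
        # Metadata lines come before the sample they describe.
        lines.append(f"# HELP {name} {help_text}")
        lines.append(f"# TYPE {name} {metric_type}")
    label_str = ""
    if labels:
        pairs = ",".join(f'{key}="{val}"' for key, val in labels.items())
        label_str = "{" + pairs + "}"
    lines.append(f"{name}{label_str} {value}")
    return "\n".join(lines)


name = "handler_latency_yb_ysqlserver_SQLProcessor_SelectStmt_count"
help_text = "Time spent processing a SELECT statement"
labels = {"metric_id": "yb.ysqlserver"}

# Default: metadata included.
print(render_metric(name, 3543927249, "counter", help_text, labels))
# Equivalent of ?show_help=false: only the sample line.
print(render_metric(name, 3543927249, "counter", help_text, labels, show_help=False))
```

The first call prints three lines (HELP, TYPE, then the sample); the second prints only the sample line.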

Connection manager metric descriptions are taken from D27240.

Also, move `ParseRequestOptions` out of the anonymous namespace in `default-path-handlers.cc` so that the pgsql webserver can use it.
Jira: DB-12496

Test Plan:
```
./yb_build.sh --cxx-test pgwrapper_pg_libpq-test --gtest_filter PgLibPqTest.CatalogCacheIdMissMetricsTest
./yb_build.sh --cxx-test pgwrapper_pg_libpq-test --gtest_filter PgLibPqTest.PrometheusMetricsHelpAndTypeTest
```

Reviewers: yyan, myang

Reviewed By: myang

Subscribers: svc_phabricator, esheng, yql, ybase

Differential Revision: https://phorge.dev.yugabyte.com/D37766
jasonyb pushed a commit that referenced this issue Sep 12, 2024
Summary:
 f5ad1fb [#23855] Fixing the calculation of automatic refactoring count.
 eed826d [#22519] YSQL: Move ExplicitRowLockBuffer class into separate file
 99a27e6 [PLAT-14522] Taking yba-ctl backups with prometheus HTTPS
 e2a84b0 [PLAT-15044] Add preflight check for node addition in provider
 7e40d89 [#20769] XCluster: Dynamically apply cdc_wal_retention_time_secs for XCluster
 aa41478 [#23858] build: fix ./yb_build.sh release --gcc11
 bcf7f47 [PLAT-13910] Improve IAM credentials fetch logging and add retries
 303a202 [#23778] xCluster: Remove the capability to rename xCluster replication groups
 Excluded: 58fd26e [#23652] YSQL: Fix TestPgRegressAnalyze.java timeout / database drop failure on TSAN build
 31da65b [doc] yb_enable_bitmapscan flag (#23854)
 798db14 [#20335] DocDB: Use MonoClock for write query metric
 80779d8 [#23860] xCluster: Add automatic ddl mode proto fields
 afd763d Revert "Revert "[PLAT-14786] Add support to node_agent install to use bind ip and node_external_fqdn""
 e86951a [#23841] docdb: Disable stack trace tracking in TSAN builds
 d600608 [PLAT-15244] Fix schedule not getting updated on edit schedule API call
 3c0df09 [PLAT-15214][PLAT-15232]YBC version upgrade to 2.2.0.0-b6 and enable YBC verbose by default
 aa7372e [#23478] YSQL: fix connection manager session variable case sensitivity issue
 0d53558 [PLAT-14810][PLAT-14811][YBA CLI] Support adding and editing EIT configurations
 ffa537e [PLAT-10706][dr] Support retry-ability of failover and switchover
 Excluded: 5ae4558 [#23578] YSQL: Add HELP and TYPE to :13000/prometheus-metrics
 3aa7459 [PLAT-15016] Handle gflag_group changes for ENHANCED_PG_COMPATIBILITY group in 2024.1.3
 c89356c [PLAT-10592][YBA] Changes to support global tserver/master service in K8s
 7fc3b76 [#3893] YCQL: Align 2 system_schema.* tables with Cassandra
 da6274e [23646] Test Stability: Fix PgMiniTest.FollowerReads
 5fa6dc9 [PLAT-15180][Platform][UI][PITR]Create Restore Backup modal
 be0d1d1 [PLAT-15247][Platform][Backup]Create Backup scheduled policy List

Test Plan: Jenkins: rebase: pg15-cherrypicks

Reviewers: jason, tfoucher

Subscribers: yql

Differential Revision: https://phorge.dev.yugabyte.com/D37981
kai-franz added a commit that referenced this issue Sep 12, 2024
…/prometheus-metrics

Summary:
Original commit: 5ae4558 / D37766
Similar to D25254, adds the following metadata to the YSQL Prometheus metrics endpoint for each metric:
  - `# HELP`: A brief description of the metric.
  - `# TYPE`: The type of the metric, in this case either gauge or counter.

Gauge metrics can increase or decrease: for example, the number of YSQL connections
is a gauge metric.

Counters can only increase: for example, the number of select statements executed is
a counter metric.

HELP and TYPE metadata are shown by default. To get the metrics without the metadata, use the `?show_help=false` URL parameter.

Connection manager metric descriptions are taken from D27240.

Also, move `ParseRequestOptions` out of the anonymous namespace in `default-path-handlers.cc` so that the pgsql webserver can use it.
Jira: DB-12496

Test Plan:
```
./yb_build.sh --cxx-test pgwrapper_pg_libpq-test --gtest_filter PgLibPqTest.CatalogCacheIdMissMetricsTest
./yb_build.sh --cxx-test pgwrapper_pg_libpq-test --gtest_filter PgLibPqTest.PrometheusMetricsHelpAndTypeTest
```

Reviewers: jason, tfoucher

Reviewed By: jason

Subscribers: ybase, yql, esheng, svc_phabricator

Differential Revision: https://phorge.dev.yugabyte.com/D37995
4 participants