Cloudwatch Agent metrics endpoint unavailable #1310

Open
krisko opened this issue Aug 20, 2024 · 1 comment


krisko commented Aug 20, 2024

Describe the bug
When running the amazon-cloudwatch-observability agent in EKS (installed as an AWS addon), the agent pods carry Prometheus scrape annotations and the addon creates a cloudwatch-agent-monitoring service. This service should return the Prometheus metrics exposed by the agent, but instead the connection fails:

$ curl cloudwatch-agent-monitoring.amazon-cloudwatch.svc.cluster.local:8888/metrics

curl: (7) Failed to connect to cloudwatch-agent-monitoring.amazon-cloudwatch.svc.cluster.local port 8888 after 4 ms: Could not connect to server
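A curl exit code of 7 means the TCP connection was never established, so the request did not reach any HTTP handler. The sketch below is one way to tell that case apart from DNS, timeout, or HTTP-level failures when probing the endpoint. The function names `explain_curl_exit` and `probe_metrics` are hypothetical helpers introduced here for illustration, not part of the agent or the addon.

```shell
# Hypothetical helper: map a curl exit code to a short diagnosis.
explain_curl_exit() {
  case "$1" in
    0)  echo "ok" ;;
    6)  echo "dns: could not resolve host" ;;
    7)  echo "tcp: failed to connect" ;;
    22) echo "http: error status returned" ;;   # only reported with --fail
    28) echo "timeout" ;;
    *)  echo "other curl error: $1" ;;
  esac
}

# Hypothetical wrapper: probe a URL and print the diagnosis.
# The body is discarded; only the outcome of the request matters here.
probe_metrics() {
  rc=0
  curl --silent --fail --max-time 5 --output /dev/null "$1" || rc=$?
  explain_curl_exit "$rc"
}
```

Running `probe_metrics http://cloudwatch-agent-monitoring.amazon-cloudwatch.svc.cluster.local:8888/metrics` from a pod inside the cluster should print the `tcp` line for the failure reported here, since the reported exit code is 7.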

Annotations on the agent pods:

prometheus.io/path: /metrics
prometheus.io/port: 8888
prometheus.io/scrape: true
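These annotations follow a common convention that scrape configurations use to decide where to collect metrics; they do not by themselves guarantee that anything is listening on the port. As a rough sketch, the URL a scraper would derive from them can be computed as follows. The function `scrape_url_from_annotations` is a hypothetical helper, and the choice to require all three annotations is an assumption of this example.

```shell
# Hypothetical helper: read "prometheus.io/<key>: <value>" lines on stdin and
# print the URL a scraper would derive for the host given as $1.
# Exits non-zero if scraping is not enabled or port/path is missing.
scrape_url_from_annotations() {
  awk -v host="$1" '
    # Match on the key itself so leading decoration or indentation is ignored.
    match($0, /prometheus\.io\/(path|port|scrape):/) {
      key = substr($0, RSTART + 14, RLENGTH - 15)   # text between "prometheus.io/" and ":"
      val = substr($0, RSTART + RLENGTH)            # everything after the colon
      gsub(/^[ \t]+|[ \t]+$/, "", val)              # trim surrounding whitespace
      a[key] = val
    }
    END {
      if (a["scrape"] != "true" || a["port"] == "" || a["path"] == "") exit 1
      printf "http://%s:%s%s\n", host, a["port"], a["path"]
    }'
}
```

Feeding the three annotation lines above to this function with a pod IP or service name as the argument yields a URL on port 8888 with path /metrics, which is the endpoint that fails to connect in this report.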

Steps to reproduce
Install the amazon-cloudwatch-observability addon (tested with addon_version "v1.7.0-eksbuild.1" and "v1.10.0-eksbuild.2").

What did you expect to see?
Querying the agent's /metrics endpoint should return Prometheus-compatible metrics. The example below shows sample output from a different metrics endpoint:

$ curl opentelemetry-operator.opentelemetry-operator-system.svc.cluster.local:8080/metrics
# HELP certwatcher_read_certificate_errors_total Total number of certificate read errors
# TYPE certwatcher_read_certificate_errors_total counter
certwatcher_read_certificate_errors_total 0
# HELP certwatcher_read_certificate_total Total number of certificate reads
# TYPE certwatcher_read_certificate_total counter
certwatcher_read_certificate_total 1
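A quick way to confirm that an endpoint returns output of this shape is to count the sample lines, ignoring the `# HELP` and `# TYPE` comments. The following is a minimal sketch; `count_samples` is a hypothetical helper name, and it checks only the line structure, not full validity of the exposition format.

```shell
# Hypothetical helper: count metric sample lines read from stdin,
# skipping comment lines (starting with "#") and blank lines.
count_samples() {
  awk '!/^#/ && NF > 0 { n++ } END { print n + 0 }'
}
```

Piping the sample output above through `count_samples` gives 2, one for each `certwatcher_*` counter. A result of 0 would suggest the endpoint responded but returned no metrics.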

What did you see instead?
Error message: curl: (7) Failed to connect to cloudwatch-agent-monitoring.amazon-cloudwatch.svc.cluster.local port 8888 after 4 ms: Could not connect to server

What version did you use?
Version: "v1.7.0-eksbuild.1" and "v1.10.0-eksbuild.2"

What config did you use?
Config: default configuration, without any additional values

Environment
OS: EKS cluster v1.29

@jefchien
Contributor

Related to aws/amazon-cloudwatch-agent-operator#190
