Metricbeat downcases host.name #38689

Open
smith opened this issue Mar 29, 2024 · 16 comments
Labels
bug Metricbeat Metricbeat needs_team Indicates that the issue/PR needs a Team:* label

Comments

@smith
Contributor

smith commented Mar 29, 2024

It appears Metricbeat (and possibly other Beats/Agent integrations) converts host.name to all lowercase. This causes problems when trying to match it against host names reported by other sources.

We would expect host.name to be unmodified, as is the case with APM Server.

@smith smith added bug Metricbeat Metricbeat labels Mar 29, 2024
@botelastic botelastic bot added the needs_team Indicates that the issue/PR needs a Team:* label label Mar 29, 2024
@botelastic

botelastic bot commented Mar 29, 2024

This issue doesn't have a Team:<team> label.

@willemdh

willemdh commented Apr 2, 2024

This discussion has been going on for years now.

host.name needs to be normalized and lowercased, exactly for correlation reasons. There are so many data sources, each logging with its own naming conventions. Also, if we for example do a reverse DNS lookup of an IP, the result is always a lowercase FQDN.

APM should also normalize and lowercase host.name if there is an issue there.

Original host name should be in host.hostname
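
The normalization proposed in this comment could look roughly like the sketch below. It is illustrative only: the nested-dict event layout and the `normalize_host` helper are assumptions made for the example, not Metricbeat's actual implementation.

```python
def normalize_host(event: dict) -> dict:
    """Return a copy of the event with host.name lowercased.

    The as-reported value is kept in host.hostname when that field is not
    already set. The nested-dict event layout is an assumption.
    """
    host = dict(event.get("host", {}))
    name = host.get("name")
    if isinstance(name, str):
        host.setdefault("hostname", name)  # preserve the original casing
        host["name"] = name.lower()        # normalized value used for correlation
    return {**event, "host": host}


event = {"host": {"name": "MacBook-Pro.Example.COM"}}
normalized = normalize_host(event)
print(normalized["host"]["name"])      # lowercased name
print(normalized["host"]["hostname"])  # original, unmodified name
```

Because host.name is a keyword field, "MacBook-Pro" and "macbook-pro" are distinct values unless every ingest path applies the same normalization.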

@MikePaquette ;)

@roshan-elastic

Thanks @willemdh

@smith - should we talk to the APM agents team?

@smith
Contributor Author

smith commented Apr 3, 2024

> @smith - should we talk to the APM agents team?

@roshan-elastic I think so. If we're normalizing data we need to do it for all methods of ingest.

@trentm
Member

trentm commented Apr 3, 2024

By APM agent spec (https://github.com/elastic/apm/blob/main/specs/agents/metadata.md#hostname) APM agents should be lowercasing the value they send to APM server (metadata.system.detected_hostname).

This was added to our specs about 9 months ago in elastic/apm#805

This issue, elastic/apm#794, has links to the implementation issues for each of the APM agents. That issue is "closed" for all but the Go APM Agent. We'd have to do some digging to see which version of each APM agent got this change and possibly confirm that they are indeed lowercasing.

Do we have any info on which particular APM agents we are talking about here?


Previous discussion(s):

Other possible wrinkles:

@roshan-elastic

Thanks for this @trentm - it sounds like our intent is to lower-case host.name collected via APM agent so anything which isn't doing this is either:

  • Old (and will be fixed as users upgrade?)
  • Not deployed yet but in the plans (e.g. Go)
  • Unknown (and may need investigating)

> Do we have any info on which particular APM agents we are talking about here?

@smith is this something you or someone in the team can share? I'm only really worried if it's something that isn't going to be addressed eventually.

@trentm
Member

trentm commented Apr 4, 2024

> Do we have any info on which particular APM agents we are talking about here?

I asked on the originating issue: elastic/kibana#178650 (comment)
Caue said it was the Go APM Agent, so that makes sense.

> I'm only really worried if it's something that isn't going to be addressed eventually.

Development focus for the Go Agent is on the OTel side, so I'm not sure how timely any change would be here.

Also I gather we'll have the same issue with OTel APM agents, where the host.name spec differs from the suggestions in ECS's host.name spec. OTel doesn't say anything about normalizing case.

@roshan-elastic

> Development focus for the Go Agent is on the OTel side, so I'm not sure how timely any change would be here.

That's OK - the main thing is that we're aligned on how to solve it (we can sort 'when' via prioritisation etc).

> OTel doesn't say anything about normalizing case.

Great catch.

@AlexanderWert / @mlunadia / @tommyers-elastic - Do you think we can enforce standardisation for OTel data? This issue shows the pitfalls of mixing cases: it leads to duplicate data and confusing user experiences.

Note: This issue is specifically focused on lowercasing host.name

@AlexanderWert
Member

AlexanderWert commented Apr 5, 2024

All of this is a result of this change in ECS (~a year ago): elastic/ecs#2122

So, now we have a mix of old collectors (that do not necessarily lowercase) and newer collectors (that do lowercase).

In OpenTelemetry Semantic Conventions, host.name is not being lowercased (and we can assume that we won't be able to change that): https://opentelemetry.io/docs/specs/semconv/attributes-registry/host/

I think the actual problem is that we use host.name to correlate data and use it as an identifier of the host.
Actually, we should use host.id for correlation and identification, because that one is meant to be unique and reliable in both ECS and SemConv. host.name should rather be used as a display name.

--> I really hope that with Assets / Entities these kinds of things will be resolved!

ECS:
image

OTel SemConv:
image

@trentm
Member

trentm commented Apr 5, 2024

Using host.id sounds good to me. For the current APM agents, it was only very recently added to APM agent specs. Only the Java APM agent will be producing host.id currently. As well, APM server's intakev2 API (used by the APM agents) does not yet handle host.id from APM agents. That's hopefully being added for 8.14.

@roshan-elastic

> actual problem is that we use host.name to correlate data and use it as an identifier of the host.
> Actually, we should use host.id for correlation and identification, because that one is meant to be unique and reliable in both ECS and SemConv. host.name should rather be used as a display name.

That's a great point @AlexanderWert. I think that sounds sensible but I'm worried about what % of our customers will be able to supply this with current collection - especially as we want to leverage the host identifier across metricbeat, filebeat and the elastic agent integrations (and OTel).

Looking at internal collection for different agents on one of our own clusters (us-east-1-logging...), host.id looks pretty sparsely populated (e.g. 2-5% for filebeat), so from what I can see I don't think that's feasible in the short-/medium-term?

Filebeat - 2-5% have host.id
image

Metricbeat - around the same
image

It's a similar story on overview-....kb.us-west2.

I believe this is likely representative of our customer base too...we might be able to get telemetry from the BI team if we need more data.
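
For anyone wanting to reproduce this kind of coverage figure on their own data, here is a hedged sketch. It only builds two request bodies for Elasticsearch's `_count` API using a standard `exists` query; filtering on `agent.type` and the helper names are assumptions, and the index pattern is deliberately left out because it depends on the deployment.

```python
def count_bodies(agent_type: str) -> tuple:
    """Build two _count request bodies: all docs for an agent type,
    and the subset of those docs that have host.id populated.

    Filtering on agent.type is an assumption about how the data is tagged.
    """
    by_agent = {"term": {"agent.type": agent_type}}
    total = {"query": {"bool": {"filter": [by_agent]}}}
    with_id = {
        "query": {
            "bool": {"filter": [by_agent, {"exists": {"field": "host.id"}}]}
        }
    }
    return total, with_id


def coverage_percent(total: int, with_id: int) -> float:
    """Share of documents carrying host.id, as a percentage."""
    return 0.0 if total == 0 else 100.0 * with_id / total


total_body, with_id_body = count_bodies("filebeat")
# Send both bodies to the _count endpoint of the relevant indices,
# then pass the two returned counts to coverage_percent().
```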

Do you have any thoughts?

@smith not sure if you have an opinion on this?

@smith
Contributor Author

smith commented Apr 10, 2024

@roshan-elastic we'll probably have to fall back to attempting to correlate things using host.name for some time, but we should prefer host.id if at all possible.
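
A minimal sketch of that fallback, assuming events are nested dicts; `correlation_key` is a hypothetical helper for illustration, not an existing Beats or Kibana function.

```python
from typing import Optional, Tuple


def correlation_key(event: dict) -> Optional[Tuple[str, str]]:
    """Pick the key used to group events by host.

    Prefers host.id; falls back to a lowercased host.name so that
    mixed-case names from different collectors still match each other.
    Returns None when neither field is present.
    """
    host = event.get("host", {})
    host_id = host.get("id")
    if host_id:
        return ("id", host_id)
    name = host.get("name")
    if name:
        return ("name", name.lower())
    return None


events = [
    {"host": {"id": "55de390e-6781-485a-a5c2-463180e52874", "name": "Web01"}},
    {"host": {"name": "WEB01.example.com"}},
    {"host": {"name": "web01.example.com"}},
]
keys = [correlation_key(e) for e in events]
```

In this example the last two events share a key, but the first does not: an event keyed by host.id and one keyed only by host.name still need a separate id-to-name mapping before they can be joined.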

@willemdh

willemdh commented Apr 11, 2024

@roshan-elastic

Using host.id is absolutely not ideal. We have working correlations between datasets containing lowercase FQDNs from logs and datasets where only an IP is known. A reverse DNS lookup enables us to correlate network data (which does not contain any hostnames) with host data. Please, please let's not go back in time and choose a solution which doesn't make any sense.

A lowercase FQDN in host.name is really the primary key you want to correlate on, NOT host.id, as a lot of datasets contain an id like '55de390e-6781-485a-a5c2-463180e52874'. How on earth are we supposed to correlate that with a lowercase FQDN in a dataset which has absolutely no idea where it would get this host.id from??
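
The reverse-DNS enrichment described in this comment might be sketched as follows. The event shape (`source.ip`, `host.name`) and the helper names are assumptions for illustration; only `socket.gethostbyaddr` is a real standard-library call.

```python
import socket
from typing import Callable, Optional


def reverse_lookup(ip: str) -> Optional[str]:
    """Resolve an IP to a lowercase FQDN, or None if the lookup fails."""
    try:
        fqdn, _aliases, _addresses = socket.gethostbyaddr(ip)
    except OSError:  # covers socket.herror and socket.gaierror
        return None
    return fqdn.lower()


def enrich_with_host_name(
    flow: dict,
    resolve: Callable[[str], Optional[str]] = reverse_lookup,
) -> dict:
    """Attach a lowercase host.name to a network event that only has source.ip.

    The resolver is injectable so the function can be exercised without DNS.
    """
    ip = flow.get("source", {}).get("ip")
    name = resolve(ip) if ip else None
    if name is None:
        return flow  # nothing to add; leave the event untouched
    return {**flow, "host": {**flow.get("host", {}), "name": name.lower()}}


# Usage with a stub resolver standing in for real DNS:
stub = {"10.0.0.5": "Web01.Example.com"}.get
enriched = enrich_with_host_name({"source": {"ip": "10.0.0.5"}}, resolve=stub)
```

The enriched event only joins with host data if the other datasets also carry a lowercase FQDN in host.name, which is the point being argued here.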

@roshan-elastic

roshan-elastic commented Apr 11, 2024

@willemdh ➕ and thanks for the detail.

@ash-darin

@smith For your immediate problem: AFAIK, unless instructed otherwise, Metricbeat sets agent.name to the same value as host.name without the domain, but preserving case. Is this also lowercased now? Would that serve as a useful alternative for you?

Personally I agree with this: whenever someone tells me to check a host, I have to double-check whether it is spelled with capitals or not. The fields are of type "keyword", so that matters. Isn't this a problem that is isolated to Windows? I am unaware of Unix-like systems that return mixed-case hostnames.

@willemdh Metricbeat (8.11.4) does not generate host.hostname on my system, nor agent.hostname.

@smith
Contributor Author

smith commented Apr 11, 2024

> Isn't this a problem that is isolated to Windows? I am unaware of Unix-like systems that return mixed-case hostnames.

We first diagnosed it on macOS.


6 participants