Avoid Pod/container duplicates or mismatch #21842

Closed
exekias opened this issue Oct 15, 2020 · 4 comments · Fixed by #22344
Labels: Agent, discuss, enhancement, Team:Platforms, v7.11.0

exekias commented Oct 15, 2020

Composable inputs launch one input per matching entity from the provider, as long as the resulting input configuration is different. The Kubernetes provider emits one entity per container in the cluster, which means the provider emits an event for every container in every Pod.

For the logs use case this is what we want: containers log to separate files, and it also allows us to provide more specific metadata (kubernetes.container.name), so users can differentiate their log streams per container.

For metrics, it is sometimes not possible to differentiate the containers within a Pod, because we reach them through the Pod IP, which all containers share. This is fine in principle: when a config only uses kubernetes.pod.ip, dynamic inputs should be able to generate the same config from several events (one per container in the Pod). The problem right now is that these events carry different metadata (kubernetes.container.name), and this results in several instances of the input being spawned.
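To make the duplication concrete, here is a minimal sketch in Go. It is not the Agent's actual code: the `event` type, `render`, `flatten`, `countInputs`, `podEvents`, the port, and the sample values are all hypothetical. It only assumes that duplicate detection compares the rendered config together with the attached metadata, which is the behaviour described above.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// event is a hypothetical stand-in for what the provider emits:
// variables used to render the input, plus metadata attached to it.
type event struct {
	vars     map[string]string
	metadata map[string]string
}

// render builds an input config that only references kubernetes.pod.ip,
// as a metrics input reaching the Pod through its IP would.
func render(e event) string {
	return "hosts: [" + e.vars["kubernetes.pod.ip"] + ":9090]"
}

// flatten turns a map into a deterministic string so it can be compared.
func flatten(m map[string]string) string {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	parts := make([]string, 0, len(keys))
	for _, k := range keys {
		parts = append(parts, k+"="+m[k])
	}
	return strings.Join(parts, ",")
}

// countInputs returns how many distinct inputs survive deduplication.
// With includeMetadata set, the metadata is part of the comparison key.
func countInputs(events []event, includeMetadata bool) int {
	seen := map[string]struct{}{}
	for _, e := range events {
		key := render(e)
		if includeMetadata {
			key += "|" + flatten(e.metadata)
		}
		seen[key] = struct{}{}
	}
	return len(seen)
}

// podEvents models one Pod with two containers: a Pod-only event
// followed by one event per container, all sharing the same Pod IP.
func podEvents() []event {
	vars := map[string]string{"kubernetes.pod.ip": "10.0.0.5"}
	return []event{
		{vars: vars, metadata: map[string]string{"kubernetes.pod.name": "web"}},
		{vars: vars, metadata: map[string]string{"kubernetes.pod.name": "web", "kubernetes.container.name": "nginx"}},
		{vars: vars, metadata: map[string]string{"kubernetes.pod.name": "web", "kubernetes.container.name": "sidecar"}},
	}
}

func main() {
	events := podEvents()
	fmt.Println(countInputs(events, true))  // metadata in the key: 3
	fmt.Println(countInputs(events, false)) // config only: 1
}
```

All three events render the same config, so comparing the config alone yields a single input, while including the metadata yields three.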

Also, we don't really want to add container metadata, as we don't know which container we are talking to. That's why the kubernetes provider also submits an event with Pod info only:

// Emit the pod
// We emit Pod + containers to ensure that configs matching Pod only
// get Pod metadata (not specific to any container)
p.comm.AddOrUpdate(string(pod.GetUID()), mapping, processors)
// Emit all containers in the pod
p.emitContainers(pod, pod.Spec.Containers, pod.Status.ContainerStatuses)
// TODO deal with init containers stopping after initialization
p.emitContainers(pod, pod.Spec.InitContainers, pod.Status.InitContainerStatuses)

We need to discuss our options to avoid this duplication. Right now metadata is what causes it. I wonder if we should take it out of the algorithm that removes duplicates and only add it after reconciliation.

exekias added the enhancement, discuss, Team:Platforms, Team:Ingest Management, and Agent labels on Oct 15, 2020
@elasticmachine

Pinging @elastic/ingest-management (Team:Ingest Management)

@elasticmachine

Pinging @elastic/integrations-platforms (Team:Platforms)

@blakerouse

I think the solution to this is to defer the processor promotion until after all the inputs have been generated and the duplicates calculated. That will ensure that only one input is created for the match, and that the processors from the dynamic provider are promoted only in that one case.

AddOrUpdate does handle the ordering, so as long as the first match only provides kubernetes.pod.* metadata, that is all the generated input would carry, and the other matching events would not produce additional generated configurations.
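A rough sketch of that ordering in Go, under stated assumptions: `match`, `input`, `generateInputs`, and `sampleMatches` are hypothetical names for illustration, not the Agent's real types, and processors are modelled as plain strings. Duplicates are detected on the rendered config alone, and processors are promoted only from the first match that produced each config.

```go
package main

import "fmt"

// match is a hypothetical pairing of a rendered input config with the
// processors supplied by the dynamic provider event that matched.
type match struct {
	config     string
	processors []string
}

// input is the final generated input after deduplication.
type input struct {
	config     string
	processors []string
}

// generateInputs deduplicates on the rendered config only, then
// promotes processors from the first match for each distinct config.
// Matches are expected in provider order (Pod event before containers).
func generateInputs(matches []match) []input {
	seen := map[string]bool{}
	var out []input
	for _, m := range matches {
		if seen[m.config] {
			continue // same config already generated: drop the duplicate
		}
		seen[m.config] = true
		out = append(out, input{
			config:     m.config,
			processors: append([]string(nil), m.processors...),
		})
	}
	return out
}

// sampleMatches models a Pod-only event followed by two container
// events that all render the same config from the shared Pod IP.
func sampleMatches() []match {
	cfg := "hosts: [10.0.0.5:9090]"
	return []match{
		{config: cfg, processors: []string{"add_fields: kubernetes.pod.name=web"}},
		{config: cfg, processors: []string{"add_fields: kubernetes.pod.name=web", "add_fields: kubernetes.container.name=nginx"}},
		{config: cfg, processors: []string{"add_fields: kubernetes.pod.name=web", "add_fields: kubernetes.container.name=sidecar"}},
	}
}

func main() {
	for _, in := range generateInputs(sampleMatches()) {
		fmt.Println(in.config, in.processors)
	}
}
```

Because the Pod-only event comes first, the single surviving input carries Pod-level processors only, and the container-specific ones are never promoted.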

exekias commented Oct 27, 2020

That sounds like a good solution to me!
