Elastic-agent should not swallows stderr and panics #88

Closed
Tracked by #26930 ...
mtojek opened this issue Jan 27, 2022 · 8 comments · Fixed by #455
Labels
bug, enhancement, Team:Elastic-Agent-Control-Plane, v8.3.0

Comments

@mtojek
Contributor

mtojek commented Jan 27, 2022

Hi Team,

This issue proved that we need a more user-friendly method to collect all panics. Otherwise we won't be able to get anything from customers.

In the mentioned case I used strace to see what filebeat was writing to /dev/null, but we need a more trivial method.

@mtojek mtojek added the Team:Elastic-Agent-Control-Plane label Jan 27, 2022
@elasticmachine
Collaborator

Pinging @elastic/elastic-agent-control-plane (Team:Elastic-Agent-Control-Plane)

@jlind23 jlind23 added the enhancement label Jan 27, 2022
@mtojek
Contributor Author

mtojek commented Jan 27, 2022

@jlind23 I suggest giving it a higher priority, as it may result in a pile of mysterious SDH issues: the process doesn't run, we don't see any panics, no errors, no logs collected, etc.

@jlind23
Contributor

jlind23 commented Jan 27, 2022

@ruflin @ph In your opinion, would this be something easy to put in place?

@ph
Contributor

ph commented Jan 27, 2022

I don't think it would be much effort, and this would align with the v2 changes for logging. I.e., we should be able to stream stderr from a subprocess into the Elastic Agent Appender.

@jlind23 jlind23 added the v8.2.0 label Jan 28, 2022
@ruflin
Member

ruflin commented Jan 31, 2022

I like the idea of starting to feed this into the elastic-agent logs. This will help us to already gain some experience around how it will work in v2 as @ph pointed out.

@jlind23 jlind23 transferred this issue from elastic/beats Mar 7, 2022
@ph ph changed the title from "elastic-agent swallows stderr and panics" to "Elastic-agent should not swallows stderr and panics" Mar 17, 2022
@AndersonQ
Member

Jotting down ideas:

  • do we have panics (stderr) logged to systemd/journald?
  • could we collect systemd/journald logs using the agent's diagnostics collect command?
  • if the logger is set up when the agent panics, we could recover and properly log the panic
  • is there any trade-off to always collecting the agent's systemd/journald logs in diagnostics collect?

@ph
Contributor

ph commented Mar 23, 2022

Funny thing is, if the Agent recovers and we are running the journald input, I presume we would get that information in Elasticsearch.

@mtojek
Contributor Author

mtojek commented May 23, 2022

Awesome, thanks @AndersonQ!

6 participants