Log a warning for event loop delay > threshold #96192

Closed
mshustov opened this issue Apr 4, 2021 · 2 comments · Fixed by #103615
Labels
enhancement New value added to drive a business result Feature:Logging performance Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc

Comments

mshustov commented Apr 4, 2021

Node.js is asynchronous by design: it was created to handle I/O operations quickly, not to perform expensive computations.
Read more: https://nodejs.org/en/docs/guides/dont-block-the-event-loop/
As a result, Kibana starts choking when it has to perform a large amount of computation. That can lead to problems that are hard to debug: users can observe occasional failures in unrelated modules that sometimes cannot be explained.

To simplify the diagnosis of problems related to event loop delays, Core could enhance the existing metrics service logging to notify users when the system is under heavy load. This way, users can diagnose the problem and keep the whole system running, potentially by disabling some plugins or changing parameters (e.g. reducing the task manager poll interval).

Scope:

  • There is no "standard" value for the Node.js event loop delay: we might need to rely on gathered Kibana APM metrics to find a threshold that balances helpfulness against noise level.
  • Add logs that are rate-limited to avoid spamming
  • Add telemetry when the threshold is violated (TBD which numbers to report: max, min, median, etc.)
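
To make the first two scope items more concrete, here is a minimal sketch of a rate-limited threshold check. All names (`RateLimitedDelayWarner`, `thresholdMs`, `minIntervalMs`) are hypothetical and not taken from Kibana, and the numbers in the usage comment are placeholders, since no threshold has been agreed on yet.

```typescript
// Hypothetical sketch: warn when a measured event loop delay exceeds a
// threshold, but emit at most one warning per `minIntervalMs`.
interface DelayWarningOptions {
  thresholdMs: number; // delay above which a warning is considered
  minIntervalMs: number; // minimum time between two emitted warnings
}

class RateLimitedDelayWarner {
  private readonly options: DelayWarningOptions;
  private readonly warn: (message: string) => void;
  private lastWarnedAt = -Infinity;
  private suppressed = 0;

  constructor(options: DelayWarningOptions, warn: (message: string) => void) {
    this.options = options;
    this.warn = warn;
  }

  /** Returns true when a warning was actually emitted. */
  check(delayMs: number, nowMs: number = Date.now()): boolean {
    if (delayMs <= this.options.thresholdMs) {
      return false;
    }
    if (nowMs - this.lastWarnedAt < this.options.minIntervalMs) {
      // Over the threshold, but still inside the quiet period: count it only.
      this.suppressed += 1;
      return false;
    }
    const suffix =
      this.suppressed > 0 ? ` (${this.suppressed} similar warnings suppressed)` : "";
    this.warn(
      `Event loop delay of ${delayMs.toFixed(1)}ms exceeded the threshold of ` +
        `${this.options.thresholdMs}ms${suffix}`
    );
    this.lastWarnedAt = nowMs;
    this.suppressed = 0;
    return true;
  }
}

// Usage (placeholder values, not a recommendation):
// const warner = new RateLimitedDelayWarner(
//   { thresholdMs: 350, minIntervalMs: 60_000 },
//   (message) => console.warn(message)
// );
// warner.check(measuredDelayMs);
```

Counting suppressed warnings keeps the log quiet while still showing how often the threshold was crossed between two emitted messages.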

@watson can you provide any recommendations, maybe?

@elasticmachine
Pinging @elastic/kibana-core (Team:Core)

watson commented Apr 6, 2021

Now that we're on a newer Node.js version, we have access to APIs from Node.js core to monitor the event loop:

If I recall correctly, these are the APIs that APM also uses to collect its metrics, so we could just use those, as you also mention.
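
For illustration, one Node.js core API that fits this description is `monitorEventLoopDelay` from `perf_hooks`. Whether it is the exact API meant here is an assumption. The sketch below samples the delay histogram periodically and converts its nanosecond values to milliseconds. The helper names (`summarizeDelay`, `startDelayMonitor`) and the 5 second collection interval are hypothetical.

```typescript
import { monitorEventLoopDelay } from "perf_hooks";

// Minimal shape of the histogram fields this sketch reads.
interface DelayHistogramLike {
  mean: number;
  max: number;
  percentile(p: number): number;
}

interface DelaySummaryMs {
  mean: number;
  p95: number;
  max: number;
}

// The histogram reports its values in nanoseconds.
const NS_PER_MS = 1e6;

function summarizeDelay(histogram: DelayHistogramLike): DelaySummaryMs {
  return {
    mean: histogram.mean / NS_PER_MS,
    p95: histogram.percentile(95) / NS_PER_MS,
    max: histogram.max / NS_PER_MS,
  };
}

// Starts sampling and reports a summary every `collectionIntervalMs`.
// Returns a function that stops the monitor.
function startDelayMonitor(
  onSample: (summary: DelaySummaryMs) => void,
  collectionIntervalMs: number = 5000 // assumed value, for illustration only
): () => void {
  const histogram = monitorEventLoopDelay({ resolution: 10 });
  histogram.enable();

  const timer = setInterval(() => {
    onSample(summarizeDelay(histogram));
    histogram.reset(); // start a fresh window for the next report
  }, collectionIntervalMs);
  timer.unref(); // do not keep the process alive just for monitoring

  return () => {
    clearInterval(timer);
    histogram.disable();
  };
}

// Usage:
// const stop = startDelayMonitor((s) => console.log(`p95=${s.p95.toFixed(1)}ms`));
// ...later: stop();
```

Resetting the histogram after each report means every summary describes only the most recent window, which suits a threshold check better than a lifetime average.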
