Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Metrics UI] Add Metrics Anomaly Alert Type #89244

Merged
merged 39 commits into from
Feb 9, 2021

Conversation

Zacqary
Copy link
Contributor

@Zacqary Zacqary commented Jan 25, 2021

Summary

Closes #74809

Adds the Metrics Anomaly alert type. Also revamps the alert dropdown to allow you to add anomaly alerts (and all alert types) from either the Inventory page or Metrics Explorer.

Alert UI

Screen Shot 2021-01-27 at 10 31 27 AM

Influencer Filter Autocomplete

Screen Shot 2021-01-27 at 4 31 56 PM

Dropdown UI

Screen Shot 2021-01-26 at 2 02 44 PM

Screen Shot 2021-01-26 at 2 02 39 PM

Screen Shot 2021-01-26 at 2 02 35 PM

How To Test

Alert Type

  1. Generate anomalies with Slingshot using this guide
  2. Open the Create Alert flyout for anomaly alerts, and use the Alert Preview dialog to make sure the alert would have fired over the past hour/day, depending on how easily you can get anomalies to generate.
  3. Create an anomaly alert. Then, generate a new anomaly using Slingshot. Ensure it fires.
  4. Try 2. and 3. with influencer filters, and ensure the alert only fires if you've filtered to a matching influencer. Follow the gist instructions on causing anomalies only on a few pods to create anomalies.
  5. Make sure the influencer filter works both with an exact match and with a wildcard:

Screen Shot 2021-01-26 at 3 21 40 PM

Screen Shot 2021-01-26 at 3 21 33 PM

UI

  1. Stress-test the Influencer Filter autocomplete to make sure it behaves in a way that makes sense.

Checklist

Delete any items that are not applicable to this PR.

@Zacqary Zacqary added release_note:enhancement Feature:Metrics UI Metrics UI feature v8.0.0 Team:Infra Monitoring UI - DEPRECATED DEPRECATED - Label for the Infra Monitoring UI team. Use Team:obs-ux-infra_services v7.12.0 labels Jan 25, 2021
@Zacqary Zacqary requested a review from a team as a code owner January 25, 2021 22:39
@elasticmachine
Copy link
Contributor

Pinging @elastic/logs-metrics-ui (Team:logs-metrics-ui)

@Zacqary
Copy link
Contributor Author

Zacqary commented Jan 26, 2021

@sorantis Should this be called "Metrics Anomaly" or "Infrastructure Anomaly"? Docs call inventory alerts "Infrastructure Threshold" (https://www.elastic.co/guide/en/observability/master/infrastructure-threshold-alert.html) so I'm wondering if we should shift towards that naming convention

@Zacqary
Copy link
Contributor Author

Zacqary commented Jan 26, 2021

@katefarrar Could use some feedback on this influencer filter component.

By default, it looks like:
Screen Shot 2021-01-26 at 3 24 23 PM

  • Is it clear that you can enter a possible field value of kubernetes.pod.uid in this field to limit the alert scope just to a certain pod?
  • Is it clear that when the field is empty and the placeholder says (Any) that that means the alert scope won't be limited at all?

Screen Shot 2021-01-26 at 3 21 40 PM

  • The user can either enter an exact search string, or use * as a wildcard. Is it okay that we don't explicitly state that wildcards can be used (i.e. Elastic users should expect they can use wildcards anywhere), or should we change/add some text to make this clear? (We can mention wildcards in the docs, for sure, but is that enough?)

@sorantis
Copy link

Infrastructure anomaly SGTM

@katefarrar
Copy link
Contributor

katefarrar commented Jan 27, 2021

Talked with @Zacqary and we're going to implement the following changes:
anomaly influencer

  • Add a checkbox with Filter by node and the filter component underneath
  • Change placeholder text from Any to Everything and have that search component auto-fill as you type (similar to the "Create alert per" option on a threshold alert) Screen Shot 2021-01-27 at 12 48 21 PM
  • Update helper text to indicate that a wildcard search can be used: For example: "kubernetes-pod-id" or "kubernetes-pod-id-a*" (Zacqary is going to figure out the final text)

@Zacqary
Copy link
Contributor Author

Zacqary commented Feb 2, 2021

@simianhacker Didn't realize the alert preview wasn't an accurate reflection of how the alert would perform. startTimes only get recorded on multiples of 15 minutes, even if the anomaly is detected mid-bucket, so an alert interval of 1 minute does indeed miss those.

I'm pushing a fix, but it's unable to prevent firing the alert up to 15 times per anomaly. I think this alert type will greatly benefit from not changing the default Notify only on status change setting.

With that in mind, this alert type probably really wants us to merge #89939

# Conflicts:
#	x-pack/plugins/infra/public/alerting/inventory/components/alert_dropdown.tsx
#	x-pack/plugins/infra/public/alerting/metric_threshold/components/alert_dropdown.tsx
#	x-pack/plugins/infra/public/components/logging/log_analysis_setup/subscription_splash_content.tsx
@Zacqary
Copy link
Contributor Author

Zacqary commented Feb 5, 2021

@elasticmachine merge upstream

Copy link
Member

@simianhacker simianhacker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just tested my scenario and everything works as expected.

LGTM

# Conflicts:
#	x-pack/plugins/infra/public/types.ts
#	x-pack/plugins/infra/server/lib/infra_ml/metrics_hosts_anomalies.ts
#	x-pack/plugins/infra/server/lib/infra_ml/metrics_k8s_anomalies.ts
#	x-pack/plugins/infra/server/lib/infra_ml/queries/metrics_hosts_anomalies.ts
#	x-pack/plugins/infra/server/lib/infra_ml/queries/metrics_k8s_anomalies.ts
@Zacqary Zacqary enabled auto-merge (squash) February 9, 2021 21:16
@kibanamachine
Copy link
Contributor

💚 Build Succeeded

Metrics [docs]

Module Count

Fewer modules leads to a faster build time

id before after diff
infra 1046 1050 +4

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

id before after diff
infra 1.9MB 1.9MB +15.6KB

Page load bundle

Size of the bundles that are downloaded on every page load. Target size is below 100kb

id before after diff
infra 172.6KB 188.2KB +15.6KB
Unknown metric groups

async chunk count

id before after diff
infra 13 14 +1

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

@Zacqary Zacqary merged commit 0d94968 into elastic:master Feb 9, 2021
spalger added a commit that referenced this pull request Feb 10, 2021
@spalger
Copy link
Contributor

spalger commented Feb 10, 2021

Sorry @Zacqary but once this was merged it started failing master builds in a critical way so I needed to revert it https://kibana-ci.elastic.co/job/elastic+kibana+baseline-capture/13507/execution/node/155/log/

Please resubmit in a new PR with master merged. Thanks!

Zacqary added a commit to Zacqary/kibana that referenced this pull request Feb 10, 2021
Zacqary added a commit to Zacqary/kibana that referenced this pull request Feb 10, 2021
Co-authored-by: Kibana Machine <42973632+kibanamachine@users.noreply.github.com>
# Conflicts:
#	x-pack/plugins/infra/public/types.ts
Zacqary added a commit that referenced this pull request Feb 10, 2021
#90889)

* Revert "Revert "[Metrics UI] Add Metrics Anomaly Alert Type (#89244)""

This reverts commit 8166bec.

* Fix type error
gmmorris added a commit to gmmorris/kibana that referenced this pull request Feb 10, 2021
* master: (99 commits)
  [Fleet] Use Fleet Server indices in the search bar (elastic#90835)
  [Search Sessions] added an info flyout to session management (elastic#90559)
  [ILM] Revisit searchable snapshot field after new redesign (elastic#90793)
  [Alerting] License Errors on Alert List View (elastic#89920)
  RFC Improve saved object migrations algorithm (elastic#84333)
  [Lens] (Accessibility) Fix focus on drag and drop actions (elastic#90561)
  Use new shortcut links to Fleet discuss forums. (elastic#90786)
  Do not generate an ephemeral encryption key in production. (elastic#81511)
  [Fleet] Use staging registry for snapshot builds (elastic#90327)
  Actually deleting x-pack/tsconfig.refs.json (elastic#90898)
  Add deprecation warning to all Beats CM pages. (elastic#90741)
  skip flaky suite (elastic#90136)
  Revert "Revert "[Metrics UI] Add Metrics Anomaly Alert Type (elastic#89244)"" (elastic#90889)
  remove ref to removed tsconfig file
  [core.logging] Uses host timezone as default (elastic#90368)
  [Maps] remove maps_file_upload plugin and fold public folder into file_upload plugin (elastic#90292)
  Revert "[Metrics UI] Add Metrics Anomaly Alert Type (elastic#89244)"
  [dev-utils/ci-stats] support disabling ship errors (elastic#90851)
  Prefix with / (elastic#90836)
  [Metrics UI] Add Metrics Anomaly Alert Type (elastic#89244)
  ...
Zacqary added a commit that referenced this pull request Feb 10, 2021
Co-authored-by: Kibana Machine <42973632+kibanamachine@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Feature:Metrics UI Metrics UI feature release_note:enhancement reverted Team:Infra Monitoring UI - DEPRECATED DEPRECATED - Label for the Infra Monitoring UI team. Use Team:obs-ux-infra_services v7.12.0 v8.0.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Metrics UI] Create new Inventory Anomaly alert
8 participants