[Alerting] Active alerts do not recover after re-enabling a rule #111671

YulNaumenko · 2021-09-09T04:30:50Z

Summary

Partially resolves #110080
After several discussions about how the alerting framework should resolve alert instances when a rule is disabled, we decided to move forward with the recovery option for the short term. The solution does not trigger resolve-on-recovery for disabled alerts that are set up to resolve incidents on recovery.
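
The behavior described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `ActiveInstance`, `RecoveredEvent`, `DisableResult`, and `recoverOnDisable` are hypothetical names. The only details taken from the discussion are that a `recovered-instance` event log doc is written for each active instance when the rule is disabled, and that recovery actions are not triggered.

```typescript
// Hypothetical shapes for illustration only; not the actual Kibana types.
interface ActiveInstance {
  id: string;
  actionGroup: string;
}

interface RecoveredEvent {
  action: 'recovered-instance';
  instanceId: string;
  ruleId: string;
  timestamp: string;
}

interface DisableResult {
  events: RecoveredEvent[];
  scheduledActions: string[];
}

// On disable, mark every active instance as recovered in the event log,
// but schedule no recovery actions (so no incident is resolved).
function recoverOnDisable(
  ruleId: string,
  active: ActiveInstance[],
  now: Date = new Date()
): DisableResult {
  const events: RecoveredEvent[] = active.map((instance) => ({
    action: 'recovered-instance',
    instanceId: instance.id,
    ruleId,
    timestamp: now.toISOString(),
  }));
  // Deliberately empty: disabling a rule should not trigger resolve-on-recovery.
  return { events, scheduledActions: [] };
}

const disableResult = recoverOnDisable(
  'rule-1',
  [
    { id: 'host-a', actionGroup: 'default' },
    { id: 'host-b', actionGroup: 'default' },
  ],
  new Date(0)
);
console.log(disableResult.events.map((e) => e.instanceId));
```

Keeping the list of scheduled actions separate from the events makes the "recover but do not notify" decision explicit and easy to assert in a test.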

Checklist

Documentation was added for features that require explanation or tutorials
Unit or functional tests were updated or added to match the most common scenarios


YulNaumenko · 2021-10-13T18:41:07Z

@elasticmachine merge upstream

elasticmachine · 2021-10-13T23:58:05Z

Pinging @elastic/kibana-alerting-services (Team:Alerting Services)

YulNaumenko · 2021-10-14T03:51:12Z

@pmuellr do you think the additional event log flag could be introduced as a separate issue, or can it be added within the current scope?

ymao1

Looks great! I verified this works as expected by:

1. creating a rule that will have active alerts and letting it run
2. disabling the rule
3. verifying that event log docs for recovered-instance are written for each active instance
4. editing the rule so that the condition will no longer be met
5. enabling the rule and seeing Recovered in the Rule Details view

I also tried:

1. creating a rule that will have active alerts and letting it run
2. disabling the rule
3. verifying that event log docs for recovered-instance are written for each active instance
4. enabling the rule (without editing it); I see Recovered in the Rule Details view, but the alerts quickly change back to Active

I guess in the second scenario if the rule takes a while to execute, you would see Recovered in the UI for the execution duration, which may be a little confusing? But I think that is ok.

Can we add functional tests for this?


ymao1

LGTM! Works as expected


pmuellr · 2021-10-14T18:54:39Z

> do you think the additional event log flag could be introduced as a separate issue, or can it be added within the current scope?

I assume you mean a new property in the event log mappings? This would be something we add to the recovered events to indicate that they are "recovered" because the rule was disabled? We'd have to come up with a name and location for it in the doc, etc. Something like kibana.alerting.recovered_because_disabled doesn't feel right; maybe we need a recovered_reason field or something? It feels like we need some design work to figure out the right way to indicate this in the event log.

One thing we could do short-term to make these mostly searchable in the event log would be to update the message field for recovered events. For instance, add some "easily searchable via KQL" text to the message, like "recovered due to alert being disabled" or similar.
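
A rough sketch of the short-term idea, under stated assumptions: the marker text is the example phrase from the comment above, `recoveredMessage` and `kqlForDisabledRecoveries` are hypothetical helpers, and the `event.action` and `message` field names are assumed to be how the event log doc exposes these values.

```typescript
// Marker phrase taken from the comment above; the final wording is undecided.
const DISABLED_RECOVERY_MARKER = 'recovered due to alert being disabled';

// Hypothetical helper: builds the message for a recovered event and appends
// the searchable marker only when the recovery was caused by a disable.
function recoveredMessage(
  ruleName: string,
  instanceId: string,
  becauseDisabled: boolean
): string {
  const base = `instance '${instanceId}' of rule '${ruleName}' has recovered`;
  return becauseDisabled ? `${base} (${DISABLED_RECOVERY_MARKER})` : base;
}

// Hypothetical helper: a KQL phrase query that would match those events,
// assuming the field names event.action and message.
function kqlForDisabledRecoveries(): string {
  return `event.action: "recovered-instance" and message: "${DISABLED_RECOVERY_MARKER}"`;
}

console.log(recoveredMessage('cpu-usage', 'host-a', true));
console.log(kqlForDisabledRecoveries());
```

Sharing one constant between the writer and the query avoids the two drifting apart, which is the main risk of relying on free-text matching instead of a dedicated field.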

pmuellr

LGTM; love the new createAlertEventLogRecordObject() so we can have a single place (or at least fewer places) to update when we add new fields ...
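
To illustrate why a single builder helps, here is a minimal sketch of the pattern. It does not reproduce the real `createAlertEventLogRecordObject()` signature: `buildEventLogRecord`, `RecordParams`, and the `EventLogRecord` shape (including the `alert` saved object type) are assumptions made for this example.

```typescript
// Hypothetical record shape, loosely modeled on an ECS-style event log doc.
interface EventLogRecord {
  '@timestamp': string;
  event: { action: string };
  kibana: {
    alerting: { instance_id?: string };
    saved_objects: Array<{ type: string; id: string }>;
  };
  message?: string;
}

// Hypothetical parameters; a new field would be added here and nowhere else.
interface RecordParams {
  ruleId: string;
  action: string;
  instanceId?: string;
  message?: string;
  timestamp?: Date;
}

// Single place where event log records are assembled.
function buildEventLogRecord(params: RecordParams): EventLogRecord {
  const { ruleId, action, instanceId, message, timestamp = new Date() } = params;
  return {
    '@timestamp': timestamp.toISOString(),
    event: { action },
    kibana: {
      alerting: instanceId !== undefined ? { instance_id: instanceId } : {},
      saved_objects: [{ type: 'alert', id: ruleId }],
    },
    // Only include the message key when a message was provided.
    ...(message !== undefined ? { message } : {}),
  };
}

// Two different call sites share the same builder.
const executeRecord = buildEventLogRecord({ ruleId: 'rule-1', action: 'execute' });
const recoveredRecord = buildEventLogRecord({
  ruleId: 'rule-1',
  action: 'recovered-instance',
  instanceId: 'host-a',
  message: "instance 'host-a' has recovered",
});
console.log(executeRecord.event.action, recoveredRecord.kibana.alerting.instance_id);
```

With this layout, adding something like a recovery reason would mean extending `RecordParams` and the builder once, rather than touching every place that writes an event.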

YulNaumenko · 2021-10-15T20:37:34Z

@elasticmachine merge upstream

YulNaumenko · 2021-10-17T19:54:24Z

@elasticmachine merge upstream

YulNaumenko · 2021-10-18T00:18:32Z

@elasticmachine merge upstream

kibanamachine · 2021-10-18T02:44:03Z

💚 Build Succeeded


cc @YulNaumenko


[Alerting] Active alerts do not recover after re-enabling a rule

927bf7c

YulNaumenko self-assigned this Sep 9, 2021

pmuellr mentioned this pull request Sep 10, 2021

Active alerts do not recover after re-enabling a rule #110080

Closed

pmuellr reviewed Sep 29, 2021

x-pack/plugins/alerting/server/rules_client/rules_client.ts (outdated, resolved)

pmuellr reviewed Sep 29, 2021

x-pack/plugins/alerting/server/rules_client/rules_client.ts (outdated, resolved)

kibanamachine and others added 2 commits October 13, 2021 14:41

Merge branch 'master' into alerting-recover-instances-after-disable

5166063

created reusable lib file for generating event log object

ef7ea30

YulNaumenko marked this pull request as ready for review October 13, 2021 23:58

YulNaumenko requested a review from a team as a code owner October 13, 2021 23:58

YulNaumenko and others added 3 commits October 13, 2021 17:03

comment fix

c516cd1

fixed tests

0929cd2

fixed tests

0f46f7a

YulNaumenko requested a review from pmuellr October 14, 2021 03:44

ymao1 reviewed Oct 14, 2021


YulNaumenko and others added 6 commits October 14, 2021 09:49

fixed typecheck

7c02a57

fixed due to comments

45ecc80

Apply suggestions from code review

f04e68c

Co-authored-by: ymao1 <ying.mao@elastic.co>

fixed due to comments

e3390a0

fixed due to comments

d5d6d02

fixed due to comments

ef343d6

YulNaumenko requested a review from ymao1 October 14, 2021 18:23

ymao1 approved these changes Oct 14, 2021

x-pack/plugins/alerting/server/lib/create_alert_event_log_record_object.test.ts (outdated, resolved)

fixed due to comments

cbcd252

pmuellr approved these changes Oct 14, 2021


kibanamachine and others added 2 commits October 15, 2021 16:37

Merge branch 'master' into alerting-recover-instances-after-disable

61fcec2

fixed tests

189b1cd

kibanamachine and others added 3 commits October 17, 2021 15:54

Merge branch 'master' into alerting-recover-instances-after-disable

6e37109

Update disable.ts

0bcf3cb

Update disable.ts

eca1240

Merge branch 'master' into alerting-recover-instances-after-disable

f5d6cbe

YulNaumenko merged commit 84df569 into elastic:master Oct 18, 2021

YulNaumenko mentioned this pull request Oct 18, 2021

[7.x] [Alerting] Active alerts do not recover after re-enabling a rule (#111671) #115443

Merged

YulNaumenko added a commit that referenced this pull request Oct 18, 2021

[7.x] [Alerting] Active alerts do not recover after re-enabling a rule (#111671) (#115443)

* fixed merge
* fixed merge

21f3eb7

ymao1 mentioned this pull request Nov 9, 2021

[Alerting] Gracefully handle errors when retrieving task document on rule disable #118024

Closed

pmuellr mentioned this pull request Jan 31, 2022

[ResponseOps] muted alert still fires when using notifyWhen: onActionGroupChange #124170

Closed
