Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Alerting] Show execution duration on Rule Details view #114719

Merged
merged 20 commits into from
Oct 14, 2021

Conversation

ymao1
Copy link
Contributor

@ymao1 ymao1 commented Oct 12, 2021

Resolves #114616

Summary

Added summary of execution durations to Rule Details view, with warning when the average execution duration exceeds the rule timeout

Rule with status: active and 30 most recent execution durations with average
Screen Shot 2021-10-14 at 12 06 52 PM

Rule where last execution ended in error
Screen Shot 2021-10-14 at 12 07 40 PM

Rule where no execution data is available yet
Screen Shot 2021-10-14 at 12 09 15 PM

Rule where average execution duration exceeds rule timeout
Screen Shot 2021-10-14 at 12 17 49 PM

Rule where execution duration suddenly spikes
Screen Shot 2021-10-14 at 12 09 44 PM

Checklist

Delete any items that are not applicable to this PR.

@ymao1 ymao1 changed the title Alerting/show durations rule details [Alerting] Show execution duration on Rule Details view Oct 14, 2021
@ymao1 ymao1 self-assigned this Oct 14, 2021
@ymao1 ymao1 added Feature:Alerting/RulesFramework Issues related to the Alerting Rules Framework release_note:enhancement Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams) v7.16.0 v8.0.0 labels Oct 14, 2021
@ymao1 ymao1 marked this pull request as ready for review October 14, 2021 13:38
@ymao1 ymao1 requested review from a team as code owners October 14, 2021 13:38
@elasticmachine
Copy link
Contributor

Pinging @elastic/kibana-alerting-services (Team:Alerting Services)

@ymao1 ymao1 requested a review from mdefazio October 14, 2021 13:38
@@ -37,6 +66,8 @@ type AlertInstancesProps = {
durationEpoch?: number;
} & Pick<AlertApis, 'muteAlertInstance' | 'unmuteAlertInstance'>;

const DESIRED_NUM_EXECUTION_DURATIONS = 30;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are piggybacking on the existing event log query to retrieve the execution duration information. The existing query looks back 60*schedule interval, so due to scheduling drift, we actually get 50-ish to 60 executions. In order to maintain a more consistent chart experience, we're truncating to 30. In the future, we can think about performing a separate query specifically for execution durations (and using aggregations for the avg) and when we do that, we would be able to specify exactly the number of executions to show.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've opened this issue separate out the query for execution durations in the future, which would allow us to show a dropdown for users to select the number they want to see.

Copy link
Contributor

@mdefazio mdefazio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have some feedback on title sizes and flexbox widths. Happy to chat about these if i'm not clear on my comments—hopefully put them in the right spot :)

<EuiHealth
data-test-subj={`ruleStatus-${alert.executionStatus.status}`}
textSize="inherit"
color={healthColor}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like in the screenshot that the 'health' color doesn't match what is shown in the table. (Looks like primary in the table and healthy here).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This health color relates to the rule execution and the color in the table relates to whether the alert for the rule is active or recovered. They are two different things.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it is confusing that they both say active and they are different colors.....but this color lines up with the color on the Rule Management List view

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Screen Shot 2021-10-14 at 11 54 15 AM

@ymao1 ymao1 requested a review from mdefazio October 14, 2021 16:18
@ymao1
Copy link
Contributor Author

ymao1 commented Oct 14, 2021

@mdefazio Updated and updated the screenshots as well

Copy link
Contributor

@chrisronline chrisronline left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! One small nit but great work here!

Copy link
Contributor

@mdefazio mdefazio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for making these last minute edits!

Copy link
Contributor

@YulNaumenko YulNaumenko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I think it would be useful to expose the charts functionality as a separate (maybe reusable) component, but this could be done in the follow up issue. Don't want to block you with this suggestion.

@ymao1 ymao1 added the auto-backport Deprecated - use backport:version if exact versions are needed label Oct 14, 2021
@kibanamachine
Copy link
Contributor

💚 Build Succeeded

Metrics [docs]

Module Count

Fewer modules leads to a faster build time

id before after diff
triggersActionsUi 376 378 +2

Public APIs missing comments

Total count of every public API that lacks a comment. Target amount is 0. Run node scripts/build_api_docs --plugin [yourplugin] --stats comments for more detailed information.

id before after diff
alerting 249 250 +1

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

id before after diff
triggersActionsUi 787.4KB 793.5KB +6.2KB

Page load bundle

Size of the bundles that are downloaded on every page load. Target size is below 100kb

id before after diff
triggersActionsUi 50.6KB 50.8KB +119.0B
Unknown metric groups

API count

id before after diff
alerting 257 258 +1

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

cc @ymao1

@kibanamachine
Copy link
Contributor

💚 Backport successful

Status Branch Result
7.x

This backport PR will be merged automatically after passing CI.

kibanamachine pushed a commit to kibanamachine/kibana that referenced this pull request Oct 14, 2021
* Adding execution duration to get alert instance summary

* Showing execution duration on rule details view

* Fixing unit tests

* Updating to match new mockup

* Fixing types

* Fixing functional test

* Removing unneeded max and min and adding tests

* Removing unneeded max and min and adding tests

* Fixing functional test

* Adding left axis

* PR feedback

* Reducing chart height

* PR feedback

* PR feedback

* PR feedback
kibanamachine added a commit that referenced this pull request Oct 14, 2021
…15092)

* Adding execution duration to get alert instance summary

* Showing execution duration on rule details view

* Fixing unit tests

* Updating to match new mockup

* Fixing types

* Fixing functional test

* Removing unneeded max and min and adding tests

* Removing unneeded max and min and adding tests

* Fixing functional test

* Adding left axis

* PR feedback

* Reducing chart height

* PR feedback

* PR feedback

* PR feedback

Co-authored-by: ymao1 <ying.mao@elastic.co>
artem-shelkovnikov pushed a commit to artem-shelkovnikov/kibana that referenced this pull request Oct 20, 2021
* Adding execution duration to get alert instance summary

* Showing execution duration on rule details view

* Fixing unit tests

* Updating to match new mockup

* Fixing types

* Fixing functional test

* Removing unneeded max and min and adding tests

* Removing unneeded max and min and adding tests

* Fixing functional test

* Adding left axis

* PR feedback

* Reducing chart height

* PR feedback

* PR feedback

* PR feedback
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
auto-backport Deprecated - use backport:version if exact versions are needed Feature:Alerting/RulesFramework Issues related to the Alerting Rules Framework release_note:enhancement Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams) v7.16.0 v8.0.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Alerting] Surface rule execution durations in rule details
6 participants