[UI] Alert Details page alert instances #55424

Closed
gmmorris opened this issue Jan 21, 2020 · 5 comments · Fixed by #56842
Labels: Team:ResponseOps, v7.7.0

Comments

@gmmorris
Contributor

gmmorris commented Jan 21, 2020

3rd part of the Alert Details page issue, awaiting the event log API.

This issue adds a list of Alert Instances within the Date Range on the Alert Details page.

1st part: #51546
2nd part: #56280

@mikecote
Contributor

mikecote commented Jan 29, 2020

It was discussed today with @mdefazio @peterschretlen @pmuellr @YulNaumenko that we should aim for the following in 7.7:

  • Display the following columns from the mockups: "Instance", "Status", "Start" (when the alert went off), "Duration" (now minus the time it started) and "Actions"
  • We would pull the active alerts from the state stored in task manager
  • We would pull the muted alert ids from the alert (I believe it's something like mutedInstanceIds)
  • We have 3 statuses that we can display based on the data we have: muted, active, active and muted
  • Muted instances will only have "Instance", "Status" and "Actions" column populated
  • We're ok with muted instances disappearing from the table after a user unmutes the alert
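The rules in the list above could be sketched roughly as follows. This is only an illustration: the `ActiveInstance` and `InstanceRow` shapes and the `buildInstanceRows` helper are hypothetical names, and the real structure of the task manager state and of the alert's muted ids may differ.

```typescript
// Hypothetical shapes: the thread does not specify the real task manager
// state or alert types, so everything here is an assumption.
interface ActiveInstance {
  id: string;
  startedAt: Date; // when the instance went off
}

type InstanceStatus = "active" | "muted" | "active and muted";

interface InstanceRow {
  instance: string;
  status: InstanceStatus;
  start?: Date; // left empty for muted-only instances
  durationMs?: number; // now minus start, left empty for muted-only instances
}

function buildInstanceRows(
  activeInstances: ActiveInstance[],
  mutedInstanceIds: string[],
  now: Date = new Date()
): InstanceRow[] {
  const muted = new Set(mutedInstanceIds);
  const activeIds = new Set(activeInstances.map((i) => i.id));

  // Active instances get every column; status depends on whether they are muted.
  const activeRows = activeInstances.map(
    (i): InstanceRow => ({
      instance: i.id,
      status: muted.has(i.id) ? "active and muted" : "active",
      start: i.startedAt,
      durationMs: now.getTime() - i.startedAt.getTime(),
    })
  );

  // Muted ids with no active instance only populate "Instance" and "Status".
  const mutedOnlyRows = mutedInstanceIds
    .filter((id) => !activeIds.has(id))
    .map((id): InstanceRow => ({ instance: id, status: "muted" }));

  return [...activeRows, ...mutedOnlyRows];
}
```

Because the muted-only rows are derived purely from the muted ids, an instance that is unmuted and not active simply drops out of the result, which matches the last bullet.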

cc @alexfrancoeur

@Bargs added the Team:ResponseOps label Jan 29, 2020
@elasticmachine
Contributor

Pinging @elastic/kibana-alerting-services (Team:Alerting Services)

@alexfrancoeur

Apologies for not being able to stay on for the second half of the design sync. After speaking with @gmmorris I had some thoughts around ideas that are being considered and, from the looks of it, some were discussed today.

We would pull the active alerts from the state stored in task manager
We would pull the muted alert ids from the alert (I believe it's something like mutedInstanceIds)

I believe this may have been brought up earlier, but this has the potential to make for an odd experience. If I receive an alert and am unable to look at it right away, it's possible that clicking into the alert details link will be empty, likely causing a bit of confusion.

While maybe no longer targeted for 7.7, I have heard talks of having the chart shown when creating an alert also being used for the alert details view. It's an interesting idea, but what does this look like for an alert that isn't solely based off an Elasticsearch query? We wouldn't be able to show relevant results if there is custom logic built within the alert that isn't reflected in Elasticsearch. Do we just not show a chart? It feels like we need a way to work in alert instances as part of the time series chart. These could be overlaid annotations. It would also be interesting to have a quick option to split by or filter on alert instance to see how specific alert instances have been matching the criteria.

I've spoken briefly about the complete workflow, and will add comments to this discuss issue #56298, but I would expect after clicking into an alert details link from a notification to be brought to the alert details view from within the timeframe of that notification. Maybe looking back 4x the interval defined from when the notification triggered or something along those lines. That way, there is some context around the alert itself.

To me, a large value of this page is around the history associated with the alert. I wonder if it would be possible to have Gidi and Patrick work together on an event history API that would support the functionality in the alert details page? If we don't have this, we risk a poor UX and limited functionality. In some cases, you may need to react to the alert immediately to better understand what happened. And in a real world scenario, that may not always be the case. I'm all for progress over perfection, but am wary about providing a confusing experience to our early adopters.

I'm sure a lot of this has already been brought up, but I'm interested in hearing the group's thoughts.

@pmuellr
Member

pmuellr commented Jan 30, 2020

it's possible that clicking into the alert details link will be empty, likely causing a bit of confusion.

Ya, and that you could be looking at an instance, refresh the page somehow, and it will be gone. The plan is to use the eventLog to get historic data on the alert, which would show when instances were "triggered".

what does this [the create alert chart used in the alert details page] look like for an alert that isn't solely based off an Elasticsearch query?

TBD, but probably nothing we can show. Ideally we'd allow an alert type to determine how to render some kind of visualization. We should be showing a time-series of when alerts were triggered/etc, so if they have time-series data - awesome. Working through some of this with the index threshold alert type right now ...

It would also be interesting to have a quick option to split by or filter on alert instance to see how specific alert instances have been matching the criteria.

Interesting. I think the implication here is that the alert instance values (eg, a hostname for uptime) will need to be searchable in the eventLog. Will need to find a nice home for them in ECS ...

I would expect after clicking into an alert details link from a notification to be brought to the alert details view from within the timeframe of that notification. Maybe looking back 4x the interval defined from when the notification triggered or something along those lines. That way, there is some context around the alert itself.

This seems doable. Probably want to see context AFTER the notification was triggered as well. I'm building a replacement for the watcher API we use to generate the visualization in the create alert UI. It's fairly obvious that this same "generate viz data" function can be parameterized with a start/stop date, and reused in the details page to show something different from an end date of "now".
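As a rough sketch of that start/stop parameterization, the window around a notification might be computed like this. The `notificationWindow` helper and its parameters are hypothetical; the 4x lookback comes from the suggestion quoted above, while the lookahead of 4 intervals is purely an assumption for illustration.

```typescript
// Hypothetical helper: none of these names come from Kibana itself.
interface TimeWindow {
  start: Date;
  end: Date;
}

function notificationWindow(
  triggeredAt: Date, // when the notification fired
  intervalMs: number, // the alert's check interval, in milliseconds
  now: Date = new Date(),
  lookbackIntervals = 4, // "4x the interval" from the suggestion above
  lookaheadIntervals = 4 // assumed value, to show context AFTER the trigger
): TimeWindow {
  const start = new Date(triggeredAt.getTime() - lookbackIntervals * intervalMs);
  const desiredEnd = triggeredAt.getTime() + lookaheadIntervals * intervalMs;
  // Never ask for data from the future: clamp the end of the window to "now".
  const end = new Date(Math.min(desiredEnd, now.getTime()));
  return { start, end };
}
```

The resulting `start` and `end` could then be handed to whatever function generates the visualization data, in place of an end date of "now".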

To me, a large value of this page is around the history associated with the alert. I wonder if it would be possible to have Gidi and Patrick work together on an event history API that would support the functionality in the alert details page?

The eventLog was merged into master last week, in a limited form, with a single use in action execution. There are a few blocking issues we need to work through to enable writing the eventLog entries into ES, currently they're tossed into the bit-bucket, by default. You can enable it now, if you want, but it's not ready for prime-time in terms of ILM support/dealing with Kibana version migrations/etc. Here are the current open issues on eventLog:

https://github.com/elastic/kibana/projects/26?card_filter_query=%5Balerting+event+log%5D

@gmmorris
Contributor Author

gmmorris commented Jan 31, 2020

The eventLog was merged into master last week, in a limited form, with a single use in action execution. There are a few blocking issues we need to work through to enable writing the eventLog entries into ES, currently they're tossed into the bit-bucket, by default. You can enable it now, if you want, but it's not ready for prime-time in terms of ILM support/dealing with Kibana version migrations/etc. Here are the current open issues on eventLog:

https://github.com/elastic/kibana/projects/26?card_filter_query=%5Balerting+event+log%5D

This might be a mistake on my part - I thought there was still work needed to make it available via API to the UI?

Edit:
Here is what I thought we're waiting for: #55633

@kobelb added the needs-team label Jan 31, 2022
@botelastic removed the needs-team label Jan 31, 2022