Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TSVB rule type investigation / POC #104270

Closed
jasonrhodes opened this issue Jul 4, 2021 · 16 comments
Closed

TSVB rule type investigation / POC #104270

jasonrhodes opened this issue Jul 4, 2021 · 16 comments
Labels
R&D Research and development ticket (not meant to produce code, but to make a decision) Team:Infra Monitoring UI - DEPRECATED DEPRECATED - Label for the Infra Monitoring UI team. Use Team:obs-ux-infra_services

Comments

@jasonrhodes
Copy link
Member

jasonrhodes commented Jul 4, 2021

Investigate and implement a TSVB-backed rule type for mathematical expression-based conditions.

Users will be able to choose (and search) from a list of TSVB visualizations. Once they pick the visualization they can set thresholds per series. If the series has a terms aggregation, the threshold will be run on each split that's returned, similar to how group by works in the Metrics Threshold alerts. The interval for the visualization will be determined by FOR THE LAST x setting and the time range will be 5x the interval. For example, if the user picks FOR THE LAST 5 minutes, the time range will be Last 25 minutes.

We are using the last 5 buckets of the interval for the time range in Metrics Threshold and Inventory Threshold due to the way date_histogram works with extended_bounds. We need to ensure the last bucket is a whole bucket. Also with rates, the first bucket is often null and 5 buckets worth of data guarantees we get at least 3-4 whole buckets.

image

@jasonrhodes jasonrhodes added Team:Infra Monitoring UI - DEPRECATED DEPRECATED - Label for the Infra Monitoring UI team. Use Team:obs-ux-infra_services v7.15.0 labels Jul 4, 2021
@elasticmachine
Copy link
Contributor

Pinging @elastic/logs-metrics-ui (Team:logs-metrics-ui)

@jasonrhodes
Copy link
Member Author

@simianhacker if you can flesh this ticket description out with your rough idea/ideas before you dive into working on it this week, that'd be a big help. Thanks1

@timroes
Copy link
Contributor

timroes commented Jul 5, 2021

Just a note to that: If the alert is meant to communicate with the TSVB endpoint, please make sure it's using the index pattern mode of the API. We're currently moving TSVB over to only work with index patterns, and will at some point fade out usage of the legacy string based definition of index pattern, thus we should no longer consider any new features to rely on that mode.

@simianhacker
Copy link
Member

@timroes Thanks for the head up, we are not planning on changing any of the TSVB functionality. Just giving users a way to set thresholds on metrics they've configured in TSVB.

@timroes
Copy link
Contributor

timroes commented Jul 8, 2021

Let's please make sure to involve someone from product side on those, to not run into an issue, where that alert might get confused later on with the regular TSVB (all visualization) alert. It sounds to me right now (without having deep understanding of your plans) that those things might becomevery similar to each other.

cc @kmartastic

@kmartastic
Copy link
Contributor

cc @ghudgins

@wylieconlon
Copy link
Contributor

I would be interested in seeing what it means to define a rule from TSVB, specifically how will a user define the alerting criteria? For example, is it "if Y is above 100 for the last 5 buckets"?

Also, FYI Lens now supports a larger set of mathematical "formulas" than what TSVB does, for example month over month. We are blocked on Lens alerting for many reasons including product and technical ones, but you can track the parent issue here: Lens alerting

@simianhacker
Copy link
Member

@wylieconlon I just updated the description with a wireframe which should give you basic idea of how this should work. The user would be able to set a threshold per series. If a series has a terms aggregation the threshold will be evaluated per split. Since TSVB is really popular with the observability crowd, it only makes sense to allow them to create alerts for charts they're already using. From my understanding, there are quite a few requests from the field for this functionality.

@wylieconlon
Copy link
Contributor

Thanks, that helps me understand what you're imagining. It looks like your example is the default TSVB chart type, where it's a single-series line chart, but I'm curious whether you think the same approach can handle all the other TSVB options like "Group by", "Top N" or "Entire time range mode".

@jasonrhodes
Copy link
Member Author

Thanks for all of the comments here, Lens folks. We definitely want to keep you all updated as we progress on this. We unfortunately need to get something into the product sooner than later so we're investigating how this might work here. Please follow along with this ticket for updates on which way(s) we decide to go.

@jasonrhodes
Copy link
Member Author

Thanks for fleshing out the proof of concept here, @simianhacker. One other thing we should think about before we get too far into implementation is how we store this information in the alert so it can be edited later -- for now do we just make the alert read from the visualization dynamically, so that if the viz is updated, it would be affecting the alert? Or do we have a way of serializing the TSVB query somehow so it's de-coupled from the visualization, but then it wouldn't be editable?

@simianhacker
Copy link
Member

@wylieconlon The data behind the different visualization types is the same data, except most of them are just displaying the last bucket; we won't need to handle those differently. For entire time range, we will use the same metric but honor the "FOR THE LAST x" as the time range. "Group By" will be handled the same way we handle it in the Metric Threshold alert, the threshold will be evaluated against each split (or line you would see in the chart).

@simianhacker
Copy link
Member

We might want to visualize the chart similar to our previews in the Metric Threshold chart (bar chart) since it's a better representation of the buckets and how the values change over time. I don't think there is much advantage to using the actual TSVB visualization. The real value is using the metric definitions and data returned from the TSVB query.

@jasonrhodes jasonrhodes added the refined Issue refined, ready to work on label Jul 12, 2021
@jasonrhodes jasonrhodes removed the refined Issue refined, ready to work on label Jul 20, 2021
@jasonrhodes jasonrhodes changed the title TSVB rule type TSVB rule type investigation / POC Jul 20, 2021
@jasonrhodes jasonrhodes added the R&D Research and development ticket (not meant to produce code, but to make a decision) label Jul 20, 2021
@jasonrhodes
Copy link
Member Author

After investigation we decided not to move forward with this option for now.

@jasonrhodes
Copy link
Member Author

Main reasons we didn't move forward:

  • No support for multiple group by in TSVB
  • We didn't find a nice way to embed TSVB builder inside a rule creation UX, and having to create a visualization first, then specify the viz in a rule, felt like a bad UX. What happens if someone changes the viz later? Does it potentially and accidentally break the rule? ETc.

@ghudgins
Copy link

really appreciate the notes. linking up a few relevant issues (and pinging a few since I think this is really useful insight!)

  • [Lens] Allow multi field "top values" #95079 for "multi group by" is hopefully coming in 7.16 or 8.1. it's WIP right now
  • this is a large issue that is in our backlog for vis-level alerts: [Lens] Alerting from Lens Visualizations #71150 that, coupled with above, would hopefully help
  • there's an effort in discover planned in early 8.x that could help on the usability points when they go through design...since vis vs. saved search may look different but logically are pretty similar. should definitely have friendly/usable answers to these sorts of questions

CC @flash1293 / @AlonaNadler / @shaunmcgough / @ryankeairns

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
R&D Research and development ticket (not meant to produce code, but to make a decision) Team:Infra Monitoring UI - DEPRECATED DEPRECATED - Label for the Infra Monitoring UI team. Use Team:obs-ux-infra_services
Projects
None yet
Development

No branches or pull requests

8 participants