[Infra UI Meta Issue] Improve UI Field Selection for Metricbeat #40277

simianhacker · 2019-04-05T16:32:10Z

Problem

The fields returned from the _field_caps API for Metricbeat indices include over 2000 fields, since every possible field is present in the index mapping. The current approach in the UI is to present the user with a "combo box" that lets them narrow down the list by searching. This requires the user to have intimate knowledge of the Metricbeat fields. There is no Elasticsearch-native way to filter this list down to only the fields with actual data.

Possible Solutions

1. Create an aggregation that paginates through all the fields (100 at a time) and calculates the cardinality of each field for the time range the user is viewing. This is very costly and could take several seconds to complete.
2. Create parallel count requests (100 at a time) to check if each field exists in the current time range. Initial attempts have proven to be expensive as well.
3. Filter the list of fields using event.dataset or metricset.module as a required prefix. This would require an aggregation to be run on the data, but the potential cardinality of metricset.module is relatively low. We would also need to keep a whitelist of prefixes for things like host and cloud. The downside is that any field we don't recognize as having an "official" prefix would be filtered out; this would apply to user-defined fields.
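To make options 1 and 3 more concrete, here is a minimal sketch of what they might look like. The helper names (`chunk`, `buildCardinalityBody`, `filterByPrefix`) are hypothetical and not part of Kibana; the request body only uses the standard `range` query and `cardinality` aggregation, and `@timestamp` is assumed to be the time field.

```typescript
// Hypothetical helpers for illustration; these are not Kibana APIs.

/** Split a list into batches of `size` items (the options above work 100 at a time). */
function chunk<T>(items: T[], size: number): T[][] {
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

/** Option 1: one search body with a cardinality aggregation per field in the batch. */
function buildCardinalityBody(fields: string[], gte: string, lte: string) {
  const aggs: Record<string, { cardinality: { field: string } }> = {};
  fields.forEach((field, i) => {
    // Index-based aggregation names; map them back to `fields[i]` when reading results.
    aggs[`field_${i}`] = { cardinality: { field } };
  });
  return {
    size: 0, // only the aggregations are needed, not the hits
    query: { range: { "@timestamp": { gte, lte } } },
    aggs,
  };
}

/** Option 3: keep fields whose first path segment is an observed module or an extra allowed prefix. */
function filterByPrefix(fields: string[], modules: string[], extraPrefixes: string[]): string[] {
  const allowed = new Set<string>([...modules, ...extraPrefixes]);
  return fields.filter((field) => {
    const dot = field.indexOf(".");
    const prefix = dot === -1 ? field : field.slice(0, dot);
    return allowed.has(prefix);
  });
}
```

With `modules = ["system"]` and `extraPrefixes = ["host", "cloud"]`, a user-defined field such as `custom.value` is dropped, which is exactly the downside described in option 3.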

Related Issues

https://github.com/elastic/dev/issues/1223
#36843
#38020
#39613
#40120
#41090


ruflin · 2019-04-08T06:29:39Z

I filed issue #24709 in Kibana some time ago, which I think is related. It seems Kibana already has the capabilities for option 2 today, which I think would be the right approach.

weltenwort · 2019-04-08T08:59:19Z

A few possibilities come to mind, which are mostly independent:

Grouping: Maybe we could find a good middle ground by partitioning the shown list of fields into multiple sections? There could be a "recommended" or "commonly used" section at the top and "everything else" in a second section at the end. The partitioning could go even further by grouping the fields into sections that map to the ECS fields sets (base, agent, network, geo, etc).
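As a rough illustration of the grouping idea, the sketch below partitions a field list into a "recommended" section followed by one section per top-level prefix. The `FieldSection` type, the `groupFields` name, and the choice to file un-prefixed fields under "base" are assumptions made for this example; real ECS field sets would need a proper lookup.

```typescript
// Hypothetical types and names, for illustration only.
interface FieldSection {
  title: string;
  fields: string[];
}

function groupFields(fields: string[], recommended: string[]): FieldSection[] {
  const recommendedSet = new Set<string>(recommended);
  const top: string[] = [];
  const byPrefix = new Map<string, string[]>();

  for (const field of fields) {
    if (recommendedSet.has(field)) {
      top.push(field);
      continue;
    }
    const dot = field.indexOf(".");
    // Assumption: fields without a dot (e.g. "message") are treated as "base" fields.
    const prefix = dot === -1 ? "base" : field.slice(0, dot);
    const bucket = byPrefix.get(prefix);
    if (bucket) {
      bucket.push(field);
    } else {
      byPrefix.set(prefix, [field]);
    }
  }

  const sections: FieldSection[] = [];
  if (top.length > 0) {
    sections.push({ title: "recommended", fields: top });
  }
  // Remaining sections are listed alphabetically by prefix.
  for (const prefix of Array.from(byPrefix.keys()).sort()) {
    sections.push({ title: prefix, fields: byPrefix.get(prefix) ?? [] });
  }
  return sections;
}
```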

Async cardinality: We could also asynchronously calculate the cardinality of the fields when the user opens the menu. That allows them to immediately select something if they know what they were looking for, or to wait for additional details to load. On the other hand, the incremental addition/re-sorting might be confusing and it still causes some load on the cluster.

Batch cardinality calculation: There is a task manager available in Kibana that can be used to perform coordinated batch operations. We could just pre-calculate the cardinalities every N minutes and store the results in a saved object for constant-time retrieval. (That could also be useful for many other applications, so it might make for an interesting shared service.)
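The pre-calculation idea could be sketched roughly as follows. This deliberately does not use the Kibana task manager or saved objects APIs: `CardinalityCache` is a hypothetical in-memory stand-in, `compute` is an injected placeholder for whatever query produces the cardinalities, and it is synchronous here only to keep the example short.

```typescript
// Hypothetical sketch; not the Kibana task manager or saved objects API.
type Cardinalities = Record<string, number>;

class CardinalityCache {
  private snapshot: Cardinalities = {};
  private computedAt = -Infinity;
  private readonly compute: () => Cardinalities;
  private readonly maxAgeMs: number;
  private readonly now: () => number;

  constructor(compute: () => Cardinalities, maxAgeMs: number, now: () => number = Date.now) {
    this.compute = compute;
    this.maxAgeMs = maxAgeMs;
    this.now = now;
  }

  /** Called by the periodic job: recompute only when the snapshot is older than maxAgeMs. */
  refreshIfStale(): boolean {
    if (this.now() - this.computedAt < this.maxAgeMs) {
      return false;
    }
    this.snapshot = this.compute();
    this.computedAt = this.now();
    return true;
  }

  /** Called by the UI: reads the stored snapshot without querying the cluster. */
  fieldsWithData(): string[] {
    return Object.keys(this.snapshot).filter((field) => (this.snapshot[field] ?? 0) > 0);
  }
}
```

The UI only ever reads the stored snapshot, so the cost of the cardinality queries is paid by the background job rather than on every menu open.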

roncohen · 2019-04-12T12:30:27Z

Unless we find a way to query for the relevant metrics that returns quickly, I think it needs to happen in the background, along the lines of "Batch cardinality calculation".

An option is to investigate if we can use the new data frames transformations: https://www.elastic.co/guide/en/elasticsearch/reference/master/preview-data-frame-transform.html

Another option is that Metricbeat simply emits a document for each metric it has collected in the last minute:

{"@timestamp": ..., "metric": {"name": "system.cpu.idle.pct"}}

We could eventually roll this up, so you'd have one document per metric per day. Admittedly, it's not great.
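A minimal sketch of that roll-up, assuming documents shaped like the example above with an ISO 8601 UTC `@timestamp`. The `MetricDoc` type and `rollUpDaily` function are hypothetical names; in practice this would more likely be done inside Elasticsearch than in application code.

```typescript
// Hypothetical document shape, based on the example document in this comment.
interface MetricDoc {
  "@timestamp": string; // ISO 8601, assumed to be UTC
  metric: { name: string };
}

/** Collapse per-minute documents into one document per metric per (UTC) day. */
function rollUpDaily(docs: MetricDoc[]): MetricDoc[] {
  const seen = new Set<string>();
  const rolled: MetricDoc[] = [];
  for (const doc of docs) {
    const day = doc["@timestamp"].slice(0, 10); // "YYYY-MM-DD"
    const key = `${day}|${doc.metric.name}`;
    if (seen.has(key)) {
      continue; // this metric was already recorded for this day
    }
    seen.add(key);
    rolled.push({ "@timestamp": `${day}T00:00:00.000Z`, metric: { name: doc.metric.name } });
  }
  return rolled;
}
```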

simianhacker · 2019-05-21T23:17:16Z

Here is my solution to this problem: #36843

ruflin · 2019-05-22T06:26:12Z

@simianhacker This is great. Is this similar to what the "left bar" discussed in #24709 does?

roncohen · 2019-05-22T06:35:20Z

Let’s discuss on the PR?

roncohen · 2019-05-22T07:49:45Z

@bleskes suggested we discuss if we can have the Field Stats API (doc_count) back, as described here: https://www.elastic.co/guide/en/elasticsearch/reference/5.6/search-field-stats.html#_field_statistics_2

ruflin · 2019-05-23T07:20:18Z

Would be nice to get something similar back in ES directly. In the past, Field Stats supported some constraints on indices. It would be nice to have this constraint on a date range instead, especially as we are using ILM, which means one index can contain up to a month of data. If we only look at 1 day of data, we should only see the fields used during that day.
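For comparison, a date-range-constrained existence check can be expressed with the existing query DSL by combining a `range` filter with an `exists` filter, which is roughly what the count-based option in the issue description does per field. The function name below is hypothetical and `@timestamp` is assumed to be the time field.

```typescript
// Hypothetical helper; builds a query body only and sends nothing.
function buildFieldExistsBody(field: string, gte: string, lte: string) {
  return {
    query: {
      bool: {
        filter: [
          // Restrict to the time range the user is looking at, not the whole index.
          { range: { "@timestamp": { gte, lte } } },
          // Match only documents that actually contain a value for the field.
          { exists: { field } },
        ],
      },
    },
  };
}
```

A non-zero count for this body means the field was used in the given range, regardless of how much older data the ILM-managed index still holds.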

wylieconlon · 2019-06-20T17:03:14Z

This effort overlaps with the discussions we've been having about improving the usefulness of index patterns. The index pattern service is the natural place to store this extra information, instead of calculating it on every load. The service also lets us share it across all Kibana apps. Improvements to the index pattern service are being discussed here: #35481

roncohen · 2019-06-25T14:35:47Z

This problem also applies to the second drop down: "graph by". We should only show the fields relevant for the metrics you've selected.

For the "graph per" dropdown, when a user has already selected a metric, we could query for X number of documents that have those metrics and only show the labels/keyword-fields available in those documents? The reason it works here, as opposed to in the main metrics selector, is that for the same metric there will be much less variability in the labels/keyword-fields than if you look at the general population.
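A sketch of the sampling step, assuming the X sampled documents have already been fetched (for example with a query that requires the selected metric to exist) and that the list of keyword fields is known from the mapping. `collectLeafPaths` and `presentKeywordFields` are hypothetical names.

```typescript
// Hypothetical helpers; `Source` stands in for a document's _source.
type Source = { [key: string]: unknown };

/** Collect dotted paths of all leaf values in a (possibly nested) document. */
function collectLeafPaths(source: Source, prefix: string, out: Set<string>): void {
  for (const key of Object.keys(source)) {
    const value = source[key];
    const path = prefix ? `${prefix}.${key}` : key;
    if (typeof value === "object" && value !== null && !Array.isArray(value)) {
      collectLeafPaths(value as Source, path, out);
    } else {
      out.add(path);
    }
  }
}

/** Keep only the keyword fields that actually occur in the sampled documents. */
function presentKeywordFields(sampledDocs: Source[], keywordFields: string[]): string[] {
  const present = new Set<string>();
  for (const doc of sampledDocs) {
    collectLeafPaths(doc, "", present);
  }
  return keywordFields.filter((field) => present.has(field)).sort();
}
```

Because documents for the same metric tend to share the same labels, a small sample is likely to surface most of the relevant fields, though a field that appears only rarely could still be missed.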

simianhacker transferred this issue from another repository Jul 3, 2019

simianhacker changed the title ~~Improve UI Field Selection for Metricbeat~~ [Infra UI] Improve UI Field Selection for Metricbeat Jul 3, 2019

simianhacker added the Feature:Metrics UI, discuss, and enhancement labels Jul 3, 2019

simianhacker changed the title ~~[Infra UI] Improve UI Field Selection for Metricbeat~~ [Infra UI][Meta] Improve UI Field Selection for Metricbeat Jul 23, 2019

simianhacker changed the title ~~[Infra UI][Meta] Improve UI Field Selection for Metricbeat~~ [Infra UI Meta Issue] Improve UI Field Selection for Metricbeat Jul 23, 2019

simianhacker mentioned this issue Aug 14, 2019

[Infra UI] Limit Metric Explorer fields #43322

Merged

1 task

sgrodzicki closed this as completed Nov 18, 2019

timroes mentioned this issue Feb 24, 2020

[Lens] Field existence via 500 sample is not intuitive #58330

Closed
