Improve query for matching indices when creating index pattern #42045

Closed
DaveCTurner opened this issue Jul 26, 2019 · 4 comments
Labels
enhancement (New value added to drive a business result)
Feature:Data Views (Data Views code and UI - index patterns before 8.0)
impact:low (Addressing this issue will have a low level of impact on the quality/strength of our product.)
loe:small (Small Level of Effort)

Comments

@DaveCTurner

Today when creating an index pattern we offer the user a list of indices matching the pattern as they're typing it in. We do this with a search to find the top 200 nonempty matching indices ordered by document count:

POST /pattern*/_search?ignore_unavailable=true
{"size":0,"aggs":{"indices":{"terms":{"field":"_index","size":200}}}}

This seems to come from here:

const params = {
  ignoreUnavailable: true,
  index: pattern,
  ignore: [404],
  body: {
    size: 0, // no hits
    aggs: {
      indices: {
        terms: {
          field: '_index',
          size: limit,
        }
      }
    },
  }
};
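
As a rough illustration of how the response to this request might be consumed, the sketch below pulls the index names out of the `indices` terms aggregation. The helper name `extractIndexNames` and the sample response are hypothetical and are not taken from Kibana; only the `aggregations.indices.buckets` shape follows from the request above.

```javascript
// Hypothetical helper: list index names from the terms aggregation on `_index`.
function extractIndexNames(response) {
  // An ignored 404 or a pattern with no matches may carry no aggregations at all.
  const buckets = response?.aggregations?.indices?.buckets ?? [];
  // Terms buckets are ordered by doc_count descending by default.
  return buckets.map((bucket) => bucket.key);
}

// Made-up sample response, for illustration only.
const sampleResponse = {
  aggregations: {
    indices: {
      buckets: [
        { key: 'pattern-2019.07.26', doc_count: 1200 },
        { key: 'pattern-2019.07.25', doc_count: 800 },
      ],
    },
  },
};

console.log(extractIndexNames(sampleResponse));
```

Empty indices never produce a bucket, which is why this approach only lists nonempty indices.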

This can be somewhat inefficient on large clusters, because I think it hits every segment of every shard in every matching index in order to compute the document count. elastic/elasticsearch#44719 reports this search taking over 4 seconds.

It's not clear that we need to ask for the largest indices here. If this is not a requirement then it would be at least a little more efficient to set terminate_after: 1 to stop collecting results after finding the first document in each shard.
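
A minimal sketch of what that change could look like, assuming the params object quoted above is otherwise kept as-is. The function name `buildMatchingIndicesParams` is hypothetical; `terminate_after` is set in the search request body here.

```javascript
// Hypothetical variant of the params above; the function name is illustrative only.
function buildMatchingIndicesParams(pattern, limit) {
  return {
    ignoreUnavailable: true,
    index: pattern,
    ignore: [404],
    body: {
      size: 0, // no hits
      // Stop collecting on each shard after its first document.
      terminate_after: 1,
      aggs: {
        indices: {
          terms: {
            field: '_index',
            size: limit,
          },
        },
      },
    },
  };
}

console.log(JSON.stringify(buildMatchingIndicesParams('pattern*', 200).body));
```

With early termination the bucket counts no longer reflect index size, so the result is some set of nonempty matching indices rather than the largest ones.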

I also think we should consider doing this via a different route than a search, because really the coordinating node can give a reasonably accurate list of indices without needing to involve any other nodes in the cluster. For instance GET /pattern*/_field_caps?fields=_anything_ would return all the matching indices straight away. It would also return any empty indices, which seems desirable too.

The disadvantage of something like that is in the case where pattern* matches a ludicrous number of indices, because today I don't know of a way to limit any responses down to only the first few. We'd be trading off load on Elasticsearch vs time spent downloading too many results. Maybe that's worth it, I'm not sure.
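
A sketch of the `_field_caps` route, assuming the response carries a top-level `indices` array (as it does in recent Elasticsearch versions; worth verifying against the target version). The helper name `indicesFromFieldCaps` and the sample response are hypothetical. Note the truncation happens on the client, so it does nothing about the cost of downloading a very large response.

```javascript
// Hypothetical helper: take at most `limit` index names from a _field_caps response.
function indicesFromFieldCaps(response, limit) {
  // Assumption: the response lists matching indices in a top-level `indices` array.
  const indices = Array.isArray(response?.indices) ? response.indices : [];
  return {
    indices: indices.slice(0, limit),
    // Lets the UI say "showing the first N" when more indices matched.
    truncated: indices.length > limit,
  };
}

// Made-up response for GET /pattern*/_field_caps?fields=_anything_
const sampleFieldCaps = {
  indices: ['pattern-a', 'pattern-b', 'pattern-empty'],
  fields: {},
};

console.log(indicesFromFieldCaps(sampleFieldCaps, 2));
```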

@cjcenizal cjcenizal added enhancement New value added to drive a business result Feature:Data Views Data Views code and UI - index patterns before 8.0 Team:Visualizations Visualization editors, elastic-charts and infrastructure labels Jul 26, 2019
@elasticmachine
Contributor

Pinging @elastic/kibana-app

@cjcenizal
Contributor

Thanks for the great suggestion @DaveCTurner!

> The disadvantage of something like that is in the case where pattern* matches a ludicrous number of indices, because today I don't know of a way to limit any responses down to only the first few. We'd be trading off load on Elasticsearch vs time spent downloading too many results.

I think this is a valid concern. I'm worried our worst case scenario could result in a terrible UX. We should benchmark this to see exactly how many indices can be returned on a slow connection before the response time becomes painful. It may turn out that this number is so high that we would never expect users to encounter it in the field.
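
Before running a real benchmark, a back-of-the-envelope estimate could look like the sketch below. Everything in it is an assumption: the function name `estimateTransferSeconds`, the synthetic index names, and the simplified response shape. It also ignores compression, latency, and server-side time, so it only gives a crude estimate and is no substitute for measuring.

```javascript
// Hypothetical estimate: uncompressed payload size and transfer time for N index names.
function estimateTransferSeconds(indexCount, bytesPerSecond) {
  // Synthetic names; real index names may be much longer or shorter.
  const names = Array.from(
    { length: indexCount },
    (_, i) => `pattern-${String(i).padStart(6, '0')}`
  );
  // Assumed response shape: just a list of index names and an empty fields object.
  const payload = JSON.stringify({ indices: names, fields: {} });
  const payloadBytes = Buffer.byteLength(payload, 'utf8');
  return { payloadBytes, seconds: payloadBytes / bytesPerSecond };
}

// Example: 10,000 indices over an assumed 125,000 bytes/s (about 1 Mbit/s) link.
console.log(estimateTransferSeconds(10000, 125000));
```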

@elasticmachine
Contributor

Pinging @elastic/kibana-app-arch (Team:AppArch)

@exalate-issue-sync exalate-issue-sync bot added impact:low Addressing this issue will have a low level of impact on the quality/strength of our product. loe:small Small Level of Effort labels Jun 21, 2021
@mattkime
Contributor

mattkime commented Oct 7, 2021

resolved via #70271

@mattkime mattkime closed this as completed Oct 7, 2021