terms stats facet not accurate when dealing with high cardinality fields #921
Comments
Here is a trivial example that demonstrates the issue, given the following data-set: { "name": "Ten", "value": 1}. I've made the changes in a fork, and here is a screenshot depicting the data-set with the following:
Please let me know if you need any more information regarding this issue. If not, is it possible to have the pull request merged?
This is a consequence of the way Elasticsearch performs the distributed count. Unfortunately, there is no way for Kibana to fix this. The Elasticsearch issue is here: elastic/elasticsearch#1305
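The inaccuracy comes from each shard returning only its local top-N term entries, which the coordinating node then merges. A minimal sketch (with invented shard data, not taken from this issue) shows how counts for a term can be undercounted when that term misses one shard's local top-N:

```python
from collections import Counter

def shard_top_terms(docs, size):
    """Return the top-`size` term counts from a single shard's local view."""
    return dict(Counter(docs).most_common(size))

# Hypothetical documents split across two shards.
shard_a = ["ten", "ten", "ten", "twenty", "twenty"]
shard_b = ["twenty", "twenty", "twenty", "ten"]

# True global counts: ten=4, twenty=5.
truth = Counter(shard_a) + Counter(shard_b)

# Each shard returns only its local top 1 entry; the coordinator merges
# what it receives, so contributions from terms that missed a shard's
# local top-N are silently dropped.
merged = Counter(shard_top_terms(shard_a, 1)) + Counter(shard_top_terms(shard_b, 1))

print(truth["twenty"])   # 5
print(merged["twenty"])  # 3 -- shard_a's two "twenty" docs were truncated away
```

Raising the per-shard request size (which is what shard_size does, as discussed below in this thread) shrinks this gap at the cost of more work per shard.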
When the length is updated to 10, the terms panel can't show the correct mean value.
I've experienced a scenario where the terms stats facet was not calculating totals accurately. After investigation, I found that this facet has an additional property named shard_size, which determines how many term entries will be requested from each shard. The terms stats facet documentation states:
"When dealing with field with high cardinality (at least higher than the requested size) The greater shard_size is - the more accurate the result will be (and the more expensive the overall facet computation will be). shard_size is there to enable you to increase accuracy yet still avoid returning too many terms_stats entries back to the client."
I would like to add the shard_size property as an option to the terms facet panel to account for data that requires accurate calculations.
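For illustration, here is a sketch of what the resulting facet request body could look like with shard_size raised above size. The field names ("name", "value"), facet name, and the chosen values are illustrative assumptions, not taken from this issue:

```python
# Sketch of a terms_stats facet request (legacy Elasticsearch facet API).
# shard_size asks each shard for more entries than are returned to the
# client, trading extra per-shard work for more accurate merged totals.
facet_request = {
    "query": {"match_all": {}},
    "facets": {
        "value_stats": {
            "terms_stats": {
                "key_field": "name",    # assumed key field
                "value_field": "value", # assumed value field
                "size": 10,             # entries returned to the client
                "shard_size": 100,      # entries requested from each shard
            }
        }
    },
}
```

The panel option proposed here would simply expose the shard_size knob so users with high-cardinality fields can choose accuracy over cost.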