Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added standard deviation / variance sampling to extended stats #49782

Merged
merged 5 commits into from
Jun 10, 2020

Conversation

andrewjohnson2
Copy link
Contributor

Per 49554 I added standard deviation sampling and variance sampling to the extended stats interface.

Closes #49554

@cbuescher cbuescher added the :Analytics/Aggregations Aggregations label Dec 4, 2019
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-analytics-geo (:Analytics/Aggregations)

Copy link
Member

@costin costin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.
I'm not sure if the lower/upper bounds shouldn't be different between std population and sampling.

private static double stdDevSampling(int... vals) {
return Math.sqrt(varianceSampling(vals));
}

private static double variance(int... vals) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This indirection is confusing. Can we replace the calls to variationPopulation

Copy link
Contributor

@csoulios csoulios left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for taking the time to implement this feature.
Generally, LGTM too.

Regarding upper and lower bounds, they are definitely different between population and sampling standard deviation. For the sake of completeness, I think they should be added too.

@polyfractal
Copy link
Contributor

It would also be nice if the documentation could mention that the existing std_dev and variance are "population" metrics, so the user isn't confused by the new fields with identical values.

@andrewjohnson2
Copy link
Contributor Author

thanks for the feedback! I will make those changes.

@rjernst rjernst added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label May 4, 2020
@imotov
Copy link
Contributor

imotov commented Jun 8, 2020

@elasticmachine test this please

@imotov
Copy link
Contributor

imotov commented Jun 9, 2020

@elasticmachine test this please

@imotov
Copy link
Contributor

imotov commented Jun 9, 2020

@costin, @csoulios I have added upper and lower bounds. @polyfractal I have added docs. Any other suggestions?

Copy link
Contributor

@polyfractal polyfractal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for pushing it over the finish line @imotov :)

@imotov imotov merged commit a791d67 into elastic:master Jun 10, 2020
imotov added a commit to imotov/elasticsearch that referenced this pull request Jun 10, 2020
…ic#49782)

Per 49554 I added standard deviation sampling and variance sampling to the extended stats interface.

Closes elastic#49554

Co-authored-by: Igor Motov <igor@motovs.org>
imotov added a commit that referenced this pull request Jun 11, 2020
… (#57947)

Per 49554 I added standard deviation sampling and variance sampling to the extended stats interface.
 
Closes #49554

Co-authored-by: Igor Motov <igor@motovs.org>

Co-authored-by: andrewjohnson2 <aj114114@gmail.com>
@imotov
Copy link
Contributor

imotov commented Jun 11, 2020

Thanks to @andrewjohnson2 for the PR and to @polyfractal and @csoulios for quick reviews!

russcam added a commit to elastic/elasticsearch-net that referenced this pull request Jul 29, 2020
russcam added a commit to elastic/elasticsearch-net that referenced this pull request Jul 31, 2020
github-actions bot pushed a commit to elastic/elasticsearch-net that referenced this pull request Jul 31, 2020
github-actions bot pushed a commit to elastic/elasticsearch-net that referenced this pull request Jul 31, 2020
russcam added a commit to elastic/elasticsearch-net that referenced this pull request Jul 31, 2020
russcam added a commit to elastic/elasticsearch-net that referenced this pull request Jul 31, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Analytics/Aggregations Aggregations >enhancement Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v7.9.0 v8.0.0-alpha1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add standard deviation / variance sampling to extended stats aggregation
10 participants