Optimize sibling aggs together #82373

Closed
nik9000 opened this issue Jan 10, 2022 · 5 comments
Labels
:Analytics/Aggregations Aggregations >enhancement Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo)

Comments

@nik9000
Member

nik9000 commented Jan 10, 2022

If sibling aggregations run on the same field, there are a few cases where we could collect them with a single aggregation and then split the results. For example, if you run the min and max aggregations as siblings on the same field, we could collect a single stats aggregation and split out the results on read. Similarly, two sibling percentiles aggregations could be combined into one for collection.
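
A minimal sketch of the min/max case, in Python rather than Elasticsearch's Java, just to illustrate the idea. `StatsCollector` and `run_min_max_siblings` are hypothetical names, not Elasticsearch classes; the point is only that one pass over the values is enough to answer both siblings.

```python
import math
from dataclasses import dataclass


@dataclass
class StatsCollector:
    """Hypothetical single-pass collector standing in for a stats aggregation."""
    count: int = 0
    minimum: float = math.inf
    maximum: float = -math.inf

    def collect(self, value: float) -> None:
        self.count += 1
        self.minimum = min(self.minimum, value)
        self.maximum = max(self.maximum, value)


def run_min_max_siblings(values, min_name="min_x", max_name="max_x"):
    """Collect once, then split the shared state into the two requested results."""
    stats = StatsCollector()
    for v in values:  # single pass over the field's values
        stats.collect(v)
    empty = stats.count == 0
    return {
        min_name: {"value": None if empty else stats.minimum},
        max_name: {"value": None if empty else stats.maximum},
    }


result = run_min_max_siblings([3.0, -1.5, 8.0, 2.0])
```

The caller still sees two separately named results, so the rewrite would be invisible in the response.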

@elasticmachine elasticmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Jan 10, 2022
@elasticmachine
Collaborator

Pinging @elastic/es-analytics-geo (Team:Analytics)

@iverase
Contributor

iverase commented Jan 11, 2022

FWIW: I observed something similar when developing vector tiles. There are some cases where we run a GeoGrid aggregation with a sibling GeoBounds aggregation. In some cases the GeoGrid aggregation even has a child GeoCentroid aggregation.

For each aggregation we have to read and decode the doc value (which for geo_shape might be expensive), so optimising them together would be a big win.
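
To make the cost argument concrete, here is a rough Python sketch of decoding each doc value once and feeding the decoded point to several collectors. Everything in it is hypothetical: `CountingDecoder` stands in for the expensive decode, the raw format is an invented `"lat,lon"` string, and the centroid is a plain average rather than whatever the real GeoCentroid aggregation computes.

```python
class CountingDecoder:
    """Hypothetical stand-in for an expensive doc-value decode; counts calls."""

    def __init__(self):
        self.calls = 0

    def decode(self, raw):
        self.calls += 1
        lat, lon = raw.split(",")
        return float(lat), float(lon)


class BoundsCollector:
    def __init__(self):
        self.top = self.right = float("-inf")
        self.bottom = self.left = float("inf")

    def collect(self, point):
        lat, lon = point
        self.top, self.bottom = max(self.top, lat), min(self.bottom, lat)
        self.right, self.left = max(self.right, lon), min(self.left, lon)


class CentroidCollector:
    def __init__(self):
        self.sum_lat = self.sum_lon = 0.0
        self.count = 0

    def collect(self, point):
        self.sum_lat += point[0]
        self.sum_lon += point[1]
        self.count += 1

    def result(self):
        return self.sum_lat / self.count, self.sum_lon / self.count


def collect_together(raw_docs, decoder, collectors):
    for raw in raw_docs:
        point = decoder.decode(raw)  # decoded once per document
        for collector in collectors:  # every collector reuses the decoded value
            collector.collect(point)


decoder = CountingDecoder()
bounds, centroid = BoundsCollector(), CentroidCollector()
collect_together(["10.0,20.0", "30.0,40.0", "20.0,0.0"], decoder, [bounds, centroid])
```

With separate collection the decoder would run once per document per aggregation; here `decoder.calls` equals the number of documents regardless of how many collectors share the pass.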

@flash1293
Contributor

The most relevant case for the Lens use case is merging separate percentiles aggs on the same field.

e.g.
{
  "a": { "percentiles": { "field": "x", "percents": [1, 2, 3] } },
  "b": { "percentiles": { "field": "x", "percents": [4, 5, 6] } }
}

and
{
  "a": { "percentiles": { "field": "x", "percents": [1, 2, 3, 4, 5, 6] } }
}

should have near-identical performance.
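
One way a client or a server-side rewrite might approximate this is sketched below in Python. It is not how Elasticsearch or Lens does it: `merge_sibling_percentiles` and `split_percentiles_response` are hypothetical helpers. The sketch uses `percents`, the parameter name the percentiles aggregation accepts for the requested percentiles, and assumes the keyed response shape, where `values` maps a percent string such as `"1.0"` to its estimate. Aggregations with any other options are passed through unmerged.

```python
def merge_sibling_percentiles(aggs):
    """Merge sibling percentiles aggs on the same field into one request.

    Returns (merged_aggs, mapping) where mapping records, for each original
    name, which merged agg holds its results and which percents it asked for.
    """
    merged, mapping, by_field = {}, {}, {}
    for name, body in aggs.items():
        spec = body.get("percentiles")
        if spec is None or set(spec) != {"field", "percents"}:
            merged[name] = body  # not a plain percentiles agg: leave untouched
            continue
        field = spec["field"]
        target = by_field.setdefault(field, name)  # first sibling hosts the merge
        if target not in merged:
            merged[target] = {"percentiles": {"field": field, "percents": []}}
        percents = merged[target]["percentiles"]["percents"]
        percents.extend(p for p in spec["percents"] if p not in percents)
        mapping[name] = (target, list(spec["percents"]))
    for target in by_field.values():
        merged[target]["percentiles"]["percents"].sort()
    return merged, mapping


def split_percentiles_response(response, mapping):
    """Rebuild one result per original agg name from the merged response."""
    out = {}
    for name, (target, percents) in mapping.items():
        values = response[target]["values"]
        out[name] = {"values": {str(float(p)): values[str(float(p))] for p in percents}}
    return out


aggs = {
    "a": {"percentiles": {"field": "x", "percents": [1, 2, 3]}},
    "b": {"percentiles": {"field": "x", "percents": [4, 5, 6]}},
}
merged, mapping = merge_sibling_percentiles(aggs)

# Made-up response for the merged request, only to exercise the split step.
fake_response = {"a": {"values": {str(float(p)): p * 10.0 for p in range(1, 7)}}}
split = split_percentiles_response(fake_response, mapping)
```

Here `merged` contains a single percentiles aggregation on `x`, and `split` hands `a` and `b` back their own percents.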

@wchaparro
Member

We discussed this one as a team and are adding it to our roadmap, as it's something we believe we want to do, but we need to spend some time investigating a solution and coming up with a plan. Removing the team-discuss label.

@wchaparro
Member

Closing as not planned, focus is on ES|QL development.

@wchaparro wchaparro closed this as not planned Won't fix, can't repro, duplicate, stale Jan 5, 2024