Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Include release highlight for query rewrite #97178

Conversation

salvatore-campagna
Copy link
Contributor

Update the changelog after making the original PR a release highlight.

See #96161

@salvatore-campagna salvatore-campagna added >non-issue :Search/Search Search-related issues that do not fall into other categories :StorageEngine/TSDB You know, for Metrics auto-backport-and-merge Automatically create backport pull requests and merge when ready v8.1.0 v8.9.1 labels Jun 28, 2023
@salvatore-campagna salvatore-campagna self-assigned this Jun 28, 2023
@elasticsearchmachine elasticsearchmachine added Team:Search Meta label for search team Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v8.10.0 labels Jun 28, 2023
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-analytics-geo (Team:Analytics)

@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-search (Team:Search)

@jpountz
Copy link
Contributor

jpountz commented Jun 28, 2023

I wonder if we can make the highlight title more catchy. From a user perspective this will yield both better search performance (because fewer refreshes run as part of search requests) and indexing performative (because fewer refreshes, so less merging). So maybe something like "Better indexing and search performance under concurrent indexing and search" or something along these lines?

@@ -4,3 +4,21 @@ area: "Search"
type: enhancement
issues:
- 95541
highlight:
title: Skip shards when querying constant keyword fields
body: "When a query like a match phrase query or a wildcard query targets a constant keyword field \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think wildcard query in 8.9.0 doesn't overwrite the indexMetadataRewrite(...) method? Only match_phrase and term queries.

Copy link
Contributor Author

@salvatore-campagna salvatore-campagna Jun 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes you are right...because it uses the SearchExecutionContext.

document. This will result in the shard level request to return immediately, before the query is executed on the data\
node and, as a result, skipping the shard completely. In a real world scenario it is likely that index patterns or\
data streams include tens or hundreds of backing indices each of them with multiple shards involved. Skipping shards\
in such scenario might result in better query performance and better cluster resource usage. Note, anyway, that\
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This exlpains how skipping shards helps with search performance, which is true, but Elasticsearch has had this for years. Let's focus instead of the new change, which is that the can_match phase no longer needs to refresh search-idle shards in a number of cases?

Quickly mention better indexing performance here as well thanks to less refreshing and merging?

in such scenario might result in better query performance and better cluster resource usage. Note, anyway, that\
execution of the pre-filter and the corresponding \"can match\" phase where rewriting happens, depends on the overall
number of shards involved and on whether there is at least one of them returning a non-empty result (see\
'pre_filter_shard_size' setting to understand how to control this behaviour). We do the rewrite operation on the data\
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
'pre_filter_shard_size' setting to understand how to control this behaviour). We do the rewrite operation on the data\
'pre_filter_shard_size' setting to understand how to control this behaviour). Elasticsearch does the rewrite operation on the data\

match the value defined in the index mapping, we rewrite the query to match no document. This will result in the\
shard level request to return immediately, before the query is executed on the data node and, as a result, skipping\
the shard completely. Here we leverage the ability to skip shards whenever possible to avoid unnecessary shard\
refreshes and improve query latency. Avoiding such unnecessary shard refreshes improves query latency since the\
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe change and improve query latency into and improve query latency (by not doing any search related i/o)?

Copy link
Member

@martijnvg martijnvg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

shard level request to return immediately, before the query is executed on the data node and, as a result, skipping\
the shard completely. Here we leverage the ability to skip shards whenever possible to avoid unnecessary shard\
refreshes and improve query latency. Avoiding such unnecessary shard refreshes improves query latency since the\
search thread does not need to wait anymore for unnecessary shard refreshes. Before introducing this change a query\
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe also add after search thread does not need to wait anymore for unnecessary shard refreshes., that searches that don't match the query will remain search idle and therefor the indexing throughput won't be impacted negatively by a refresh.

@salvatore-campagna salvatore-campagna added auto-merge Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) and removed auto-merge Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) labels Jun 28, 2023
@salvatore-campagna salvatore-campagna added auto-merge Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) and removed v8.1.0 labels Jun 28, 2023
@elasticsearchmachine elasticsearchmachine merged commit f9b22a9 into elastic:main Jun 28, 2023
@salvatore-campagna salvatore-campagna deleted the fix/96161-skip-shards-release-highlight branch June 28, 2023 15:47
salvatore-campagna added a commit to salvatore-campagna/elasticsearch that referenced this pull request Jun 28, 2023
Update the changelog after making the original PR a `release highlight`.

See elastic#96161
@elasticsearchmachine
Copy link
Collaborator

💚 Backport successful

Status Branch Result
8.9

elasticsearchmachine pushed a commit that referenced this pull request Jun 28, 2023
Update the changelog after making the original PR a `release highlight`.

See #96161
@rjernst rjernst added v8.9.0 and removed v8.9.1 labels Jul 8, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
auto-backport-and-merge Automatically create backport pull requests and merge when ready auto-merge Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) >non-issue :Search/Search Search-related issues that do not fall into other categories :StorageEngine/TSDB You know, for Metrics Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) Team:Search Meta label for search team v8.9.0 v8.10.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants