Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Skip shards when querying constant keyword fields #96161

Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
45 commits
Select commit Hold shift + click to select a range
e036818
feature: skip shards when querying constant keyword fields
salvatore-campagna May 16, 2023
fa255a0
Update docs/changelog/96161.yaml
salvatore-campagna May 16, 2023
2f321ec
fix: use just one value for the 'area' field
salvatore-campagna May 16, 2023
1ce1c61
fix: rename class o match naming conventions
salvatore-campagna May 16, 2023
6f2dd52
fix: rename class o match naming conventions
salvatore-campagna May 16, 2023
1262e57
fix: prevent possible NPE
salvatore-campagna May 16, 2023
ea8e00d
fix: prevent possible NPE
salvatore-campagna May 16, 2023
9f9e878
test: reduce the number of shards to kick in shard pre-filtering
salvatore-campagna May 16, 2023
3805907
fix: use variable instead of raw value
salvatore-campagna May 16, 2023
12bd7c5
fix: do not try to rewrite match none queries
salvatore-campagna May 16, 2023
73fbf4c
fix: do not skip shards targeted by queries using runtime mappings
salvatore-campagna May 16, 2023
d80d825
fix: simplify logic removing MappingAwareRewriteContext
salvatore-campagna May 17, 2023
d10b7e9
fix: prevent possible NPE when searcher is null
salvatore-campagna May 17, 2023
9318d5c
fix: return intersects instead of disjoint
salvatore-campagna May 17, 2023
2fdeed1
Merge branch 'main' into feature/95541-avoid-unnecessary-search-idle-…
salvatore-campagna May 17, 2023
f998702
fix: we do not axquire the searcher if we pre-filter the shard
salvatore-campagna May 17, 2023
0de29fe
test: use a lower number of shards for shard pre-filtering
salvatore-campagna May 17, 2023
1ce266b
fix: move code such that we can reuse the index service
salvatore-campagna May 17, 2023
88a9c0f
docs: align javadoc explaining usage of null searcher
salvatore-campagna May 17, 2023
d963148
fix: move skip logic
salvatore-campagna May 17, 2023
581eede
fix: undo counter changes after moving skip logic
salvatore-campagna May 17, 2023
1acf848
fix: mve the logic in the try catch to avod leaking the searcher
salvatore-campagna May 17, 2023
65c40a4
fix: move logic before we mark the shard search active
salvatore-campagna May 17, 2023
56215c2
test: check refresh didn't happen for idle shards
salvatore-campagna May 17, 2023
f77b23f
docs: expain the idea behing skipping shards using index mappings
salvatore-campagna May 17, 2023
e654bed
fix: rename method
salvatore-campagna May 17, 2023
10d76fe
fix: restore original test name
salvatore-campagna May 17, 2023
7e84f10
fix: use multiline if statement
salvatore-campagna May 17, 2023
5981421
fix: checkstyle error on line too long
salvatore-campagna May 17, 2023
f414b5a
fix: return a boolean instead of CanMatchShardResponse
salvatore-campagna May 18, 2023
4847c08
fix: refactor some shard idle tests
salvatore-campagna May 22, 2023
75ea39f
fix: rename method argument
salvatore-campagna May 22, 2023
a0cea4e
javadoc: rephrase comment
salvatore-campagna May 22, 2023
18e4533
fix: reuse existing createIndex
salvatore-campagna May 22, 2023
e45dd7c
fix: do not check refresh stats for active shards
salvatore-campagna May 22, 2023
be1292c
Merge branch 'main' into feature/95541-avoid-unnecessary-search-idle-…
salvatore-campagna Jun 5, 2023
b5b824d
refactor: use a QueryRewriteContext instead of SearchExecutionContext
salvatore-campagna Jun 5, 2023
dfe9a6f
fix: typo in Javadoc
salvatore-campagna Jun 5, 2023
af18a36
fix: number of acquireSearchSupplier invocations
salvatore-campagna Jun 5, 2023
40140cc
fix: not using null searcher anymore
salvatore-campagna Jun 5, 2023
92d5856
docs: uodate javadocs
salvatore-campagna Jun 9, 2023
59b8445
test: can match on matching and non matching field value
salvatore-campagna Jun 9, 2023
00af0d1
fix: remove unused variable
salvatore-campagna Jun 9, 2023
b27fe03
Merge branch 'main' into feature/95541-avoid-unnecessary-search-idle-…
salvatore-campagna Jun 9, 2023
78f7ba6
fix: method parameter name
salvatore-campagna Jun 9, 2023
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 6 additions & 0 deletions docs/changelog/96161.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
pr: 96161
summary: Skip shards when querying constant keyword fields
area: "Search"
type: enhancement
issues:
- 95541
Original file line number Diff line number Diff line change
Expand Up @@ -15,16 +15,23 @@
import org.elasticsearch.action.get.GetRequest;
import org.elasticsearch.action.get.MultiGetRequest;
import org.elasticsearch.action.index.IndexResponse;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.cluster.metadata.IndexMetadata;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.core.TimeValue;
import org.elasticsearch.index.IndexMode;
import org.elasticsearch.index.IndexService;
import org.elasticsearch.index.IndexSettings;
import org.elasticsearch.index.query.ExistsQueryBuilder;
import org.elasticsearch.index.query.RangeQueryBuilder;
import org.elasticsearch.index.refresh.RefreshStats;
import org.elasticsearch.rest.RestStatus;
import org.elasticsearch.test.ESSingleNodeTestCase;
import org.elasticsearch.threadpool.ThreadPool;
import org.elasticsearch.xcontent.XContentType;

import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Phaser;
Expand Down Expand Up @@ -223,4 +230,175 @@ public void testSearchIdleStats() throws InterruptedException {
assertTrue(Arrays.stream(statsResponse.getShards()).allMatch(x -> x.getSearchIdleTime() >= searchIdleAfter));
}

public void testSearchIdleBoolQueryMatchOneIndex() throws InterruptedException {
// GIVEN
final String idleIndex = "test1";
final String activeIndex = "test2";
// NOTE: we need many shards because shard pre-filtering and the "can match" phase
// are executed only if we have enough shards.
int idleIndexShardsCount = 3;
int activeIndexShardsCount = 3;
createIndex(
idleIndex,
Settings.builder()
.put(IndexSettings.INDEX_SEARCH_IDLE_AFTER.getKey(), "500ms")
.put(IndexMetadata.SETTING_NUMBER_OF_SHARDS, idleIndexShardsCount)
.put(IndexSettings.MODE.getKey(), IndexMode.TIME_SERIES)
.put(IndexMetadata.INDEX_ROUTING_PATH.getKey(), "routing_field")
.put(IndexSettings.TIME_SERIES_START_TIME.getKey(), "2021-05-10T00:00:00.000Z")
.put(IndexSettings.TIME_SERIES_END_TIME.getKey(), "2021-05-11T00:00:00.000Z")
.build(),
"doc",
"keyword",
"type=keyword",
"@timestamp",
"type=date",
"routing_field",
"type=keyword,time_series_dimension=true"
);
createIndex(
activeIndex,
Settings.builder()
.put(IndexSettings.INDEX_SEARCH_IDLE_AFTER.getKey(), "500ms")
.put(IndexMetadata.SETTING_NUMBER_OF_SHARDS, activeIndexShardsCount)
.put(IndexSettings.MODE.getKey(), IndexMode.TIME_SERIES)
.put(IndexMetadata.INDEX_ROUTING_PATH.getKey(), "routing_field")
.put(IndexSettings.TIME_SERIES_START_TIME.getKey(), "2021-05-12T00:00:00.000Z")
.put(IndexSettings.TIME_SERIES_END_TIME.getKey(), "2021-05-13T23:59:59.999Z")
.build(),
"doc",
"keyword",
"type=keyword",
"@timestamp",
"type=date",
"routing_field",
"type=keyword,time_series_dimension=true"
);

assertEquals(
RestStatus.CREATED,
client().prepareIndex(idleIndex)
.setSource("keyword", "idle", "@timestamp", "2021-05-10T19:00:03.765Z", "routing_field", "aaa")
.get()
.status()
);
assertEquals(
RestStatus.CREATED,
client().prepareIndex(activeIndex)
.setSource("keyword", "active", "@timestamp", "2021-05-12T20:07:12.112Z", "routing_field", "aaa")
.get()
.status()
);
assertEquals(RestStatus.OK, client().admin().indices().prepareRefresh(idleIndex, activeIndex).get().getStatus());

waitUntil(
() -> Arrays.stream(client().admin().indices().prepareStats(idleIndex, activeIndex).get().getShards())
.allMatch(ShardStats::isSearchIdle),
2,
TimeUnit.SECONDS
);

final IndicesStatsResponse idleIndexStatsBefore = client().admin().indices().prepareStats("test1").get();
assertIdleShard(idleIndexStatsBefore);

final IndicesStatsResponse activeIndexStatsBefore = client().admin().indices().prepareStats("test2").get();
assertIdleShard(activeIndexStatsBefore);

// WHEN
final SearchResponse searchResponse = client().prepareSearch("test*")
.setQuery(new RangeQueryBuilder("@timestamp").from("2021-05-12T20:00:00.000Z").to("2021-05-12T21:00:00.000Z"))
.setPreFilterShardSize(5)
.get();

// THEN
assertEquals(RestStatus.OK, searchResponse.status());
assertEquals(idleIndexShardsCount + activeIndexShardsCount - 1, searchResponse.getSkippedShards());
assertEquals(0, searchResponse.getFailedShards());
Arrays.stream(searchResponse.getHits().getHits()).forEach(searchHit -> assertEquals("test2", searchHit.getIndex()));
// NOTE: we need an empty result from at least one shard
assertEquals(1, searchResponse.getHits().getHits().length);
final IndicesStatsResponse idleIndexStatsAfter = client().admin().indices().prepareStats(idleIndex).get();
assertIdleShardsRefreshStats(idleIndexStatsBefore, idleIndexStatsAfter);
}

public void testSearchIdleExistsQueryMatchOneIndex() throws InterruptedException {
// GIVEN
final String idleIndex = "test1";
final String activeIndex = "test2";
// NOTE: we need many shards because shard pre-filtering and the "can match" phase
// are executed only if we have enough shards.
int idleIndexShardsCount = 3;
int activeIndexShardsCount = 3;
createIndex(
idleIndex,
Settings.builder()
.put(IndexSettings.INDEX_SEARCH_IDLE_AFTER.getKey(), "500ms")
.put(IndexMetadata.SETTING_NUMBER_OF_SHARDS, idleIndexShardsCount)
.build(),
"doc",
"keyword",
"type=keyword"
);
createIndex(
activeIndex,
Settings.builder()
.put(IndexSettings.INDEX_SEARCH_IDLE_AFTER.getKey(), "500ms")
.put(IndexMetadata.SETTING_NUMBER_OF_SHARDS, activeIndexShardsCount)
.build(),
"doc",
"keyword",
"type=keyword"
);

assertEquals(RestStatus.CREATED, client().prepareIndex(idleIndex).setSource("keyword", "idle").get().status());
assertEquals(
RestStatus.CREATED,
client().prepareIndex(activeIndex).setSource("keyword", "active", "unmapped", "bbb").get().status()
);
assertEquals(RestStatus.OK, client().admin().indices().prepareRefresh(idleIndex, activeIndex).get().getStatus());

waitUntil(
() -> Arrays.stream(client().admin().indices().prepareStats(idleIndex, activeIndex).get().getShards())
.allMatch(ShardStats::isSearchIdle),
2,
TimeUnit.SECONDS
);

final IndicesStatsResponse idleIndexStatsBefore = client().admin().indices().prepareStats("test1").get();
assertIdleShard(idleIndexStatsBefore);

final IndicesStatsResponse activeIndexStatsBefore = client().admin().indices().prepareStats("test2").get();
assertIdleShard(activeIndexStatsBefore);

// WHEN
final SearchResponse searchResponse = client().prepareSearch("test*")
.setQuery(new ExistsQueryBuilder("unmapped"))
.setPreFilterShardSize(5)
.get();

// THEN
assertEquals(RestStatus.OK, searchResponse.status());
assertEquals(idleIndexShardsCount, searchResponse.getSkippedShards());
assertEquals(0, searchResponse.getFailedShards());
Arrays.stream(searchResponse.getHits().getHits()).forEach(searchHit -> assertEquals("test2", searchHit.getIndex()));
// NOTE: we need an empty result from at least one shard
assertEquals(1, searchResponse.getHits().getHits().length);
final IndicesStatsResponse idleIndexStatsAfter = client().admin().indices().prepareStats(idleIndex).get();
assertIdleShardsRefreshStats(idleIndexStatsBefore, idleIndexStatsAfter);
}

private static void assertIdleShard(final IndicesStatsResponse statsResponse) {
Arrays.stream(statsResponse.getShards()).forEach(shardStats -> assertTrue(shardStats.isSearchIdle()));
Arrays.stream(statsResponse.getShards()).forEach(shardStats -> assertTrue(shardStats.getSearchIdleTime() >= 100));
}

private static void assertIdleShardsRefreshStats(final IndicesStatsResponse before, final IndicesStatsResponse after) {
assertNotEquals(0, before.getShards().length);
assertNotEquals(0, after.getShards().length);
final List<RefreshStats> refreshStatsBefore = Arrays.stream(before.getShards()).map(x -> x.getStats().refresh).toList();
final List<RefreshStats> refreshStatsAfter = Arrays.stream(after.getShards()).map(x -> x.getStats().refresh).toList();
assertEquals(refreshStatsBefore.size(), refreshStatsAfter.size());
assertTrue(refreshStatsAfter.containsAll(refreshStatsBefore));
assertTrue(refreshStatsBefore.containsAll(refreshStatsAfter));
}
}
71 changes: 71 additions & 0 deletions server/src/main/java/org/elasticsearch/index/IndexService.java
Original file line number Diff line number Diff line change
Expand Up @@ -22,6 +22,7 @@
import org.elasticsearch.cluster.routing.ShardRouting;
import org.elasticsearch.cluster.service.ClusterService;
import org.elasticsearch.common.io.stream.NamedWriteableRegistry;
import org.elasticsearch.common.regex.Regex;
import org.elasticsearch.common.settings.Setting;
import org.elasticsearch.common.settings.Setting.Property;
import org.elasticsearch.common.settings.Settings;
Expand Down Expand Up @@ -49,9 +50,14 @@
import org.elasticsearch.index.fielddata.IndexFieldDataService;
import org.elasticsearch.index.fielddata.ordinals.GlobalOrdinalsAccounting;
import org.elasticsearch.index.mapper.IdFieldMapper;
import org.elasticsearch.index.mapper.MappedFieldType;
import org.elasticsearch.index.mapper.MapperRegistry;
import org.elasticsearch.index.mapper.MapperService;
import org.elasticsearch.index.mapper.MappingLookup;
import org.elasticsearch.index.mapper.MappingParserContext;
import org.elasticsearch.index.mapper.NodeMappingStats;
import org.elasticsearch.index.mapper.RuntimeField;
import org.elasticsearch.index.query.QueryRewriteContext;
import org.elasticsearch.index.query.SearchExecutionContext;
import org.elasticsearch.index.query.SearchIndexNameMatcher;
import org.elasticsearch.index.seqno.RetentionLeaseSyncer;
Expand All @@ -76,6 +82,7 @@
import org.elasticsearch.script.ScriptService;
import org.elasticsearch.search.aggregations.support.ValuesSourceRegistry;
import org.elasticsearch.threadpool.ThreadPool;
import org.elasticsearch.transport.RemoteClusterAware;
import org.elasticsearch.xcontent.XContentParserConfiguration;

import java.io.Closeable;
Expand Down Expand Up @@ -661,6 +668,46 @@ public SearchExecutionContext newSearchExecutionContext(
);
}

/**
* Creates a new {@link QueryRewriteContext}.
* This class is used to rewrite queries in case access to the index is not required, since we can
* decide rewriting based on mappings alone. This saves the cost of pulling an index searcher as
* well as the associated cost of refreshing idle shards.
*/
public QueryRewriteContext newQueryRewriteContext(
final LongSupplier nowInMillis,
final Map<String, Object> runtimeMappings,
final String clusterAlias
) {
final SearchIndexNameMatcher indexNameMatcher = new SearchIndexNameMatcher(
index().getName(),
clusterAlias,
clusterService,
expressionResolver
);
final MapperService mapperService = mapperService();
final MappingLookup mappingLookup = mapperService().mappingLookup();
return new QueryRewriteContext(
parserConfiguration,
client,
nowInMillis,
mapperService,
mappingLookup,
parseRuntimeMappings(runtimeMappings, mapperService, indexSettings, mappingLookup),
null,
indexSettings,
new Index(
RemoteClusterAware.buildRemoteIndexName(clusterAlias, indexSettings.getIndex().getName()),
indexSettings.getIndex().getUUID()
),
indexNameMatcher,
namedWriteableRegistry,
valuesSourceRegistry,
allowExpensiveQueries,
scriptService
);
}

/**
* The {@link ThreadPool} to use for this index.
*/
Expand Down Expand Up @@ -1212,4 +1259,28 @@ public boolean clearCaches(boolean queryCache, boolean fieldDataCache, String...
return clearedAtLeastOne;
}

public static Map<String, MappedFieldType> parseRuntimeMappings(
Map<String, Object> runtimeMappings,
MapperService mapperService,
IndexSettings indexSettings,
MappingLookup lookup
) {
if (runtimeMappings.isEmpty()) {
return Collections.emptyMap();
}
// TODO add specific tests to SearchExecutionTests similar to the ones in FieldTypeLookupTests
MappingParserContext parserContext = mapperService.parserContext();
Map<String, RuntimeField> runtimeFields = RuntimeField.parseRuntimeFields(new HashMap<>(runtimeMappings), parserContext, false);
Map<String, MappedFieldType> runtimeFieldTypes = RuntimeField.collectFieldTypes(runtimeFields.values());
if (false == indexSettings.getIndexMetadata().getRoutingPaths().isEmpty()) {
for (String r : runtimeMappings.keySet()) {
if (Regex.simpleMatch(indexSettings.getIndexMetadata().getRoutingPaths(), r)) {
throw new IllegalArgumentException("runtime fields may not match [routing_path] but [" + r + "] matched");
}
}
}
runtimeFieldTypes.keySet().forEach(lookup::validateDoesNotShadow);
return runtimeFieldTypes;
}

}
Original file line number Diff line number Diff line change
Expand Up @@ -45,7 +45,6 @@
import org.elasticsearch.index.mapper.MappingParserContext;
import org.elasticsearch.index.mapper.NestedLookup;
import org.elasticsearch.index.mapper.ParsedDocument;
import org.elasticsearch.index.mapper.RuntimeField;
import org.elasticsearch.index.mapper.SourceLoader;
import org.elasticsearch.index.mapper.SourceToParse;
import org.elasticsearch.index.query.support.NestedScope;
Expand Down Expand Up @@ -76,6 +75,8 @@
import java.util.function.LongSupplier;
import java.util.function.Predicate;

import static org.elasticsearch.index.IndexService.parseRuntimeMappings;

/**
* The context used to execute a search request on a shard. It provides access
* to required information like mapping definitions and document data.
Expand Down Expand Up @@ -615,13 +616,15 @@ public IndexSettings getIndexSettings() {
}

/** Return the current {@link IndexReader}, or {@code null} if no index reader is available,
* for instance if this rewrite context is used to index queries (percolation). */
* for instance if this rewrite context is used to index queries (percolation).
*/
public IndexReader getIndexReader() {
return searcher == null ? null : searcher.getIndexReader();
}

/** Return the current {@link IndexSearcher}, or {@code null} if no index reader is available,
* for instance if this rewrite context is used to index queries (percolation). */
/** Return the current {@link IndexSearcher}, or {@code null} if no index reader is available, which happens
* if this rewrite context is used to index queries (percolation).
*/
public IndexSearcher searcher() {
return searcher;
}
Expand All @@ -645,30 +648,6 @@ public boolean fieldExistsInIndex(String fieldname) {
return fieldsInIndex.contains(fieldname);
}

private static Map<String, MappedFieldType> parseRuntimeMappings(
Map<String, Object> runtimeMappings,
MapperService mapperService,
IndexSettings indexSettings,
MappingLookup lookup
) {
if (runtimeMappings.isEmpty()) {
return Collections.emptyMap();
}
// TODO add specific tests to SearchExecutionTests similar to the ones in FieldTypeLookupTests
MappingParserContext parserContext = mapperService.parserContext();
Map<String, RuntimeField> runtimeFields = RuntimeField.parseRuntimeFields(new HashMap<>(runtimeMappings), parserContext, false);
Map<String, MappedFieldType> runtimeFieldTypes = RuntimeField.collectFieldTypes(runtimeFields.values());
if (false == indexSettings.getIndexMetadata().getRoutingPaths().isEmpty()) {
for (String r : runtimeMappings.keySet()) {
if (Regex.simpleMatch(indexSettings.getIndexMetadata().getRoutingPaths(), r)) {
throw new IllegalArgumentException("runtime fields may not match [routing_path] but [" + r + "] matched");
}
}
}
runtimeFieldTypes.keySet().forEach(lookup::validateDoesNotShadow);
return runtimeFieldTypes;
}

/**
* Cache key for current mapping.
*/
Expand Down
Loading