Add a configurable delay for non-assigned shards in search request #56236
Labels: :Distributed/Recovery, >feature, :Search/Search, Team:Distributed, Team:Search
Today, when a shard targeted by a search request is not available, the shard is marked as failed and the failure is added to the search response.
This can be problematic for shards that have no replica: any crash or restart of a node can cause several search requests to fail, because re-allocating the shards takes some time. In the context of searchable snapshots, re-allocation shouldn't take long (the data is not entirely copied), so the initial idea is to rely on fast re-allocation rather than on replicas. For this to work seamlessly for users, we thought it could be helpful to add a configurable delay for search requests that defines the maximum amount of time a request waits for a shard to become active.
The idea would be to add a listener in the search request that is notified when the shard is assigned to a new node and ready to serve requests. If the timeout is reached first, a failure would be reported exactly as it is today.
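To make the proposal concrete, here is a minimal, self-contained sketch of the "wait for the shard or time out" behaviour. It is not Elasticsearch code: `ShardWaiter`, `waitForShard` and `onShardActive` are hypothetical names invented for illustration, shards and nodes are represented as plain strings, and the real implementation would presumably hook into cluster state updates instead.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch, not an Elasticsearch API.
final class ShardWaiter {
    // Shards currently known to be active, mapped to the node that hosts them.
    private final Map<String, String> active = new ConcurrentHashMap<>();
    // One shared future per shard that search requests are waiting on.
    private final Map<String, CompletableFuture<String>> pending = new ConcurrentHashMap<>();

    // Returns a future that completes with the node id once the shard is active,
    // or fails with a TimeoutException after timeoutMillis (the configurable delay).
    CompletableFuture<String> waitForShard(String shardId, long timeoutMillis) {
        String node = active.get(shardId);
        if (node != null) {
            return CompletableFuture.completedFuture(node);
        }
        CompletableFuture<String> shared =
            pending.computeIfAbsent(shardId, k -> new CompletableFuture<>());
        // Re-check in case the shard became active between the lookup and the registration.
        node = active.get(shardId);
        if (node != null) {
            shared.complete(node);
        }
        // Derive a per-request future so each search request gets its own timeout.
        return shared.thenApply(n -> n).orTimeout(timeoutMillis, TimeUnit.MILLISECONDS);
    }

    // Called when the shard has been assigned to a node and is ready for requests.
    void onShardActive(String shardId, String nodeId) {
        active.put(shardId, nodeId);
        CompletableFuture<String> waiters = pending.remove(shardId);
        if (waiters != null) {
            waiters.complete(nodeId);
        }
    }
}
```

A search request would call `waitForShard` for each unassigned shard and, if the returned future fails with a timeout, report the shard failure in the response as it does today. The sketch leaves out details a real implementation would need, such as cleaning up pending entries for shards that never become active and handling shards that become unassigned again.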