Skip to content

Commit

Permalink
Improve docs for search preferences (#32098)
Browse files Browse the repository at this point in the history
Today it is unclear what guarantees are offered by the search preference
feature, and we claim a guarantee that is stronger than what we really offer:

> A custom value will be used to guarantee that the same shards will be used
> for the same custom value.

This commit clarifies this documentation and explains more clearly why
`_primary`, `_replica`, etc. are deprecated in `6.x` and removed in `master`.

Relates #31929 #26335 #26791.
  • Loading branch information
DaveCTurner committed Jul 18, 2018
1 parent 9dd7793 commit 14586ee
Showing 1 changed file with 73 additions and 33 deletions.
106 changes: 73 additions & 33 deletions docs/reference/search/request/preference.asciidoc
Original file line number Diff line number Diff line change
@@ -1,55 +1,76 @@
[[search-request-preference]]
=== Preference

Controls a `preference` of which shard copies on which to execute the
search. By default, the operation is randomized among the available shard copies.
Controls a `preference` of which shard copies on which to execute the search.
By default, Elasticsearch selects from the available shard copies in an
unspecified order, taking the <<allocation-awareness,allocation awareness>> and
<<search-adaptive-replica,adaptive replica selection>> configuration into
account. However, it may sometimes be desirable to try and route certain
searches to certain sets of shard copies, for instance to make better use of
per-copy caches.

The `preference` is a query string parameter which can be set to:

[horizontal]
`_primary`::
The operation will go and be executed only on the primary
shards. deprecated[6.1.0, will be removed in 7.0, use `_only_nodes` or `_prefer_nodes`]
`_primary`::
The operation will be executed only on primary shards. deprecated[6.1.0,
will be removed in 7.0. See the warning below for more information.]

`_primary_first`::
The operation will go and be executed on the primary
shard, and if not available (failover), will execute on other shards.
deprecated[6.1.0, will be removed in 7.0, use `_only_nodes` or `_prefer_nodes`]
`_primary_first`::
The operation will be executed on primary shards if possible, but will fall
back to other shards if not. deprecated[6.1.0, will be removed in 7.0. See
the warning below for more information.]

`_replica`::
The operation will go and be executed only on a replica shard.
deprecated[6.1.0, will be removed in 7.0, use `_only_nodes` or `_prefer_nodes`]
The operation will be executed only on replica shards. If there are multiple
replicas then the order of preference between them is unspecified.
deprecated[6.1.0, will be removed in 7.0. See the warning below for more
information.]

`_replica_first`::
The operation will go and be executed only on a replica shard, and if
not available (failover), will execute on other shards.
deprecated[6.1.0, will be removed in 7.0, use `_only_nodes` or `_prefer_nodes`]
The operation will be executed on replica shards if possible, but will fall
back to other shards if not. If there are multiple replicas then the order of
preference between them is unspecified. deprecated[6.1.0, will be removed in
7.0. See the warning below for more information.]

`_local`::
The operation will prefer to be executed on a local
allocated shard if possible.
`_only_local`::
The operation will be executed only on shards allocated to the local
node.

`_local`::
The operation will be executed on shards allocated to the local node if
possible, and will fall back to other shards if not.

`_prefer_nodes:abc,xyz`::
Prefers execution on the nodes with the provided
node ids (`abc` or `xyz` in this case) if applicable.
The operation will be executed on nodes with one of the provided node
ids (`abc` or `xyz` in this case) if possible. If suitable shard copies
exist on more than one of the selected nodes then the order of
preference between these copies is unspecified.

`_shards:2,3`::
Restricts the operation to the specified shards. (`2`
and `3` in this case). This preference can be combined with other
preferences but it has to appear first: `_shards:2,3|_local`
`_shards:2,3`::
Restricts the operation to the specified shards. (`2` and `3` in this
case). This preference can be combined with other preferences but it
has to appear first: `_shards:2,3|_local`

`_only_nodes`::
Restricts the operation to nodes specified in <<cluster,node specification>>
`_only_nodes:abc*,x*yz,...`::
Restricts the operation to nodes specified according to the
<<cluster,node specification>>. If suitable shard copies exist on more
than one of the selected nodes then the order of preference between
these copies is unspecified.

Custom (string) value::
A custom value will be used to guarantee that
the same shards will be used for the same custom value. This can help
with "jumping values" when hitting different shards in different refresh
states. A sample value can be something like the web session id, or the
user name.
Custom (string) value::
Any value that does not start with `_`. If two searches both give the same
custom string value for their preference and the underlying cluster state
does not change then the same ordering of shards will be used for the
searches. This does not guarantee that the exact same shards will be used
each time: the cluster state, and therefore the selected shards, may change
for a number of reasons including shard relocations and shard failures, and
nodes may sometimes reject searches causing fallbacks to alternative nodes.
However, in practice the ordering of shards tends to remain stable for long
periods of time. A good candidate for a custom preference value is something
like the web session id or the user name.

For instance, use the user's session ID to ensure consistent ordering of results
for the user:
For instance, use the user's session ID `xyzabc123` as follows:

[source,js]
------------------------------------------------
Expand All @@ -64,3 +85,22 @@ GET /_search?preference=xyzabc123
------------------------------------------------
// CONSOLE

NOTE: The `_only_local` preference guarantees only to use shard copies on the
local node, which is sometimes useful for troubleshooting. All other options do
not _fully_ guarantee that any particular shard copies are used in a search,
and on a changing index this may mean that repeated searches may yield
different results if they are executed on different shard copies which are in
different refresh states.

WARNING: The `_primary`, `_primary_first`, `_replica` and `_replica_first` are
deprecated as their use is not recommended. They do not help to avoid
inconsistent results that arise from the use of shards that have different
refresh states, and Elasticsearch uses synchronous replication so the primary
does not in general hold fresher data than its replicas. The `_primary_first`
and `_replica_first` preferences silently fall back to non-preferred copies if
it is not possible to search the preferred copies. The `_primary` and
`_replica` preferences will silently change their preferred shards if a replica
is promoted to primary, which can happen at any time. The `_primary` preference
can also put undesirable extra load on the primary shards. The cache-related
benefits of these options can also be obtained using `_only_nodes`,
`_prefer_nodes`, or a custom string value instead.

0 comments on commit 14586ee

Please sign in to comment.