A write alias targeting multiple indices prevents node startup #56186

DaveCTurner · 2020-05-05T10:21:30Z

In 6.x (and earlier) it is possible for a node to fail to start because its on-disk cluster state marks multiple indices as the target of writes for an alias:

[WARN ][o.e.b.ElasticsearchUncaughtExceptionHandler] [REDACTED] uncaught exception in thread [main]
org.elasticsearch.bootstrap.StartupException: java.lang.IllegalStateException: alias [REDACTED] has more than one write index [REDACTED,REDACTED]

This is fundamentally because each node builds its own copy of the cluster state greedily from all the index metadata that it can find, with no guarantee that this metadata is consistent. For instance, if the node was shut down while persisting a cluster state, it may have updated only some of the index metadata on disk. Perhaps more commonly, when all shards of an index are moved away from a master-ineligible node, that node stops updating the corresponding index metadata but does not delete it immediately, so it may contain some very stale alias information (with thanks to @henningandersen for noticing that).
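To illustrate the failure mode, here is a minimal toy model in Python. It is not Elasticsearch code: `IndexMetadata`, `write_indices_by_alias` and `check_single_write_index` are hypothetical names, and the index and alias names are made up. It only shows how greedily combining a stale metadata file with a fresh one can yield two write indices for the same alias.

```python
from dataclasses import dataclass, field


@dataclass
class IndexMetadata:
    """Hypothetical stand-in for one index's on-disk metadata file."""
    name: str
    version: int
    # alias name -> whether this index is flagged as the alias's write index
    aliases: dict = field(default_factory=dict)


def write_indices_by_alias(on_disk):
    """Greedily combine every metadata file found and collect write indices."""
    result = {}
    for meta in on_disk:
        for alias, is_write in meta.aliases.items():
            if is_write:
                result.setdefault(alias, []).append(meta.name)
    return result


def check_single_write_index(on_disk):
    """Raise if any alias ends up with more than one write index."""
    for alias, indices in write_indices_by_alias(on_disk).items():
        if len(indices) > 1:
            raise ValueError(
                f"alias [{alias}] has more than one write index [{','.join(indices)}]"
            )


# The node last updated logs-1 when it was still the write index. The alias
# has since moved to logs-2, but the node no longer holds shards of logs-1,
# so its copy of that metadata was never refreshed.
stale = IndexMetadata("logs-1", version=3, aliases={"logs": True})
fresh = IndexMetadata("logs-2", version=9, aliases={"logs": True})

try:
    check_single_write_index([stale, fresh])
    startup_error = None
except ValueError as e:
    startup_error = str(e)
```

In this model each file is valid on its own; the conflict only appears once the two are combined, which is why an atomically written cluster state avoids it.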

7.x (and later) are not directly affected by this problem since #32006 ensures that cluster states are written atomically so we always see a consistent set of index metadata, although a 7.x node can still encounter this broken state during an upgrade from 6.x.

One possible fix is to permit a write alias to target multiple indices, but reject any indexing to that alias until the ambiguity is resolved. I'm open to other ideas.
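A rough sketch of that idea, again as a toy Python model and not the actual Elasticsearch implementation: `AliasTable` and `WriteAliasError` are hypothetical names. Loading the alias always succeeds, so startup would not be blocked, and the check moves to the point where a document is indexed through the alias.

```python
class WriteAliasError(Exception):
    """Hypothetical error for an alias without exactly one write index."""


class AliasTable:
    def __init__(self, alias, write_flags):
        # Loading never fails, even if several indices are flagged as the
        # write index; this is what would let the node start up.
        self.alias = alias
        self.write_flags = dict(write_flags)

    def write_indices(self):
        return sorted(n for n, is_write in self.write_flags.items() if is_write)

    def index_document(self, doc):
        """Return the index the document would be routed to, or raise."""
        candidates = self.write_indices()
        if len(candidates) != 1:
            raise WriteAliasError(
                f"alias [{self.alias}] has {len(candidates)} write indices "
                f"[{','.join(candidates)}]; resolve this before indexing"
            )
        return candidates[0]


table = AliasTable("logs", {"logs-1": True, "logs-2": True})  # loads fine

try:
    table.index_document({"message": "hello"})
    rejected = False
except WriteAliasError:
    rejected = True

# Once the ambiguity is resolved, indexing through the alias works again.
table.write_flags["logs-1"] = False
target = table.index_document({"message": "hello"})
```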


elasticmachine · 2020-05-05T10:21:32Z

Pinging @elastic/es-core-features (:Core/Features/Indices APIs)

danhermann · 2020-10-01T16:04:06Z

We discussed this and thought that adding a system property to bypass the cluster state validation logic that prevents a write alias from targeting multiple write indices would probably be the most expedient way of addressing this because:

- though rarely encountered, it is hard to recover the cluster to a functional state
- the cluster state validation logic is still desirable, especially because this state should not occur in 7.x clusters

Indexing through the write alias should still be prevented, but that is much more easily fixed by updating the alias to have a single write index.
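As a sketch of that last step, the Python below builds a request body for the aliases API (`POST /_aliases`) that flags exactly one index as the write index via `is_write_index`. The helper `single_write_index_actions` and the alias and index names are hypothetical, and nothing is sent to a cluster here; the request could only be applied once the cluster is reachable again.

```python
import json


def single_write_index_actions(alias, indices, write_index):
    """Build an _aliases body that leaves `write_index` as the only write index."""
    if write_index not in indices:
        raise ValueError(f"[{write_index}] is not one of the alias's indices")
    return {
        "actions": [
            {
                "add": {
                    "index": index,
                    "alias": alias,
                    # True for the chosen index only, False for all others
                    "is_write_index": index == write_index,
                }
            }
            for index in indices
        ]
    }


# Hypothetical names: alias "logs" currently flags both indices as write indices.
body = single_write_index_actions("logs", ["logs-1", "logs-2"], "logs-2")
request_body = json.dumps(body, indent=2)
```

Sending all the `add` actions in one request means the alias should never pass through an intermediate state with two write indices.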

Samanthapuri · 2021-08-10T05:25:27Z

We have encountered the same issue in our Elasticsearch 6.x cluster: one node of the 3-node cluster is unable to come up.

Can you please help me solve this and bring the node up?

DaveCTurner · 2022-04-20T17:31:52Z

All versions affected by this bug are now past EOL so there is nothing to be done here any more. I am therefore closing this.

DaveCTurner added the >bug and :Data Management/Indices APIs labels May 5, 2020

elasticmachine added the Team:Data Management label May 5, 2020

danhermann added the team-discuss label May 19, 2020

danhermann removed the team-discuss label Oct 1, 2020

hauntingEcho mentioned this issue Feb 25, 2021

Allow writing to multiple indices via alias #68003

Open

DaveCTurner closed this as completed Apr 20, 2022
