More resilient Shrink Action in ILM to node dropping from cluster. #61174

Closed
talevy opened this issue Aug 15, 2020 · 4 comments
Labels
>bug :Data Management/ILM+SLM Index and Snapshot lifecycle management Team:Data Management Meta label for data/management team

Comments

@talevy
Contributor

talevy commented Aug 15, 2020

The ILM Shrink Action pins the index to a specific node by setting index.routing.allocation.require._id to that node's ID. If that node disconnects and never returns, the setting can never be satisfied and the shard will never be allocated correctly. There needs to be a more resilient step after the SetSingleNodeAllocateStep that verifies the node still exists and retries if the node is no longer part of the cluster.
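A minimal sketch of the check described above, written as plain Python rather than against the Elasticsearch codebase. The function names `pinned_node_is_gone` and `repin_if_node_gone` are hypothetical, and picking the lowest remaining node ID is only a placeholder; how ILM actually selects a node is not modeled here.

```python
from typing import Dict, Mapping, Set

# Setting named in the issue; everything else below is illustrative.
REQUIRE_ID_SETTING = "index.routing.allocation.require._id"


def pinned_node_is_gone(index_settings: Mapping[str, str],
                        live_node_ids: Set[str]) -> bool:
    """Return True if the index is pinned to a node that left the cluster."""
    pinned = index_settings.get(REQUIRE_ID_SETTING)
    if pinned is None:
        # Index is not pinned to any node, so there is nothing to retry.
        return False
    return pinned not in live_node_ids


def repin_if_node_gone(index_settings: Mapping[str, str],
                       live_node_ids: Set[str]) -> Dict[str, str]:
    """Return a copy of the settings, re-pinned if the old node is gone.

    Hypothetical retry: a placeholder choice (lowest node ID) stands in
    for whatever node selection the real step would perform.
    """
    updated = dict(index_settings)
    if not pinned_node_is_gone(index_settings, live_node_ids):
        return updated
    if not live_node_ids:
        raise RuntimeError("no live nodes available to re-pin the index")
    updated[REQUIRE_ID_SETTING] = min(live_node_ids)
    return updated
```

Called with settings pinned to a node that is absent from the live set, `repin_if_node_gone` returns new settings pointing at a node that is still present; otherwise it returns the settings unchanged.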

@talevy talevy added >bug :Data Management/ILM+SLM Index and Snapshot lifecycle management labels Aug 15, 2020
@elasticmachine
Collaborator

Pinging @elastic/es-core-features (:Core/Features/ILM+SLM)

@elasticmachine elasticmachine added the Team:Data Management Meta label for data/management team label Aug 15, 2020
@sophiaxu8

Does #41170 fix this issue?

@dakrone
Member

dakrone commented Oct 29, 2020

@sophiaxu8 No, that change was related to internal tests. We're still working on the resiliency aspect for interrupted ILM shrink operations.

@dakrone
Member

dakrone commented Dec 9, 2020

Closing this in favor of implementing #63519

@dakrone dakrone closed this as completed Dec 9, 2020
5 participants