ILM rollover within the same availability zone #62194

Closed
obogobo opened this issue Sep 9, 2020 · 8 comments
Labels: :Data Management/ILM+SLM, >enhancement, Team:Data Management

Comments

@obogobo

obogobo commented Sep 9, 2020

Here's an example for an index with one replica, using forced-awareness node attributes to spread the shards across two AZs via the zone attribute. Right now, indexing and ILM work as follows:

Indexing:

  • online replication - each doc is sent from the receiving (gateway) node to another instance in a separate AZ, to be indexed in parallel with its primary shard

ILM:

  • eventually an ILM trigger condition is met
  • a shard recovery from hot -> warm
  • another shard recovery from warm -> warm
  • delete the source shards

This means there are up to three cross-zone network transfers over the lifecycle of a document 🙀
And you better believe Amazon's making hay while the sun's shining at $0.01 / GB transferred across zones.
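As a rough illustration of the arithmetic above, here is a minimal sketch. It assumes the $0.01 / GB rate quoted in this issue applies once per crossing and that the bytes moved roughly equal the data volume indexed; both are simplifications. The function name and the daily volume are hypothetical.

```python
# Rate quoted in this issue: USD per GB transferred across availability zones.
CROSS_ZONE_USD_PER_GB = 0.01


def cross_zone_cost(gb: float, crossings: int = 3,
                    rate: float = CROSS_ZONE_USD_PER_GB) -> float:
    """Worst-case cross-zone transfer cost for `gb` of data that crosses
    a zone boundary `crossings` times (replication plus two recoveries)."""
    return gb * crossings * rate


if __name__ == "__main__":
    daily_gb = 1000  # hypothetical ingest volume, for illustration only
    print(f"up to ${cross_zone_cost(daily_gb):.2f} per day")
    print(f"replication only: ${cross_zone_cost(daily_gb, crossings=1):.2f} per day")
```

With same-zone recoveries, only the initial replication would cross a zone boundary, which is the `crossings=1` case.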

A solution to this might be an added config option to prefer replication on rollover within the same availability zone (where data transfer is free), according to: https://www.elastic.co/guide/en/elasticsearch/reference/7.8/modules-cluster.html#forced-awareness

i.e., ILM changes the index's routing.allocation.require attribute to start the rollover process, followed by the primary and replica shards getting bulk transferred in parallel to nodes with the new attribute, attempting to pick destinations in the same zone they originated from.
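The behaviour being requested could be sketched like this. This is not an existing Elasticsearch feature or API, just a toy model of "prefer a destination in the zone the shard copy came from"; the `Node` type, the `tier` field, and `pick_destination` are all hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Node:
    name: str
    zone: str  # value of the awareness attribute, e.g. "zone-a"
    tier: str  # hypothetical node attribute, e.g. "hot" or "warm"


def pick_destination(source: Node, nodes: List[Node],
                     target_tier: str) -> Optional[Node]:
    """Pick a node in the target tier, preferring the source's zone.

    Falls back to any node in the target tier when no same-zone node exists.
    """
    eligible = [n for n in nodes if n.tier == target_tier]
    same_zone = [n for n in eligible if n.zone == source.zone]
    pool = same_zone or eligible
    return pool[0] if pool else None


if __name__ == "__main__":
    nodes = [
        Node("hot-a", "zone-a", "hot"), Node("hot-b", "zone-b", "hot"),
        Node("warm-a", "zone-a", "warm"), Node("warm-b", "zone-b", "warm"),
    ]
    # Primary on hot-a and replica on hot-b each stay in their own zone.
    for src in nodes[:2]:
        print(src.name, "->", pick_destination(src, nodes, "warm").name)
```

Because the primary and the replica start in different zones, each picking its own zone keeps the copies spread across zones after the move.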

Unless something like this is already possible, and if so I'd love to hear about it! Thanks for the consideration!
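For reference, the setup described above might look roughly like the request bodies built below. This is a sketch: the awareness setting names are the ones from the linked forced-awareness docs, while the zone names and the `data: warm` node attribute are assumptions for illustration. Nothing here talks to a cluster; it only prints the JSON.

```python
import json
from typing import Dict, List


def forced_awareness_settings(attribute: str, values: List[str]) -> Dict:
    """Body for PUT _cluster/settings enabling forced awareness on `attribute`."""
    prefix = "cluster.routing.allocation.awareness"
    return {
        "persistent": {
            f"{prefix}.attributes": attribute,
            f"{prefix}.force.{attribute}.values": ",".join(values),
        }
    }


def warm_phase_policy(require: Dict[str, str]) -> Dict:
    """Body for PUT _ilm/policy/<name> whose warm phase moves shards via
    the allocate action (it sets index.routing.allocation.require.*)."""
    return {
        "policy": {
            "phases": {"warm": {"actions": {"allocate": {"require": require}}}}
        }
    }


if __name__ == "__main__":
    # Zone names and the "data" attribute are hypothetical.
    print(json.dumps(forced_awareness_settings("zone", ["zone-a", "zone-b"]), indent=2))
    print(json.dumps(warm_phase_policy({"data": "warm"}), indent=2))
```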

@obogobo added the >enhancement and needs:triage labels Sep 9, 2020
@matriv added the :Data Management/ILM+SLM label and removed the needs:triage label Sep 10, 2020
@elasticmachine
Collaborator

Pinging @elastic/es-core-features (:Core/Features/ILM+SLM)

@elasticmachine added the Team:Data Management label Sep 10, 2020
@seang-es

This doesn't apply only to ILM. Users have raised the same question for the case where you are patching a node in one AZ and spinning up another node to take its place: copies will come from the primary shard, whichever zone that might be in, rather than from the still-available replica in the local zone.

@DaveCTurner
Contributor

Relates #63519.

@joegallo
Contributor

I'm removing the team-discuss label from some older Team:Data Management issues -- we've had plenty of time to discuss them, but we haven't, so the label isn't serving its purpose. Feel free to delete this comment and/or re-add the team-discuss label.

@elasticsearchmachine
Collaborator

Pinging @elastic/es-data-management (Team:Data Management)

@DaveCTurner
Contributor

I think this duplicates the (resolved) #73496: data transfer to/from snapshots does not incur cross-zone network traffic costs. Therefore I'm closing it.

@obogobo
Author

obogobo commented Mar 23, 2023

Is it possible to do ILM rollovers by snapshotting to S3 from a hot node and then recovering on a warm node, instead of a direct node-to-node data transfer? If so, that'd be amazing, and this issue relates. Otherwise, I'm not sure that one duplicates this issue.

@DaveCTurner
Contributor

Yes that's right, that's what #73496 does.
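For anyone landing here later, a sketch of what enabling this might involve, as far as I understand the Elasticsearch docs for recent versions: the snapshot repository is registered with `use_for_peer_recovery`, and the `indices.recovery.use_snapshots` cluster setting is left enabled. Treat both setting names as something to verify against the docs for your version. The bucket name is hypothetical, and the code only builds request bodies.

```python
import json
from typing import Dict, Tuple


def snapshot_recovery_bodies(bucket: str) -> Tuple[Dict, Dict]:
    """Return (repository body, cluster settings body).

    The repository body is for PUT _snapshot/<repo>; the settings body is
    for PUT _cluster/settings. Setting names should be checked against the
    documentation for the version in use.
    """
    repository = {
        "type": "s3",
        "settings": {
            "bucket": bucket,  # hypothetical bucket name
            "use_for_peer_recovery": True,
        },
    }
    cluster_settings = {"persistent": {"indices.recovery.use_snapshots": True}}
    return repository, cluster_settings


if __name__ == "__main__":
    repo, settings = snapshot_recovery_bodies("my-snapshot-bucket")
    print(json.dumps(repo, indent=2))
    print(json.dumps(settings, indent=2))
```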

8 participants