Replicate or relocate data via snapshot #73496

Closed · 8 tasks done · dakrone opened this issue May 27, 2021 · 2 comments
Assignees: henningandersen
Labels: :Distributed/Allocation · >enhancement · Meta · Team:Distributed

Comments

@dakrone (Member)

dakrone commented May 27, 2021

In order to reduce DTS costs for cross-zone data transfer, we should investigate whether we want to replicate or relocate data using a snapshot.

This is close to what a full_copy searchable snapshot index already does. Rather than transferring data between ES nodes, we could use an object store as a "free" transfer medium.

Note that since this needs to go through an object store, the index would have to be marked as read-only to prevent writes made during the transfer from being lost.
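
For illustration, one way to put an index into a read-only state before the transfer is an index-level write block; a minimal sketch (the index name is a placeholder, and whether a write block alone is sufficient here is an open question):

```
# my-index is a placeholder; a write block is one way to achieve "read-only" here.
PUT /my-index/_settings
{
  "index.blocks.write": true
}
```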

There are two options here: the first is to use a regular snapshot and partially restore it on the receiving node; the second is to bypass the snapshot infrastructure entirely and use S3 as a "temporary staging ground" for relocating the index. Either option will require work, however, as we currently have no way of targeting the restoration of a single shard, and we will need to be able to treat the restoration like a regular relocation.

Fully cached searchable snapshot-backed indices already do this: the recovery source for relocation becomes the snapshot rather than peer recovery. This proposal is a formalization of that process on a wider scale.


For this to apply automatically, however, it would be useful to implement the concept of a default repository (#66040) so that a user does not need to specify a repository for their index.

If we implement this using snapshots, we also need to decide whether the snapshot should be a one-off (a snapshot is taken on demand for the index, the relocation happens, and the snapshot is then removed) or whether we can build on existing periodic snapshots. We could also use the clone snapshot API to clone index-specific snapshots out of a particular SLM snapshot rather than creating a new one on demand.
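
For reference, the clone snapshot API already supports carving a single index out of an existing snapshot; a hedged sketch (repository and snapshot names are placeholders):

```
# Repository and snapshot names are placeholders for an SLM-created snapshot.
PUT /_snapshot/my-repository/slm-snapshot-2021.05.27/_clone/relocate-my-index
{
  "indices": "my-index"
}
```

Cloning is cheap because it copies shard-level metadata within the repository rather than re-uploading segment files.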

Phase 1

In this phase we will try to use the available snapshots for read-only and low-write indices, reducing inter-AZ traffic when possible.

Phase 2

@dakrone dakrone added the >enhancement, :Distributed/Allocation, and needs:triage labels May 27, 2021
@elasticmachine elasticmachine added the Team:Distributed label May 27, 2021
@elasticmachine (Collaborator)

Pinging @elastic/es-distributed (Team:Distributed)

@dakrone dakrone removed the needs:triage label May 27, 2021
@henningandersen henningandersen self-assigned this May 28, 2021
fcofdez added a commit that referenced this issue Jul 22, 2021
This commit adds a new master transport action TransportGetShardSnapshotAction
that allows getting the last successful snapshot for a particular
shard in a set of repositories. It deals with the different
implementation details around BwC for repositories.

Relates #73496
fcofdez added a commit to fcofdez/elasticsearch that referenced this issue Jul 22, 2021
Relates elastic#73496. Backport of elastic#75080
fcofdez added a commit that referenced this issue Jul 22, 2021
Relates #73496. Backport of #75080
ywangd pushed a commit to ywangd/elasticsearch that referenced this issue Jul 30, 2021
…c#75080)

Relates elastic#73496
fcofdez added a commit that referenced this issue Aug 9, 2021
…#75840)

This commit adds a new set of classes that compute a peer
recovery plan based on source files, target files, and available
snapshots. When possible, the plan maximizes the number of
files reused from a snapshot. It uses repositories with the
`use_for_peer_recovery` setting set to true.

It also adds a new recovery setting, `indices.recovery.use_snapshots`.

Relates #73496
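
Putting the two settings named in that commit together, a minimal sketch of how an operator might opt in (the repository name, type, and bucket are placeholders, not from the issue):

```
# Hypothetical repository registration; name, type and bucket are placeholders.
PUT /_snapshot/my-repository
{
  "type": "s3",
  "settings": {
    "bucket": "my-bucket",
    "use_for_peer_recovery": true
  }
}

# Recovery setting from the commit above; assuming it is a dynamic cluster setting.
PUT /_cluster/settings
{
  "persistent": {
    "indices.recovery.use_snapshots": true
  }
}
```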
fcofdez added a commit to fcofdez/elasticsearch that referenced this issue Aug 9, 2021
Relates elastic#73496. Backport of elastic#75840
fcofdez added a commit that referenced this issue Aug 9, 2021
…#76239)

Relates #73496. Backport of #75840
@henrikno (Contributor)

Another scenario this can help with is scaling out search-heavy clusters. For instance, if you have 5 nodes answering search requests for a hot index and they're close to capacity, increasing the replica count often makes things worse: the already busy nodes now also have to replicate their shards to the new nodes, which can take a long time or fail outright because they're oversaturated. Replicating the shards from snapshot storage would handle this without involving the existing hot nodes.
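
For concreteness, the scale-out step described here is just a replica increase (the index name and count below are placeholders); with snapshot-based recovery in place, the new copies would be populated from the repository instead of from the saturated primaries:

```
# hot-index and the replica count are placeholders.
PUT /hot-index/_settings
{
  "index.number_of_replicas": 2
}
```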

fcofdez added a commit that referenced this issue Aug 13, 2021
This commit adds peer recoveries from snapshots. It allows establishing a replica by downloading file data from a snapshot rather than transferring the data from the primary. 

Enabling this feature is done on the repository definition. Repositories having the setting `use_for_peer_recovery=true` will be consulted to find a good snapshot when recovering a shard.

Relates #73496
fcofdez added a commit to fcofdez/elasticsearch that referenced this issue Aug 13, 2021
Relates elastic#73496. Backport of elastic#76237
fcofdez added a commit to fcofdez/elasticsearch that referenced this issue Aug 13, 2021
This commit adds third party integration tests for snapshot based
recoveries in S3, Azure and GCS.

Relates elastic#73496
fcofdez added a commit that referenced this issue Aug 13, 2021
Relates #73496. Backport of #76237
fcofdez added a commit that referenced this issue Aug 13, 2021
Relates #73496
fcofdez added a commit to fcofdez/elasticsearch that referenced this issue Aug 13, 2021
Relates elastic#73496. Backport of elastic#76489
fcofdez added a commit that referenced this issue Aug 13, 2021
Relates #73496. Backport of #76489
henningandersen pushed a commit that referenced this issue Aug 16, 2021
Adds a new field to the recovery API to keep track of the amount
of data recovered from snapshots.

The normal recovered_bytes field remains and is also increased for
recovery from snapshot, but it can go backwards in the unlikely case
that recovery from snapshot fails to download a file.

Relates #73496
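
To observe the new accounting, one would query the recovery API; a sketch (the index name is a placeholder, and the exact output field name is my reading of the change rather than confirmed here):

```
# my-index is a placeholder; inspect per-shard recovery stats.
GET /my-index/_recovery?human
```

In the response, each shard's `index.size` section would then report snapshot-recovered bytes (e.g. a `recovered_from_snapshot_in_bytes` field) alongside the existing `recovered_bytes` accounting.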
henningandersen pushed a commit to henningandersen/elasticsearch that referenced this issue Aug 16, 2021
…#76499)

Relates elastic#73496
elasticsearchmachine pushed a commit that referenced this issue Aug 16, 2021
…#76572)

Keep track of data recovered from snapshots in RecoveryState (#76499)

Relates #73496
@repantis repantis added the Meta label Sep 2, 2021
fcofdez added a commit to fcofdez/elasticsearch that referenced this issue Sep 8, 2021
This commit adds support for peer recoveries using snapshots after
a primary failover, when the snapshot shares the same logical contents
but the physical files are different. It uses the seq no information
stored in the snapshot to compare against the current shard source
node seq nos and decide whether or not it can use the snapshot to
recover the shard. Since the underlying index files are different
from the source index files, error handling differs from the
shared-files case: if there's an error while snapshot files are
being recovered, we have to cancel the ongoing downloads, wait
until all in-flight operations complete, remove the recovered
files, and start from scratch using a fallback recovery plan
that uses the files from the source node.

Relates elastic#73496
fcofdez added a commit that referenced this issue Oct 14, 2021
…rs (#77420)

Relates #73496
fcofdez added a commit to fcofdez/elasticsearch that referenced this issue Oct 14, 2021
Relates elastic#73496. Backport of elastic#77420
fcofdez added a commit that referenced this issue Oct 14, 2021
…ailovers (#79137)

Relates #73496. Backport of #77420
@fcofdez fcofdez closed this as completed Oct 26, 2021