Replicate or relocate data via snapshot #73496

Closed · 8 tasks done · dakrone opened this issue May 27, 2021 · 2 comments
Assignees: henningandersen
Labels: :Distributed/Allocation · >enhancement · Meta · Team:Distributed

Comments

@dakrone (Member)

dakrone commented May 27, 2021

In order to reduce DTS costs for cross-zone data transfer, we should investigate whether we want to replicate or relocate data using a snapshot.

This is close to what a full_copy searchable snapshot index already does. Rather than transferring data between ES nodes, we could use an object store as a "free" transfer medium.

Note that since this needs to go through an object store, the index would have to be marked as read-only to prevent writes made during the transfer from being lost.
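
For illustration, one way to put an index into a read-only state before the transfer is an index-level write block; a minimal sketch (the index name is a placeholder, and whether a write block alone is sufficient here is an open question):

```
# my-index is a placeholder; a write block is one way to achieve "read-only" here.
PUT /my-index/_settings
{
  "index.blocks.write": true
}
```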

There are two options here: the first is to use a regular snapshot and partially restore it on the receiving node; the second is to bypass the snapshot infrastructure entirely and use S3 as a "temporary staging ground" for relocating the index. Either option will require work, however, as we currently have no way of targeting the restoration of a single shard, and we will need to be able to treat the restoration like a regular relocation.

Fully cached searchable snapshot-backed indices already do this: the recovery source for relocation becomes the snapshot rather than peer recovery. This proposal is a formalization of that process on a wider scale.


For this to apply automatically, however, it would be useful to implement the concept of a default repository (#66040) so that a user does not need to specify a repository for their index.

If we implement this using snapshots, we also need to decide whether the snapshot should be a one-off (a snapshot is taken on demand for the index, the relocation happens, and the snapshot is then removed) or whether we can build on existing periodic snapshots. We could also use the clone snapshot API to clone index-specific snapshots out of a particular SLM snapshot rather than creating a new one on demand.
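
For reference, the clone snapshot API already supports carving a single index out of an existing snapshot; a hedged sketch (repository and snapshot names are placeholders):

```
# Repository and snapshot names are placeholders for an SLM-created snapshot.
PUT /_snapshot/my-repository/slm-snapshot-2021.05.27/_clone/relocate-my-index
{
  "indices": "my-index"
}
```

Cloning is cheap because it copies shard-level metadata within the repository rather than re-uploading segment files.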

Phase 1

In this phase we will try to use the available snapshots for read-only and low-write indices, reducing inter-AZ traffic when possible.

Phase 2

@dakrone dakrone added the >enhancement, :Distributed/Allocation, and needs:triage labels May 27, 2021
@elasticmachine elasticmachine added the Team:Distributed label May 27, 2021
@elasticmachine (Collaborator)

Pinging @elastic/es-distributed (Team:Distributed)

@dakrone dakrone removed the needs:triage label May 27, 2021
@henningandersen henningandersen self-assigned this May 28, 2021
fcofdez added a commit that referenced this issue Jul 22, 2021
This commit adds a new master transport action TransportGetShardSnapshotAction
that allows getting the last successful snapshot for a particular
shard in a set of repositories. It deals with the different
implementation details around BwC for repositories.

Relates #73496
fcofdez added a commit to fcofdez/elasticsearch that referenced this issue Jul 22, 2021
Relates elastic#73496. Backport of elastic#75080
fcofdez added a commit that referenced this issue Jul 22, 2021
Relates #73496. Backport of #75080
ywangd pushed a commit to ywangd/elasticsearch that referenced this issue Jul 30, 2021
…c#75080)

Relates elastic#73496
fcofdez added a commit that referenced this issue Aug 9, 2021
…#75840)

This commit adds a new set of classes that compute a peer
recovery plan based on source files, target files, and available
snapshots. When possible, the plan maximizes the number of
files reused from a snapshot. It uses repositories with the
`use_for_peer_recovery` setting set to true.

It also adds a new recovery setting, `indices.recovery.use_snapshots`.

Relates #73496
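
Putting the two settings named in that commit together, a minimal sketch of how an operator might opt in (the repository name, type, and bucket are placeholders, not from the issue):

```
# Hypothetical repository registration; name, type and bucket are placeholders.
PUT /_snapshot/my-repository
{
  "type": "s3",
  "settings": {
    "bucket": "my-bucket",
    "use_for_peer_recovery": true
  }
}

# Recovery setting from the commit above; assuming it is a dynamic cluster setting.
PUT /_cluster/settings
{
  "persistent": {
    "indices.recovery.use_snapshots": true
  }
}
```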
fcofdez added a commit to fcofdez/elasticsearch that referenced this issue Aug 9, 2021
Relates elastic#73496. Backport of elastic#75840
fcofdez added a commit that referenced this issue Aug 9, 2021
…#76239)

Relates #73496. Backport of #75840
@henrikno (Contributor)

Another scenario this can help with is scaling out search-heavy clusters. For instance, if you have 5 nodes answering search requests for a hot index and they're close to capacity, increasing the replica count often makes things worse: the already busy nodes now also have to replicate their shards to the new nodes, which can take a long time or fail outright because they're oversaturated. Replicating the shards from snapshot storage would handle this without involving the existing hot nodes.
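
For concreteness, the scale-out step described here is just a replica increase (the index name and count below are placeholders); with snapshot-based recovery in place, the new copies would be populated from the repository instead of from the saturated primaries:

```
# hot-index and the replica count are placeholders.
PUT /hot-index/_settings
{
  "index.number_of_replicas": 2
}
```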

fcofdez added a commit that referenced this issue Aug 13, 2021
This commit adds peer recoveries from snapshots. It allows establishing a replica by downloading file data from a snapshot rather than transferring the data from the primary. 

Enabling this feature is done on the repository definition. Repositories having the setting `use_for_peer_recovery=true` will be consulted to find a good snapshot when recovering a shard.

Relates #73496
fcofdez added a commit to fcofdez/elasticsearch that referenced this issue Aug 13, 2021
Relates elastic#73496. Backport of elastic#76237
fcofdez added a commit to fcofdez/elasticsearch that referenced this issue Aug 13, 2021
This commit adds third party integration tests for snapshot based
recoveries in S3, Azure and GCS.

Relates elastic#73496
fcofdez added a commit that referenced this issue Aug 13, 2021
Relates #73496. Backport of #76237
fcofdez added a commit that referenced this issue Aug 13, 2021
Relates #73496
fcofdez added a commit to fcofdez/elasticsearch that referenced this issue Aug 13, 2021
Relates elastic#73496. Backport of elastic#76489
fcofdez added a commit that referenced this issue Aug 13, 2021
Relates #73496. Backport of #76489
henningandersen pushed a commit that referenced this issue Aug 16, 2021
Adds a new field to the recovery API to keep track of the amount
of data recovered from snapshots.

The normal recovered_bytes field remains and is also increased for
recovery from snapshot, but it can go backwards in the unlikely case
that recovery from snapshot fails to download a file.

Relates #73496
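
To observe the new accounting, one would query the recovery API; a sketch (the index name is a placeholder, and the exact output field name is my reading of the change rather than confirmed here):

```
# my-index is a placeholder; inspect per-shard recovery stats.
GET /my-index/_recovery?human
```

In the response, each shard's `index.size` section would then report snapshot-recovered bytes (e.g. a `recovered_from_snapshot_in_bytes` field) alongside the existing `recovered_bytes` accounting.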
henningandersen pushed a commit to henningandersen/elasticsearch that referenced this issue Aug 16, 2021
…#76499)

Relates elastic#73496
elasticsearchmachine pushed a commit that referenced this issue Aug 16, 2021
…#76572)

Keep track of data recovered from snapshots in RecoveryState (#76499)

Relates #73496
@repantis repantis added the Meta label Sep 2, 2021
fcofdez added a commit to fcofdez/elasticsearch that referenced this issue Sep 8, 2021
This commit adds support for peer recoveries using snapshots after
a primary failover, when the snapshot shares the same logical contents
but the physical files are different. It uses the seq no information
stored in the snapshot to compare against the current shard source
node seq nos and decide whether or not it can use the snapshot to
recover the shard. Since the underlying index files are different
from the source index files, error handling differs from the
shared-files case: if there's an error while snapshot files are
being recovered, we have to cancel the ongoing downloads, wait
until all in-flight operations complete, remove the recovered
files, and start from scratch using a fallback recovery plan
that uses the files from the source node.

Relates elastic#73496
fcofdez added a commit that referenced this issue Oct 14, 2021
…rs (#77420)

Relates #73496
fcofdez added a commit to fcofdez/elasticsearch that referenced this issue Oct 14, 2021
Relates elastic#73496. Backport of elastic#77420
fcofdez added a commit that referenced this issue Oct 14, 2021
…ailovers (#79137)

Relates #73496. Backport of #77420
@fcofdez fcofdez closed this as completed Oct 26, 2021