Note that this issue is to track the design only. There will be two separate implementation phases to implement this design, which will be tracked by separate issues as implementation will end up in a future Velero release.
There are two different goals here, linked by a single primary missing feature in the Velero backup workflow.
The first goal is to enhance backup performance by allowing the primary backup controller to run in multiple threads, enabling Velero to back up multiple items at the same time for a given backup.
The second goal is to enable Velero to eventually support VolumeGroupSnapshots.
For both of these goals, Velero needs a way to determine which items should be backed up together.
This design proposal will include two development phases:
Phase 1 will refactor the backup workflow to identify blocks of items that should be backed up together, and then coordinate backup hooks among items in the block.
Phase 2 will add multiple worker threads for backing up item blocks, so instead of backing up each block as it is identified, the Velero backup workflow will add the block to a channel and one of the workers will pick it up.
Actual support for VolumeGroupSnapshots is out-of-scope here and will be handled in a future design proposal, but the item block refactor introduced in Phase 1 is a primary building block for this future proposal.
Background
Currently, during backup processing, the main Velero backup controller runs in a single thread, completely finishing the primary backup processing for one resource before moving on to the next one.
We can improve the overall backup performance by backing up multiple items for a backup at the same time, but before we can do this we must first identify resources that need to be backed up together.
As part of this initial refactoring, once these ItemBlocks are identified, an additional change will be to move pod hook processing up to the ItemBlock level.
If there are multiple pods in the ItemBlock, pre-hooks for all pods will be run before backing up the items, followed by post-hooks for all pods.
This change to hook processing is another prerequisite for future VolumeGroupSnapshot support, since supporting this will require backing up the pods and volumes together for any volumes which belong to the same group.
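The hook ordering described above can be sketched as follows. This is only an illustration of the intended sequence, not Velero code: the `Pod` and `ItemBlock` types and the `backupItemBlock` function are hypothetical names made up for this example, and the "hooks" are just recorded as strings.

```go
package main

import "fmt"

// Pod is a hypothetical stand-in for a pod that has backup hooks defined.
type Pod struct {
	Name string
}

// ItemBlock is a hypothetical grouping of items that must be backed up together.
type ItemBlock struct {
	Pods  []Pod    // pods in the block whose hooks must be coordinated
	Items []string // names of every item in the block, pods included
}

// backupItemBlock returns the ordered steps for one block:
// pre-hooks for all pods, then every item backup, then post-hooks for all pods.
func backupItemBlock(b ItemBlock) []string {
	var steps []string
	for _, p := range b.Pods {
		steps = append(steps, "pre-hook:"+p.Name)
	}
	for _, item := range b.Items {
		steps = append(steps, "backup:"+item)
	}
	for _, p := range b.Pods {
		steps = append(steps, "post-hook:"+p.Name)
	}
	return steps
}

func main() {
	block := ItemBlock{
		Pods:  []Pod{{Name: "pod-a"}, {Name: "pod-b"}},
		Items: []string{"pod-a", "pod-b", "pvc-a", "pvc-b"},
	}
	for _, s := range backupItemBlock(block) {
		fmt.Println(s)
	}
}
```

The point of the ordering is that no pod's post-hook runs until every item in the block, including volumes shared with other pods, has been backed up.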
Once we are backing up items by block, the next step will be to create multiple worker threads to process and back up ItemBlocks, so that we can back up multiple ItemBlocks at the same time.
Goals
Identify groups of items to back up together (ItemBlocks)
Manage backup hooks at the ItemBlock level rather than per-item
Back up multiple ItemBlocks at the same time using worker threads
Non-Goals
Support VolumeGroupSnapshots: this is a future feature, although certain prerequisites for this enhancement are included in this proposal.
Process multiple backups in parallel: this is a future feature, although certain prerequisites for this enhancement are included in this proposal.
sseago changed the title from "Velero Backup performance Improvements and VolumeGroupSnapshot enablement" to "Design for Velero Backup performance Improvements and VolumeGroupSnapshot enablement" on Feb 27, 2024
I understand that the mechanism for hook execution needs to be changed to support VolumeGroupSnapshot, but it's not quite clear to me why multithreaded backup is a prerequisite for VolumeGroupSnapshot. In other words, if none of the pods have hooks defined, do we need multithreading in Velero backup to support VolumeGroupSnapshot?
"If there are multiple pods in the ItemBlock, pre-hooks for all pods will be run before backing up the items, followed by post-hooks for all pods."
Isn't this only necessary when the bound PVCs are in the same VolumeGroupSnapshot? Otherwise, how the hooks are executed within one ItemBlock does not really matter.
@reasonerjt multithreaded backup is not a prerequisite for VolumeGroupSnapshot. Rather, the ItemBlock concept is a prerequisite for both VolumeGroupSnapshot and multithreaded backups. In other words, if there is a need for both features, it is far simpler to use the same building block for both -- otherwise we risk doing the same work in two different ways, resulting in far more complexity.
"Isn't this only necessary when the bound PVCs are in the same VolumeGroupSnapshot" -- yes, but ordinarily having a VolumeGroup in common is exactly how two pods will end up in the same ItemBlock. The pod BackupItemAction (BIA) will need to look at bound PVCs, and if any of those are in VolumeGroups, then find any other pods bound to other PVCs in the same VolumeGroup -- therefore the resulting multi-pod ItemBlock will include all of the pods with volumes in the same VolumeGroup. There may be some outlying situations where a third-party plugin will create a VolumeGroup with two pods in it, and in those cases hooks may end up running together, but 1) it's an edge case and 2) it won't produce incorrect results.
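To make the grouping rule concrete, here is a rough sketch of how pods that share a VolumeGroup could end up in one ItemBlock. Everything here is an assumption for illustration: the `Pod` type, its `VolumeGroups` field (standing in for "the VolumeGroups of the pod's bound PVCs"), and the `groupPods` function are hypothetical and do not correspond to an existing Velero or Kubernetes API. The sketch uses a small union-find so that pods linked through different groups are still merged.

```go
package main

import (
	"fmt"
	"sort"
)

// Pod is hypothetical: VolumeGroups lists the groups that the pod's
// bound PVCs belong to (empty if none of its PVCs are in a group).
type Pod struct {
	Name         string
	VolumeGroups []string
}

// groupPods returns one slice of pod names per block. Pods that share a
// VolumeGroup, directly or transitively, land in the same block.
func groupPods(pods []Pod) [][]string {
	parent := make([]int, len(pods))
	for i := range parent {
		parent[i] = i
	}
	find := func(i int) int {
		for parent[i] != i {
			parent[i] = parent[parent[i]] // path halving
			i = parent[i]
		}
		return i
	}

	firstPod := map[string]int{} // first pod index seen for each group
	for i, p := range pods {
		for _, g := range p.VolumeGroups {
			if j, ok := firstPod[g]; ok {
				parent[find(i)] = find(j) // merge the two pods' blocks
			} else {
				firstPod[g] = i
			}
		}
	}

	byRoot := map[int][]string{}
	for i, p := range pods {
		r := find(i)
		byRoot[r] = append(byRoot[r], p.Name)
	}
	blocks := make([][]string, 0, len(byRoot))
	for _, names := range byRoot {
		sort.Strings(names)
		blocks = append(blocks, names)
	}
	// Sort blocks by first pod name so the output is deterministic.
	sort.Slice(blocks, func(a, b int) bool { return blocks[a][0] < blocks[b][0] })
	return blocks
}

func main() {
	pods := []Pod{
		{Name: "db-0", VolumeGroups: []string{"db-group"}},
		{Name: "web"},
		{Name: "db-1", VolumeGroups: []string{"db-group"}},
	}
	fmt.Println(groupPods(pods)) // prints "[[db-0 db-1] [web]]"
}
```

A pod with no grouped volumes stays in its own block, so hooks only run together for pods that actually share a group.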