[Kopia] Backup partially failing with timeout error after 24 hours despite default timeout of 4h #8220
Comments
After a Velero pod restart it starts working fine, but after a few days it starts failing again.
Please share the Velero log bundle by running
Thanks @Lyndon-Li
Could you try with v1.14.1?
Thanks @blackpiglet. Upgrading Velero to 1.14.0 is in my plan. Is the same fix available in that version as well?
It should be, but v1.14.1 contains some other fixes. It's better to use the latest patch release.
After investigation, the 24-hour timeout was not caused by the PodVolumeBackup event handler being de-registered.
The 24-hour gap happens here. I haven't found any logic that would consume that much time yet. @navilg
Let me check that volume.
app-volume is an emptyDir volume. I will test after excluding this volume from the backup. Are there any known issues with backing up emptyDir volumes?
PodVolumeBackup can work with the
@blackpiglet Thanks. Using resource policies, I don't see a way to exclude emptyDir volumes, so I excluded it by adding a pod annotation. But even after excluding apps-volume, I still see the same issue. Here is the latest bundle
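The annotation-based exclusion mentioned here can be sketched roughly as follows. The namespace and pod name are hypothetical placeholders, app-volume is the volume named earlier in the thread, and backup.velero.io/backup-volumes-excludes is the opt-out annotation described in Velero's file system backup documentation. The snippet only prints the command so it can be reviewed before it is run against a cluster.

```shell
# Hypothetical placeholders; replace with the real namespace and pod.
NAMESPACE="my-namespace"
POD="my-app-pod"
# Comma-separated list of volume names to leave out of the backup.
VOLUMES="app-volume"

# Build the annotate command and print it (nothing is sent to the cluster).
CMD="kubectl -n ${NAMESPACE} annotate pod/${POD} backup.velero.io/backup-volumes-excludes=${VOLUMES}"
echo "${CMD}"
```

An annotation set directly on a pod is lost when the pod is recreated, so for pods managed by a Deployment or StatefulSet it usually belongs in the pod template instead.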
In the v1.14 doc you can see the following. You'd just configure it for emptyDir and nothing else.
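For reference, a minimal sketch of a resource policy that skips emptyDir volumes by type, assuming the volumeTypes condition described in the v1.14 resource policies documentation. The file name is a hypothetical placeholder, and this is not necessarily the exact snippet quoted from the doc.

```shell
# Write a resource policy that skips every emptyDir volume by type.
# The file name is a hypothetical placeholder.
cat > skip-emptydir-policy.yaml <<'EOF'
version: v1
volumePolicies:
  - conditions:
      volumeTypes:
        - emptyDir
    action:
      type: skip
EOF
```

The policy file would then be loaded into a ConfigMap in the Velero namespace (for example with kubectl create configmap <name> --from-file skip-emptydir-policy.yaml -n velero) and referenced at backup creation with the --resource-policies-configmap <name> flag of velero backup create.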
Thanks, I got it. I had checked the 1.12 doc.
If resolved, please close the issue.
@kaovilai Even after excluding the
Did you try excluding using volumeTypes instead?
No. Does it make any difference to the backup? I am currently excluding with pod annotations.
What steps did you take and what happened:
A scheduled backup is running.
The backup gets stuck for 24 hours and fails with a timeout error.
In the backup description I see that some pod volume backups completed, but many are missing. They are not even listed in the failed or new sections of the description.
Logs:
What did you expect to happen:
Backup to work fine.
The following information will help us better understand what's going on:
If you are using velero v1.7.0+:
Please use
velero debug --backup <backupname> --restore <restorename>
to generate the support bundle and attach it to this issue. For more options, please refer to velero debug --help
If you are using earlier versions:
Please provide the output of the following commands (Pasting long output into a GitHub gist or other pastebin is fine.)
kubectl logs deployment/velero -n velero
velero backup describe <backupname>
or kubectl get backup/<backupname> -n velero -o yaml
velero backup logs <backupname>
velero restore describe <restorename>
or kubectl get restore/<restorename> -n velero -o yaml
velero restore logs <restorename>
Anything else you would like to add:
Environment:
Velero version (use velero version): 1.12.3
Velero features (use velero client config get features): None
Kubernetes version (use kubectl version): 1.28
OS (e.g. from /etc/os-release): Ubuntu 22.04 with containerd
Vote on this issue!
This is an invitation to the Velero community to vote on issues, you can see the project's top voted issues listed here.
Use the "reaction smiley face" up to the right of this comment to vote.