Failed deletes should not prevent volume creation #126

Closed
acsulli opened this issue May 17, 2018 · 3 comments

acsulli commented May 17, 2018

When Trident attempts to delete a storage volume and fails, the failure should not cause it to ignore other operations until the delete succeeds. An example scenario:

  1. Create a PVC, resulting in a storage system volume created as expected.
  2. Create a replication relationship on that volume using some external method, e.g. CLI or the GUI for ONTAP/SolidFire.
  3. Delete the PVC, which will result in Trident failing to delete the storage volume until the replication relationship has been removed.

At this point, Trident will "hang" attempting to delete the volume until the relationship is removed. Having it do something similar to Kubernetes' "CrashLoopBackOff" and continue to perform other create/delete actions would be desirable.
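To make the request concrete, here is a minimal sketch in Go of the kind of behavior being asked for: the failed delete is retried in the background with exponential backoff while other provisioning work keeps flowing. The deleteVolume helper, retry loop, and volume names are hypothetical stand-ins, not Trident's actual code.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// deleteVolume is a hypothetical backend call; here it always fails, mimicking
// a volume that is the source of a SnapMirror relationship.
func deleteVolume(name string) error {
	return errors.New("volume is the source endpoint of a SnapMirror relationship")
}

// retryDelete runs in its own goroutine so a stuck delete never blocks other
// work. The wait doubles after each failure up to a cap, similar in spirit to
// CrashLoopBackOff.
func retryDelete(name string, maxBackoff time.Duration) {
	backoff := time.Second
	for {
		if err := deleteVolume(name); err == nil {
			fmt.Printf("deleted %s\n", name)
			return
		}
		fmt.Printf("delete of %s failed; retrying in %s\n", name, backoff)
		time.Sleep(backoff)
		if backoff *= 2; backoff > maxBackoff {
			backoff = maxBackoff
		}
	}
}

func main() {
	// The problematic delete keeps retrying in the background...
	go retryDelete("pvc-volume-with-snapmirror", time.Minute)

	// ...while other create/delete requests continue to be served.
	for i := 1; i <= 3; i++ {
		fmt.Printf("handling other provisioning request %d\n", i)
		time.Sleep(500 * time.Millisecond)
	}
}
```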

kangarlou (Contributor) commented

Trident doesn't hang if it fails to create or delete a volume. After a failed ZAPI call, Trident moves on to process the next request. Upon such failures, Trident reattempts the operation after one minute. If the cause of the failure is an out-of-band replication relationship, then subsequent reattempts are bound to fail, as Trident has no knowledge of the mirroring relationship. These failures shouldn't prevent provisioning of new volumes unless the failure results in a panic.

guillebianco commented May 22, 2018

If SnapMirror is configured on a Trident-managed volume and that volume is deleted, Trident initialization will fail, which in turn prevents provisioning of new volumes (since it can't delete the original one). For example:

```
time="2018-05-22T13:55:14Z" level=debug msg="Kubernetes frontend got notified of a PVC." PVC=guille-trident-test-5 PVC_accessModes="[ReadWriteOnce]" PVC_annotations="map[volume.beta.kubernetes.io/storage-provisioner:netapp.io/trident]" PVC_eventType=update PVC_phase=Pending PVC_size=1Gi PVC_storageClass=standard-nas PVC_uid=c0b34d03-5dc7-11e8-8865-0050569e3732 PVC_volume=
time="2018-05-22T13:55:14Z" level=warning msg="Kubernetes frontend couldn't provision a volume: Trident initialization failed; unable to clean up deleted volume bi-as-bidatalab-dev-jenkins-home-5e88f: error destroying volume ingsafascl03_cs01_bi_as_bidatalab_dev_jenkins_home_5e88f: API status: failed, Reason: Volume \"ingsafascl03_cs01_bi_as_bidatalab_dev_jenkins_home_5e88f\" in Vserver \"ingsafascl03-cs01\" is the source endpoint of one or more SnapMirror relationships. Before you delete the volume, you must release the source information of the SnapMirror relationships using \"snapmirror release\". To display the destinations to be used in the \"snapmirror release\" commands, use the \"snapmirror list-destinations -source-vserver ingsafascl03-cs01 -source-volume ingsafascl03_cs01_bi_as_bidatalab_dev_jenkins_home_5e88f\" command., Code: 18436 (will retry upon resync)" volume=trident-guille-trident-test-5-c0b34
```

kangarlou (Contributor) commented

Thanks, this makes sense. If Trident has bootstrapped successfully, a failure to delete a volume shouldn't have any impact on Trident. However, once Trident is restarted, the failed deletion prevents it from bootstrapping successfully. The source of the problem is that the VolumeTransaction object isn't deleted after a failed operation. We'll fix the problem.
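To illustrate the direction of the fix, here is a minimal sketch in Go of what "cleaning up the VolumeTransaction after a failed operation" could look like, so a later bootstrap doesn't trip over a stale transaction. Every type and function here is a hypothetical stand-in, not Trident's real implementation.

```go
package main

import (
	"errors"
	"fmt"
)

// VolumeTransaction is a hypothetical record of an in-flight operation.
type VolumeTransaction struct {
	Op     string
	Volume string
}

// transactionStore stands in for a persistent store of pending transactions.
var transactionStore = map[string]VolumeTransaction{}

func addTransaction(t VolumeTransaction)    { transactionStore[t.Volume] = t }
func deleteTransaction(t VolumeTransaction) { delete(transactionStore, t.Volume) }

// destroyVolume stands in for the backend call that fails while the volume is
// the source of a SnapMirror relationship.
func destroyVolume(name string) error {
	return errors.New("volume is the source endpoint of a SnapMirror relationship")
}

// deleteVolume records the transaction, attempts the delete, and clears the
// transaction whether or not the delete succeeds, so startup is not blocked by
// a leftover transaction.
func deleteVolume(name string) error {
	txn := VolumeTransaction{Op: "delete", Volume: name}
	addTransaction(txn)
	defer deleteTransaction(txn) // cleanup happens on failure as well as success
	return destroyVolume(name)
}

func main() {
	if err := deleteVolume("trident-example-volume"); err != nil {
		fmt.Println("delete failed (will be retried later):", err)
	}
	fmt.Println("transactions left behind:", len(transactionStore))
}
```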
