This repository has been archived by the owner on Jan 11, 2023. It is now read-only.

unable to use azure disk in StatefulSet since /dev/sd* changed after detach/attach disk #1918

Closed
sblyk opened this issue Dec 12, 2017 · 39 comments · Fixed by #2410

Comments

@sblyk

sblyk commented Dec 12, 2017

Is this a request for help?:
yes

Is this an ISSUE or FEATURE REQUEST? (choose one):
ISSUE

What version of acs-engine?:
v0.9.1

Orchestrator and version (e.g. Kubernetes, DC/OS, Swarm)
Kubernetes v1.7.9

What happened:
We created a StorageClass:

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: uat
provisioner: kubernetes.io/azure-disk
parameters:
  skuName: Standard_LRS
  location: chinaeast
  storageAccount: X

Then we created a StatefulSet that uses it:

volumeMounts:
        - name: admin-persistent-storage
          mountPath: /u01
  volumeClaimTemplates:
  - metadata:
      name: admin-persistent-storage
    spec:
      storageClassName: uat
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 50Gi
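
For reference, the surrounding StatefulSet is roughly like the following sketch (the image, namespace, and replica count here are placeholders, not our real values):

kubectl apply -f - <<'EOF'
apiVersion: apps/v1beta1        # StatefulSet API group in Kubernetes v1.7
kind: StatefulSet
metadata:
  name: admin
  namespace: uat                # placeholder namespace
spec:
  serviceName: admin
  replicas: 1
  template:
    metadata:
      labels:
        app: admin
    spec:
      containers:
      - name: admin
        image: nginx            # placeholder image
        volumeMounts:
        - name: admin-persistent-storage
          mountPath: /u01
  volumeClaimTemplates:
  - metadata:
      name: admin-persistent-storage
    spec:
      storageClassName: uat
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 50Gi
EOF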

At first we could use this disk normally, but after a period of time we found that the disk's state changed to read-only. After deleting the pod, the newly created pod could use the same disk, but after a while the disk became read-only again.
There are error messages in the pod; /u01 is our PV mount point:

[root@admin-0 /]# cd u01
[root@admin-0 u01]# ls
ls: reading directory .: Input/output error
dmesg|grep error
[852311.079420] EXT4-fs error (device sdf): ext4_find_entry:1465: inode #1310721: comm java: reading directory lblock 0
[852311.283939] EXT4-fs error (device sdf): ext4_wait_block_bitmap:503: comm java: Cannot read block bitmap - block_group = 5, block_bitmap = 1030
[852311.292376] EXT4-fs error (device sdf): ext4_discard_preallocations:4058: comm java: Error -5 reading block bitmap for 5
[852311.301009] EXT4-fs error (device sdf): ext4_wait_block_bitmap:503: comm java: Cannot read block bitmap - block_group = 5, block_bitmap = 1030

What you expected to happen:
Dynamically provisioned disks can be used normally.

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know:

@sblyk sblyk changed the title Unable to access pv on AzureChinaCloud Unable to usepv on AzureChinaCloud Dec 12, 2017
@sblyk sblyk changed the title Unable to usepv on AzureChinaCloud Unable to use pv on AzureChinaCloud Dec 12, 2017
@andyzhangx
Contributor

According to the logs you provided, ls also fails.
Could you follow the standard steps below to check:
https://github.com/andyzhangx/Demo/tree/master/linux/azuredisk

If the azure disk state really changed from read-write to read-only in your pod, could you log on to the agent node and check inside the VM? Thanks.
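
For example, something along these lines on the node would show it (just a sketch; adjust the grep patterns to your paths):

# an entry showing (ro,...) means the filesystem was remounted read-only
mount | grep azure-disk
# look for I/O or ext4 errors against the suspect device
dmesg | grep -iE "ext4|I/O error" | tail -n 20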

@sblyk
Author

sblyk commented Dec 17, 2017

@andyzhangx Thanks for your help. We used the standard examples and they work well, but I don't know why only our disks are unreadable.
We have 3 master nodes and 6 agent nodes. We need to create several StatefulSets:
statefulset A,B,C,D,E in namespace dev use storageAccount dev
statefulset A,B,C,D,E in namespace sit use storageAccount sit
statefulset A,B,C,D,E in namespace uat use storageAccount uat
We find that some disks stop working when we create a new StatefulSet; for example, when we create uat-A, the disk of dev-B stops working.
Some error messages on the agent node:
df -aTh

/dev/sdd       -              -     -     -    - /var/lib/kubelet/plugins/kubernetes.io/azure-disk/mounts/b3068172079
/dev/sdd       -              -     -     -    - /var/lib/kubelet/plugins/kubernetes.io/azure-disk/mounts/b3068172079
/dev/sdd       -              -     -     -    - /var/lib/kubelet/pods/1c5fdb37-e010-11e7-ba11-0017fa009264/volumes/kubernetes.io~azure-disk/pvc-1c57b4d2-e010-11e7-ba11-0017fa009264
/dev/sdd       -              -     -     -    - /var/lib/kubelet/pods/1c5fdb37-e010-11e7-ba11-0017fa009264/volumes/kubernetes.io~azure-disk/pvc-1c57b4d2-e010-11e7-ba11-0017fa009264

cd /var/lib/kubelet/pods/1c5fdb37-e010-11e7-ba11-0017fa009264/volumes/kubernetes.io~azure-disk/pvc-1c57b4d2-e010-11e7-ba11-0017fa009264
ls
ls: reading directory '.': Input/output error

dmesg |grep sdd

[16510.548361] EXT4-fs error (device sdd): ext4_find_entry:1465: inode #1310773: comm java: reading directory lblock 0
[16510.577109] EXT4-fs error (device sdd): ext4_find_entry:1465: inode #1310721: comm java: reading directory lblock 0
[16510.584177] EXT4-fs error (device sdd): ext4_find_entry:1465: inode #1310721: comm java: reading directory lblock 0
[17083.426883] EXT4-fs warning (device sdd): htree_dirblock_to_tree:962: inode #2: lblock 0: comm ls: error -5 reading directory block
[17949.982790] EXT4-fs error (device sdd): ext4_wait_block_bitmap:503: comm java: Cannot read block bitmap - block_group = 4, block_bitmap = 1029
[17949.991638] EXT4-fs error (device sdd): ext4_discard_preallocations:4058: comm java: Error -5 reading block bitmap for 4
[74172.617994] EXT4-fs warning (device sdd): htree_dirblock_to_tree:962: inode #2: lblock 0: comm updatedb.mlocat: error -5 reading directory block
[74172.680767] EXT4-fs warning (device sdd): htree_dirblock_to_tree:962: inode #2: lblock 0: comm updatedb.mlocat: error -5 reading directory block
[109351.904041] EXT4-fs (sdd): error count since last fsck: 11
[109351.904043] EXT4-fs (sdd): initial error at time 1513175048: ext4_find_entry:1465: inode 1310721
[109351.904044] EXT4-fs (sdd): last error at time 1513176494: ext4_discard_preallocations:4058: inode 1310721
[201626.588043] EXT4-fs (sdd): error count since last fsck: 11
[201626.588045] EXT4-fs (sdd): initial error at time 1513175048: ext4_find_entry:1465
[201626.588077] EXT4-fs (sdd): last error at time 1513176494: ext4_discard_preallocations:4058: inode 1310721
[293901.276053] EXT4-fs (sdd): error count since last fsck: 11
[293901.276073] EXT4-fs (sdd): initial error at time 1513175048: ext4_find_entry:1465
[293901.276087] EXT4-fs (sdd): last error at time 1513176494: ext4_discard_preallocations:4058: inode 1310721
[331006.988219] EXT4-fs warning (device sdd): htree_dirblock_to_tree:962: inode #2: lblock 0: comm ls: error -5 reading directory block
[331159.447686] EXT4-fs error (device sdd): ext4_find_entry:1465: inode #2: comm bash: reading directory lblock 0

We also find that after we delete the pod and it is recreated,
the disk becomes usable again.

@andyzhangx
Contributor

@sblyk could you print out kubectl get pv to make sure different StatefulSets use different PVs? It sounds like two StatefulSets are using one PV. Could you also mark which PV is in trouble (read-only)?
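
Something like the following should show the PV-to-claim mapping and which node each pod lands on (namespaces here are just the ones you mentioned):

kubectl get pv
kubectl get pvc --all-namespaces
kubectl get pods -o wide -n dev
kubectl get pods -o wide -n uat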

@andyzhangx
Contributor

@sblyk also, if you find a disk is read-only, you could check in the azure portal whether it is attached to the agent VM; the disk name is in a format like pvc-1c57b4d2-e010-11e7-ba11-0017fa009264.
And please remember azure disk is RWO, which means one azure disk can only be attached to one agent.
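
If you prefer the CLI over the portal, roughly this would list the data disks attached to an agent VM (resource group and VM name are placeholders):

az vm show -g <resource-group> -n <agent-vm-name> \
  --query "storageProfile.dataDisks[].{lun:lun, name:name, caching:caching}" -o table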

@sblyk
Author

sblyk commented Dec 18, 2017

@andyzhangx yes, different StatefulSets use different PVs. These StatefulSets are in different namespaces.
Here are some logs:
kubectl get pv

pvc-124ea6fd-e013-11e7-ba11-0017fa009264   50Gi       RWO           Delete          Bound     sit/admin-persistent-storage-admin-0                         sit                      4d
pvc-1c57b4d2-e010-11e7-ba11-0017fa009264   50Gi       RWO           Delete          Bound     dev/admin-persistent-storage-admin-0                         dev                      4d
pvc-22260c6a-de95-11e7-ba11-0017fa009264   50Gi       RWO           Delete          Bound     staging/admin-persistent-storage-admin-0                     staging                  6d
pvc-a8f4094f-de87-11e7-ba11-0017fa009264   50Gi       RWO           Delete          Bound     uat/admin-persistent-storage-admin-0                         uat                      6d

@andyzhangx
Contributor

@sblyk when you found disk 1c5fdb37-e010-11e7-ba11-0017fa009264 is unreachable, could you log on to the portal, find the VM where your pod is located, and check whether that disk is in the VM disk list? Also check that disk's status in the storage account. I could not repro in my env.

@sblyk
Author

sblyk commented Dec 18, 2017

@andyzhangx
the disk is in the VM disk list:
[screenshot]
but it is unreachable

root@k8s-agentpool-21518299-2:/home/azureuser# cd /var/lib/kubelet/pods/1c5fdb37-e010-11e7-ba11-0017fa009264/volumes/kubernetes.io~azure-disk/pvc-1c57b4d2-e010-11e7-ba11-0017fa009264
root@k8s-agentpool-21518299-2:/var/lib/kubelet/pods/1c5fdb37-e010-11e7-ba11-0017fa009264/volumes/kubernetes.io~azure-disk/pvc-1c57b4d2-e010-11e7-ba11-0017fa009264# ls
ls: reading directory '.': Input/output error

@andyzhangx
Contributor

andyzhangx commented Dec 18, 2017

@sblyk thanks for the info. We are close to getting the root cause. Please keep this env for debugging, thanks!
The symlink is there, but that does not mean the disk is attached to your agent VM.
Would you provide me info according to the instructions below? Thanks.
https://github.com/andyzhangx/Demo/blob/master/linux/azuredisk/azuredisk-attachment-debugging.md
My working mail is xiazhang@microsoft.com; you can email me directly if that is more convenient for you. I will support this case as my P0 priority.

@sblyk
Author

sblyk commented Dec 19, 2017

@andyzhangx
Thanks. Here is the information:
System: Ubuntu 16.04.3 LTS (GNU/Linux 4.11.0-1016-azure x86_64)
VM disk list:
[screenshot]
df -aTh

/dev/sdd       -               -     -     -    - /var/lib/kubelet/plugins/kubernetes.io/azure-disk/mounts/b3068172079
/dev/sdd       -               -     -     -    - /var/lib/kubelet/plugins/kubernetes.io/azure-disk/mounts/b3068172079
/dev/sdd       -               -     -     -    - /var/lib/kubelet/pods/1c5fdb37-e010-11e7-ba11-0017fa009264/volumes/kubernetes.io~azure-disk/pvc-1c57b4d2-e010-11e7-ba11-0017fa009264
/dev/sdd       -               -     -     -    - /var/lib/kubelet/pods/1c5fdb37-e010-11e7-ba11-0017fa009264/volumes/kubernetes.io~azure-disk/pvc-1c57b4d2-e010-11e7-ba11-0017fa009264
shm            -               -     -     -    - /var/lib/docker/containers/2ae510aa1f8b51013f363459e3c6ad70443488aef19dbc5afce82e8c5492010a/shm
nsfs           -               -     -     -    - /run/docker/netns/d61c4523c7d4
/dev/sdf       -               -     -     -    - /var/lib/kubelet/plugins/kubernetes.io/azure-disk/mounts/b91411834
/dev/sdf       -               -     -     -    - /var/lib/kubelet/plugins/kubernetes.io/azure-disk/mounts/b91411834
/dev/sdf       -               -     -     -    - /var/lib/kubelet/pods/26938711-e010-11e7-ba11-0017fa009264/volumes/kubernetes.io~azure-disk/pvc-26886203-e010-11e7-ba11-0017fa009264
/dev/sdf       -               -     -     -    - /var/lib/kubelet/pods/26938711-e010-11e7-ba11-0017fa009264/volumes/kubernetes.io~azure-disk/pvc-26886203-e010-11e7-ba11-0017fa009264
shm            -               -     -     -    - /var/lib/docker/containers/c92dbcb02c9b1d77e02240bf5a3e1d4df6a9402190de07518a3148bf8e784e7c/shm
nsfs           -               -     -     -    - /run/docker/netns/2a6cdf638fb3
/dev/sde       -               -     -     -    - /var/lib/kubelet/plugins/kubernetes.io/azure-disk/mounts/b2031366359
/dev/sde       -               -     -     -    - /var/lib/kubelet/plugins/kubernetes.io/azure-disk/mounts/b2031366359
/dev/sde       -               -     -     -    - /var/lib/kubelet/pods/20405568-e010-11e7-ba11-0017fa009264/volumes/kubernetes.io~azure-disk/pvc-2030085a-e010-11e7-ba11-0017fa009264
/dev/sde       -               -     -     -    - /var/lib/kubelet/pods/20405568-e010-11e7-ba11-0017fa009264/volumes/kubernetes.io~azure-disk/pvc-2030085a-e010-11e7-ba11-0017fa009264
shm            -               -     -     -    - /var/lib/docker/containers/924a73d737a95959af1c201f4234d465361c64b5831ddb5bcae8beefe8ee3705/shm
nsfs           -               -     -     -    - /run/docker/netns/354545197f0f
/dev/sdh       -               -     -     -    - /var/lib/kubelet/plugins/kubernetes.io/azure-disk/mounts/b247037408
/dev/sdh       -               -     -     -    - /var/lib/kubelet/plugins/kubernetes.io/azure-disk/mounts/b247037408
/dev/sdh       -               -     -     -    - /var/lib/kubelet/pods/2896ec12-e010-11e7-ba11-0017fa009264/volumes/kubernetes.io~azure-disk/pvc-288e6fa8-e010-11e7-ba11-0017fa009264
/dev/sdh       -               -     -     -    - /var/lib/kubelet/pods/2896ec12-e010-11e7-ba11-0017fa009264/volumes/kubernetes.io~azure-disk/pvc-288e6fa8-e010-11e7-ba11-0017fa009264
shm            -               -     -     -    - /var/lib/docker/containers/0e49bafa119ff52d62415b62f97753757dc6268cfb6d135678f752482b0df645/shm
nsfs           -               -     -     -    - /run/docker/netns/f9e7d3f36780
tmpfs          -               -     -     -    - /var/lib/kubelet/pods/7280fe71-e018-11e7-ba11-0017fa009264/volumes/kubernetes.io~secret/default-token-cnxwr
tmpfs          -               -     -     -    - /var/lib/kubelet/pods/7280fe71-e018-11e7-ba11-0017fa009264/volumes/kubernetes.io~secret/default-token-cnxwr
/dev/sdg       -               -     -     -    - /var/lib/kubelet/plugins/kubernetes.io/azure-disk/mounts/b2002896536
/dev/sdg       -               -     -     -    - /var/lib/kubelet/plugins/kubernetes.io/azure-disk/mounts/b2002896536
/dev/sdg       -               -     -     -    - /var/lib/kubelet/pods/7280fe71-e018-11e7-ba11-0017fa009264/volumes/kubernetes.io~azure-disk/pvc-b369fe92-de87-11e7-ba11-0017fa009264
/dev/sdg       -               -     -     -    - /var/lib/kubelet/pods/7280fe71-e018-11e7-ba11-0017fa009264/volumes/kubernetes.io~azure-disk/pvc-b369fe92-de87-11e7-ba11-0017fa009264

We can only use sdg; this is the error message when we use the other directories:
ls: reading directory '.': Input/output error

# sudo ls -lt /dev/disk/azure/
total 0
drwxr-xr-x 2 root root 140 Dec 13 15:16 scsi1
lrwxrwxrwx 1 root root  10 Dec 13 09:49 root-part1 -> ../../sda1
lrwxrwxrwx 1 root root  10 Dec 13 09:49 resource-part1 -> ../../sdb1
lrwxrwxrwx 1 root root   9 Dec 13 09:49 resource -> ../../sdb
lrwxrwxrwx 1 root root   9 Dec 13 09:49 root -> ../../sda
# ls -lt /sys/bus/scsi/devices
total 0
lrwxrwxrwx 1 root root 0 Dec 19 01:40 host2 -> ../../../devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A03:00/device:07/VMBUS:01/f8b3781a-1e82-4818-a1c3-63d806ec15bb/host2
lrwxrwxrwx 1 root root 0 Dec 19 01:40 3:0:0:0 -> ../../../devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A03:00/device:07/VMBUS:01/f8b3781b-1e82-4818-a1c3-63d806ec15bb/host3/target3:0:0/3:0:0:0
lrwxrwxrwx 1 root root 0 Dec 19 01:40 1:0:1:0 -> ../../../devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A03:00/device:07/VMBUS:01/00000000-0001-8899-0000-000000000000/host1/target1:0:1/1:0:1:0
lrwxrwxrwx 1 root root 0 Dec 19 01:40 host4 -> ../../../devices/pci0000:00/0000:00:07.1/ata1/host4
lrwxrwxrwx 1 root root 0 Dec 19 01:40 target0:0:0 -> ../../../devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A03:00/device:07/VMBUS:01/00000000-0000-8899-0000-000000000000/host0/target0:0:0
lrwxrwxrwx 1 root root 0 Dec 19 01:40 0:0:0:0 -> ../../../devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A03:00/device:07/VMBUS:01/00000000-0000-8899-0000-000000000000/host0/target0:0:0/0:0:0:0
lrwxrwxrwx 1 root root 0 Dec 19 01:40 3:0:0:2 -> ../../../devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A03:00/device:07/VMBUS:01/f8b3781b-1e82-4818-a1c3-63d806ec15bb/host3/target3:0:0/3:0:0:2
lrwxrwxrwx 1 root root 0 Dec 19 01:40 3:0:0:4 -> ../../../devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A03:00/device:07/VMBUS:01/f8b3781b-1e82-4818-a1c3-63d806ec15bb/host3/target3:0:0/3:0:0:4
lrwxrwxrwx 1 root root 0 Dec 19 01:40 target3:0:0 -> ../../../devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A03:00/device:07/VMBUS:01/f8b3781b-1e82-4818-a1c3-63d806ec15bb/host3/target3:0:0
lrwxrwxrwx 1 root root 0 Dec 19 01:40 target1:0:1 -> ../../../devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A03:00/device:07/VMBUS:01/00000000-0001-8899-0000-000000000000/host1/target1:0:1
lrwxrwxrwx 1 root root 0 Dec 19 01:40 host1 -> ../../../devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A03:00/device:07/VMBUS:01/00000000-0001-8899-0000-000000000000/host1
lrwxrwxrwx 1 root root 0 Dec 19 01:40 host3 -> ../../../devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A03:00/device:07/VMBUS:01/f8b3781b-1e82-4818-a1c3-63d806ec15bb/host3
lrwxrwxrwx 1 root root 0 Dec 19 01:40 host5 -> ../../../devices/pci0000:00/0000:00:07.1/ata2/host5
lrwxrwxrwx 1 root root 0 Dec 19 01:40 3:0:0:3 -> ../../../devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A03:00/device:07/VMBUS:01/f8b3781b-1e82-4818-a1c3-63d806ec15bb/host3/target3:0:0/3:0:0:3
lrwxrwxrwx 1 root root 0 Dec 19 01:40 3:0:0:5 -> ../../../devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A03:00/device:07/VMBUS:01/f8b3781b-1e82-4818-a1c3-63d806ec15bb/host3/target3:0:0/3:0:0:5
lrwxrwxrwx 1 root root 0 Dec 19 01:40 host0 -> ../../../devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A03:00/device:07/VMBUS:01/00000000-0000-8899-0000-000000000000/host0
# cat /sys/bus/scsi/devices/0\:0\:0\:0/vendor
Msft
# cat /sys/bus/scsi/devices/0\:0\:0\:0/model
Virtual Disk
# ls -lt /sys/bus/scsi/devices/0\:0\:0\:0/block/
total 0
drwxr-xr-x 9 root root 0 Dec 19 01:58 sda
# cat /sys/bus/scsi/devices/1\:0\:1\:0/vendor
Msft
# cat /sys/bus/scsi/devices/1\:0\:1\:0/model
Virtual Disk
# ls -lt /sys/bus/scsi/devices/1\:0\:1\:0/block/
total 0
drwxr-xr-x 9 root root 0 Dec 19 01:58 sdb
# cat /sys/bus/scsi/devices/3\:0\:0\:0/vendor
Msft
# cat /sys/bus/scsi/devices/3\:0\:0\:0/model
Virtual Disk
# ls -lt /sys/bus/scsi/devices/3\:0\:0\:0/block/
total 0
drwxr-xr-x 8 root root 0 Dec 19 01:57 sdg
# ls -lt /sys/bus/scsi/devices/3\:0\:0\:2/block/
total 0
drwxr-xr-x 8 root root 0 Dec 19 01:58 sdi
# ls -lt /sys/bus/scsi/devices/3\:0\:0\:3/block/
total 0
drwxr-xr-x 8 root root 0 Dec 19 01:58 sdk
# ls -lt /sys/bus/scsi/devices/3\:0\:0\:4/block/
total 0
drwxr-xr-x 8 root root 0 Dec 19 01:58 sdm
# ls -lt /sys/bus/scsi/devices/3\:0\:0\:5/block/
total 0
drwxr-xr-x 8 root root 0 Dec 19 01:58 sdc

@andyzhangx
Contributor

andyzhangx commented Dec 19, 2017

@sblyk could you also run "fdisk -l" on your agent VM? Thanks.
And what's your agent VM size?

@sblyk
Author

sblyk commented Dec 19, 2017

@andyzhangx

# fdisk -l
Disk /dev/sdb: 200 GiB, 214748364800 bytes, 419430400 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: dos
Disk identifier: 0x6d7c2eac

Device     Boot Start       End   Sectors  Size Id Type
/dev/sdb1        2048 419428351 419426304  200G  7 HPFS/NTFS/exFAT


Disk /dev/sda: 30 GiB, 32212254720 bytes, 62914560 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x3276fc6f

Device     Boot Start      End  Sectors Size Id Type
/dev/sda1  *     2048 62914526 62912479  30G 83 Linux


Disk /dev/sdi: 50 GiB, 53687091200 bytes, 104857600 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes


Disk /dev/sdk: 50 GiB, 53687091200 bytes, 104857600 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes


Disk /dev/sdm: 50 GiB, 53687091200 bytes, 104857600 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes


Disk /dev/sdc: 50 GiB, 53687091200 bytes, 104857600 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes


Disk /dev/sdg: 50 GiB, 53687091200 bytes, 104857600 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

@andyzhangx
Contributor

Could you format the disk /dev/sdk in the VM? You could follow the example here:
https://ubuntuforums.org/showthread.php?t=267869
Just to make sure whether those disks are really unreachable. Thanks.

@andyzhangx
Contributor

# create a partition on the disk (interactive), then format it
# NOTE: mkfs wipes any existing data on that partition
sudo fdisk /dev/sdk
sudo mkfs.ext4 /dev/sdk1

and then try to use /dev/sdk1 with the following commands:

sudo mkdir /media/sdk1
sudo mount /dev/sdk1 /media/sdk1

@andyzhangx
Contributor

Also, could you collect the kubelet logs on the agent VM? Thanks.

id=`docker ps -a | grep "hyperkube kubelet" | awk -F ' ' '{print $1}'`
docker logs $id > $id.log 2>&1
vi $id.log
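
Once you have the log, grepping for the volume plugin messages is usually enough to spot mount failures (just a suggestion; adjust the patterns as needed):

grep -iE "azure|MountVolume|WaitForAttach" $id.log | tail -n 100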

@sblyk
Author

sblyk commented Dec 19, 2017

@andyzhangx the disk sdk was mounted successfully

~$ cd sdk1
~/sdk1$ ls
e682a2d77826.log  test.txt

kubelet logs on agent VM:
e682a2d77826.zip

@andyzhangx
Contributor

@sblyk one finding:
all the unreachable disks are from storage account pvc300170136002:

https://pvc300170136002.blob.core.chinacloudapi.cn/300170136/k8s-cn-devtest-dynamic-pvc-1c57b4d2-e010-11e7-ba11-0017fa009264.vhd

The usable disk (sdg) is from storage account appdatacnuat:

https://appdatacnuat.blob.core.chinacloudapi.cn/vhds/k8s-cn-devtest-dynamic-pvc-b369fe92-de87-11e7-ba11-0017fa009264.vhd

@andyzhangx
Contributor

One possibility is that storage account pvc300170136002 contains too many azure disks, so there is IOPS throttling. Could you check how many azure disks in total are in pvc300170136002? The reason you are using pvc300170136002 at all is that you did not specify a storage account in your azure disk storage class.
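
To count them, something like this should work against the blob container that holds the dynamically provisioned VHDs (container name taken from the blob URL above; the account key is a placeholder, and for AzureChinaCloud you would run az cloud set --name AzureChinaCloud first):

az storage blob list --account-name pvc300170136002 --container-name 300170136 \
  --account-key <key> --query "length([?ends_with(name, '.vhd')])" -o tsv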

@sblyk
Author

sblyk commented Dec 19, 2017

@andyzhangx at first we created some StatefulSets like this:
statefulset A,B,C,D,E in namespace dev use storageAccount dev
statefulset A,B,C,D,E in namespace sit use storageAccount sit
statefulset A,B,C,D,E in namespace uat use storageAccount uat
One storageAccount has 5 disks,
but the problem occurred.
Then, as a test, we changed the StorageClass of dev from

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: dev
provisioner: kubernetes.io/azure-disk
parameters:
  skuName: Standard_LRS
  location: chinaeast
  storageAccount: appdatacndev

to

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: dev
provisioner: kubernetes.io/azure-disk
parameters:
  storageaccounttype: Standard_LRS
  kind: Shared

@andyzhangx
Contributor

@sblyk your df -aTh output shows the following disks:
/dev/sdd, /dev/sde, /dev/sdf, /dev/sdg, /dev/sdh
while /sys/bus/scsi/devices shows different disks:
sdg, sdi, sdk, sdm, sdc
Could you make sure it's the same VM? If so, that's the issue here.
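
A quick way to see the mismatch on the node is to compare the stale mounts with the current LUN-to-device mapping (just a sketch):

# current LUN -> device symlinks maintained by the Azure udev rules
ls -l /dev/disk/azure/scsi1/
# block devices the kernel sees right now
lsblk -o NAME,SIZE,MOUNTPOINT
# mounts still referencing the old device names
grep azure-disk /proc/mounts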

@sblyk
Author

sblyk commented Dec 19, 2017

@andyzhangx yes I'm sure it's the same VM

@andyzhangx
Contributor

@sblyk could you run the following commands to rescan the SCSI hosts (you need to run them as root):

echo "- - -" > /sys/class/scsi_host/host0/scan
echo "- - -" > /sys/class/scsi_host/host1/scan
echo "- - -" > /sys/class/scsi_host/host2/scan
echo "- - -" > /sys/class/scsi_host/host3/scan
echo "- - -" > /sys/class/scsi_host/host4/scan
echo "- - -" > /sys/class/scsi_host/host5/scan

And then check the following device:

ls -lt /sys/bus/scsi/devices/3\:0\:0\:4/block/

@sblyk
Author

sblyk commented Dec 19, 2017

@andyzhangx this is the output

# echo "- - -" > /sys/class/scsi_host/host0/scan
# echo "- - -" > /sys/class/scsi_host/host1/scan
# echo "- - -" > /sys/class/scsi_host/host2/scan
# echo "- - -" > /sys/class/scsi_host/host3/scan
# echo "- - -" > /sys/class/scsi_host/host4/scan
# echo "- - -" > /sys/class/scsi_host/host5/scan
# ls -lt /sys/bus/scsi/devices/3\:0\:0\:4/block/
total 0
drwxr-xr-x 8 root root 0 Dec 19 01:58 sdm

@andyzhangx
Contributor

@sblyk one workaround is to reboot the agent that has this issue; I just don't know why your disk /dev/sdd would change to another dev name. Did you detach/attach disks manually?

@sblyk
Author

sblyk commented Dec 19, 2017

@andyzhangx not only this agent; we have 6 agents, and all have this issue.
The disks in storageAccount pvc300170136002 were created for testing, so we did not detach/attach disks manually.

@andyzhangx
Contributor

@khenidak have you seen a case before where /dev/sd* changed to another dev name and k8s just did not detect it? In this case, /dev/sdd, /dev/sde, /dev/sdf, /dev/sdg, /dev/sdh changed to sdg, sdi, sdk, sdm, sdc.

@andyzhangx
Contributor

@sblyk could you reboot the agent to check whether the issue still exists? Another question: I found that LUN 1 is not used on this agent VM; is it the same on the other VMs? Thanks.

@sblyk
Author

sblyk commented Dec 19, 2017

@andyzhangx
I've restarted the VM, and this method is effective,
but this problem seems to happen when creating a new StatefulSet (mounting the disk) and deleting a pod (unmounting and mounting it on another VM).
The same happens on another agent VM:

[screenshot]

@andyzhangx
Contributor

@sblyk thanks for the check. Could you share the complete StatefulSet config? I would like to do a deeper investigation tomorrow. And please note that an azure disk can only be attached to one VM, so there can be problems when migrating a pod from one VM to another, since disk detach and attach take a few minutes. I would suggest you use azure file instead; it can be mounted on multiple VMs. You can find an azure file example here:
https://github.com/andyzhangx/Demo/tree/master/linux/azurefile

BTW, sudo systemctl restart kubelet (no reboot needed) would also resolve your issue as a workaround.

@andyzhangx
Contributor

@sblyk I have identified that this is a bug in azure disk: it uses /dev/sd*, and /dev/sd* can change after detaching/attaching azure disks. Could you change this issue's title to:
unable to use azure disk in StatefulSet since /dev/sd* changed after detach/attach disk

@andyzhangx
Contributor

@sblyk again, thanks for reporting this; I am now working on a fix in k8s upstream.

@sblyk sblyk changed the title Unable to use pv on AzureChinaCloud unable to use azure disk in StatefulSet since /dev/sd* changed after detach/attach disk Dec 21, 2017
k8s-github-robot pushed a commit to kubernetes/kubernetes that referenced this issue Jan 4, 2018
Automatic merge from submit-queue (batch tested with PRs 56382, 57549). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md

fix azure disk not available issue when device name changed

**What this PR does / why we need it**:
There is a possibility that the device name (`/dev/sd*`) changes when attaching/detaching a data disk on an Azure VM, see [Troubleshoot Linux VM device name change](https://docs.microsoft.com/en-us/azure/virtual-machines/linux/troubleshoot-device-names-problems).
We did hit this issue, see the customer [case](Azure/acs-engine#1918).
This PR uses `/dev/disk/by-id` instead of `/dev/sd*` for azure disk; `/dev/disk/by-id` does not change even when the device name changes.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #57444

**Special notes for your reviewer**:
In a customer [case](Azure/acs-engine#1918), the customer is unable to use an azure disk in a StatefulSet since /dev/sd* changed after detaching/attaching a disk.
We are using `/dev/sd*` (code is [here](https://github.com/kubernetes/kubernetes/blob/master/pkg/volume/azure_dd/azure_common_linux.go#L140)) to bind-mount the k8s path, while `/dev/sd*` can change when the VM is attaching/detaching data disks, see [Troubleshoot Linux VM device name change](https://docs.microsoft.com/en-us/azure/virtual-machines/linux/troubleshoot-device-names-problems).
I have also checked the related AWS and GCE code; they use `/dev/disk/by-id/` rather than `/dev/sd*`, see [aws code](https://github.com/kubernetes/kubernetes/blob/master/pkg/volume/aws_ebs/aws_util.go#L228) and [gce code](https://github.com/kubernetes/kubernetes/blob/master/pkg/volume/gce_pd/gce_util.go#L278).

**Release note**:

```
fix azure disk not available when device name changed
```
/sig azure
/assign @rootfs 
@karataliu @brendandburns @khenidak
@andyzhangx
Contributor

@sblyk finally I fixed this issue; I wrote a doc describing the details:
https://github.com/andyzhangx/Demo/blob/master/issues/README.md#2-disk-unavailable-after-attachdetach-a-data-disk-on-a-node

Fix or workaround:

  • add cachingmode: None in the azure disk storage class (the default is ReadWrite), e.g.
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: hdd
provisioner: kubernetes.io/azure-disk
parameters:
  skuname: Standard_LRS
  kind: Managed
  cachingmode: None
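
To confirm the setting actually took effect on newly provisioned disks, roughly (PV name, resource group, and VM name are placeholders):

# the provisioner records the caching mode on the PV object
kubectl get pv <pv-name> -o yaml | grep -i cachingmode
# the attached data disk on the VM side should now show caching "None"
az vm show -g <resource-group> -n <agent-vm-name> --query "storageProfile.dataDisks[].caching" -o tsv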

k8s-github-robot pushed a commit to kubernetes/kubernetes that referenced this issue Feb 25, 2018
Automatic merge from submit-queue (batch tested with PRs 60346, 60135, 60289, 59643, 52640). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md

fix device name change issue for azure disk

**What this PR does / why we need it**:
fix the device name change issue for azure disk: the default host cache setting changed from None to ReadWrite in v1.7, while the default host cache setting in the azure portal is `None`

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #60344, #57444
It also fixes the following issues:
Azure/acs-engine#1918
Azure/AKS#201

**Special notes for your reviewer**:
From v1.7, the default host cache setting changed from None to ReadWrite. This can lead to device name changes after attaching multiple disks on an azure VM, and finally to the disk being inaccessible from the pod.
For example:
a StatefulSet with 8 replicas (each with an azure disk) on one node will always fail; according to my observation, adding the 6th data disk always causes a dev name change, and some pods cannot access their data disks after that.

I have verified this fix on v1.8.4.
Without this PR, on one node (dev name changes):
```
azureuser@k8s-agentpool2-40588258-0:~$ tree /dev/disk/azure
...
└── scsi1
    ├── lun0 -> ../../../sdk
    ├── lun1 -> ../../../sdj
    ├── lun2 -> ../../../sde
    ├── lun3 -> ../../../sdf
    ├── lun4 -> ../../../sdg
    ├── lun5 -> ../../../sdh
    └── lun6 -> ../../../sdi
```

With this PR, on one node (no dev name change):
```
azureuser@k8s-agentpool2-40588258-1:~$ tree /dev/disk/azure
...
└── scsi1
    ├── lun0 -> ../../../sdc
    ├── lun1 -> ../../../sdd
    ├── lun2 -> ../../../sde
    ├── lun3 -> ../../../sdf
    ├── lun5 -> ../../../sdh
    └── lun6 -> ../../../sdi
```

In the following, `myvm-0` and `myvm-1` are crashing due to dev name changes; after the controller manager replacement, the myvm2-x pods work well.

```
Every 2.0s: kubectl get po                                                                                                                                                   Sat Feb 24 04:16:26 2018

NAME      READY     STATUS             RESTARTS   AGE
myvm-0    0/1       CrashLoopBackOff   13         41m
myvm-1    0/1       CrashLoopBackOff   11         38m
myvm-2    1/1       Running            0          35m
myvm-3    1/1       Running            0          33m
myvm-4    1/1       Running            0          31m
myvm-5    1/1       Running            0          29m
myvm-6    1/1       Running            0          26m

myvm2-0   1/1       Running            0          17m
myvm2-1   1/1       Running            0          14m
myvm2-2   1/1       Running            0          12m
myvm2-3   1/1       Running            0          10m
myvm2-4   1/1       Running            0          8m
myvm2-5   1/1       Running            0          5m
myvm2-6   1/1       Running            0          3m
```

**Release note**:

```
fix device name change issue for azure disk
```
/assign @karataliu 
/sig azure
@feiskyer could you mark it for the v1.10 milestone?
@brendandburns @khenidak @rootfs @jdumars FYI

Since it's a critical bug, I will cherry-pick this fix to v1.7-v1.9. Note that v1.6 does not have this issue since the default cachingmode is `None`.
@sblyk
Author

sblyk commented Mar 1, 2018

@andyzhangx Thank you! I tested it yesterday:
adding cachingmode: None in the azure disk storage class.
Now it works well.

@andyzhangx
Contributor

@sblyk would you close this issue then?

@fauzan-n

@sblyk finally I fixed this issue; I wrote a doc describing the details:
https://github.com/andyzhangx/Demo/blob/master/issues/README.md#2-disk-unavailable-after-attachdetach-a-data-disk-on-a-node

Fix or workaround:

  • add cachingmode: None in the azure disk storage class (the default is ReadWrite), e.g.
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: hdd
provisioner: kubernetes.io/azure-disk
parameters:
  skuname: Standard_LRS
  kind: Managed
  cachingmode: None

Is this change safe for the disk? I mean, what happens to the data after changing the cachingmode?

@andyzhangx
Contributor

@foosome
NOTE: the Azure platform has fixed the host cache issue; the suggested host cache setting for data disks is now ReadOnly. More details about the azure disk cache setting are in the issue details.

What's your issue here?

@fauzan-n

@foosome
NOTE: the Azure platform has fixed the host cache issue; the suggested host cache setting for data disks is now ReadOnly. More details about the azure disk cache setting are in the issue details.

What's your issue here?

I'm using Alibaba Cloud.

Solved by creating a StorageClass with the cachingmode parameter added.

Thank you @andyzhangx!

@andyzhangx
Contributor

@foosome but cachingmode is for Azure cloud? Does Alibaba Cloud have the same issue?

@fauzan-n

fauzan-n commented Mar 2, 2019

@foosome but cachingmode is for Azure cloud? Does Alibaba Cloud have the same issue?

Yes, with the default storageclass I got the same issue.

The caching mode is defined in the StorageClass manifest, so it's in the Kubernetes layer. I think azure and Alibaba Cloud have this same issue because they use the same default StorageClass in Kubernetes.

