This repository has been archived by the owner on Jan 11, 2023. It is now read-only.

unable to use azure disk in StatefulSet since /dev/sd* changed after detach/attach disk #1918

Closed
sblyk opened this issue Dec 12, 2017 · 39 comments · Fixed by #2410

Comments

@sblyk

sblyk commented Dec 12, 2017

Is this a request for help?:
yes

Is this an ISSUE or FEATURE REQUEST? (choose one):
ISSUE

What version of acs-engine?:
v0.9.1

Orchestrator and version (e.g. Kubernetes, DC/OS, Swarm)
Kubernetes v1.7.9

What happened:
We created a StorageClass:

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: uat
provisioner: kubernetes.io/azure-disk
parameters:
  skuName: Standard_LRS
  location: chinaeast
  storageAccount: X

Then we created a StatefulSet that uses it:

volumeMounts:
        - name: admin-persistent-storage
          mountPath: /u01
  volumeClaimTemplates:
  - metadata:
      name: admin-persistent-storage
    spec:
      storageClassName: uat
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 50Gi
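
For reference, the surrounding StatefulSet is roughly like the following sketch (the image, namespace, and replica count here are placeholders, not our real values):

kubectl apply -f - <<'EOF'
apiVersion: apps/v1beta1        # StatefulSet API group in Kubernetes v1.7
kind: StatefulSet
metadata:
  name: admin
  namespace: uat                # placeholder namespace
spec:
  serviceName: admin
  replicas: 1
  template:
    metadata:
      labels:
        app: admin
    spec:
      containers:
      - name: admin
        image: nginx            # placeholder image
        volumeMounts:
        - name: admin-persistent-storage
          mountPath: /u01
  volumeClaimTemplates:
  - metadata:
      name: admin-persistent-storage
    spec:
      storageClassName: uat
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 50Gi
EOF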

At first we could use this disk normally, but after a period of time we found that the disk's state changed to read-only. After deleting the pod, the newly created pod could use the same disk, but after a while the disk became read-only again.
There are error messages in the pod; /u01 is our PV mount point:

[root@admin-0 /]# cd u01
[root@admin-0 u01]# ls
ls: reading directory .: Input/output error
dmesg|grep error
[852311.079420] EXT4-fs error (device sdf): ext4_find_entry:1465: inode #1310721: comm java: reading directory lblock 0
[852311.283939] EXT4-fs error (device sdf): ext4_wait_block_bitmap:503: comm java: Cannot read block bitmap - block_group = 5, block_bitmap = 1030
[852311.292376] EXT4-fs error (device sdf): ext4_discard_preallocations:4058: comm java: Error -5 reading block bitmap for 5
[852311.301009] EXT4-fs error (device sdf): ext4_wait_block_bitmap:503: comm java: Cannot read block bitmap - block_group = 5, block_bitmap = 1030

What you expected to happen:
Dynamically provisioned disks can be used normally.

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know:

@sblyk sblyk changed the title Unable to access pv on AzureChinaCloud Unable to usepv on AzureChinaCloud Dec 12, 2017
@sblyk sblyk changed the title Unable to usepv on AzureChinaCloud Unable to use pv on AzureChinaCloud Dec 12, 2017
@andyzhangx
Contributor

According to the logs you provided, ls also fails.
Could you follow the standard steps below to check:
https://github.com/andyzhangx/Demo/tree/master/linux/azuredisk

If the azure disk state really changed from read-write to read-only in your pod, could you log on to the agent node and check inside the VM? Thanks.
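
For example, something along these lines on the node would show it (just a sketch; adjust the grep patterns to your paths):

# an entry showing (ro,...) means the filesystem was remounted read-only
mount | grep azure-disk
# look for I/O or ext4 errors against the suspect device
dmesg | grep -iE "ext4|I/O error" | tail -n 20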

@sblyk
Author

sblyk commented Dec 17, 2017

@andyzhangx Thanks for your help. We used the standard examples and they work well, but I don't know why only our disks are unreadable.
We have 3 master nodes and 6 agent nodes. We need to create several StatefulSets:
statefulset A,B,C,D,E in namespace dev use storageAccount dev
statefulset A,B,C,D,E in namespace sit use storageAccount sit
statefulset A,B,C,D,E in namespace uat use storageAccount uat
We find that some disks stop working when we create a new StatefulSet; for example, when we create uat-A, the disk of dev-B stops working.
Some error messages on the agent node:
df -aTh

/dev/sdd       -              -     -     -    - /var/lib/kubelet/plugins/kubernetes.io/azure-disk/mounts/b3068172079
/dev/sdd       -              -     -     -    - /var/lib/kubelet/plugins/kubernetes.io/azure-disk/mounts/b3068172079
/dev/sdd       -              -     -     -    - /var/lib/kubelet/pods/1c5fdb37-e010-11e7-ba11-0017fa009264/volumes/kubernetes.io~azure-disk/pvc-1c57b4d2-e010-11e7-ba11-0017fa009264
/dev/sdd       -              -     -     -    - /var/lib/kubelet/pods/1c5fdb37-e010-11e7-ba11-0017fa009264/volumes/kubernetes.io~azure-disk/pvc-1c57b4d2-e010-11e7-ba11-0017fa009264

cd /var/lib/kubelet/pods/1c5fdb37-e010-11e7-ba11-0017fa009264/volumes/kubernetes.io~azure-disk/pvc-1c57b4d2-e010-11e7-ba11-0017fa009264
ls
ls: reading directory '.': Input/output error

dmesg |grep sdd

[16510.548361] EXT4-fs error (device sdd): ext4_find_entry:1465: inode #1310773: comm java: reading directory lblock 0
[16510.577109] EXT4-fs error (device sdd): ext4_find_entry:1465: inode #1310721: comm java: reading directory lblock 0
[16510.584177] EXT4-fs error (device sdd): ext4_find_entry:1465: inode #1310721: comm java: reading directory lblock 0
[17083.426883] EXT4-fs warning (device sdd): htree_dirblock_to_tree:962: inode #2: lblock 0: comm ls: error -5 reading directory block
[17949.982790] EXT4-fs error (device sdd): ext4_wait_block_bitmap:503: comm java: Cannot read block bitmap - block_group = 4, block_bitmap = 1029
[17949.991638] EXT4-fs error (device sdd): ext4_discard_preallocations:4058: comm java: Error -5 reading block bitmap for 4
[74172.617994] EXT4-fs warning (device sdd): htree_dirblock_to_tree:962: inode #2: lblock 0: comm updatedb.mlocat: error -5 reading directory block
[74172.680767] EXT4-fs warning (device sdd): htree_dirblock_to_tree:962: inode #2: lblock 0: comm updatedb.mlocat: error -5 reading directory block
[109351.904041] EXT4-fs (sdd): error count since last fsck: 11
[109351.904043] EXT4-fs (sdd): initial error at time 1513175048: ext4_find_entry:1465: inode 1310721
[109351.904044] EXT4-fs (sdd): last error at time 1513176494: ext4_discard_preallocations:4058: inode 1310721
[201626.588043] EXT4-fs (sdd): error count since last fsck: 11
[201626.588045] EXT4-fs (sdd): initial error at time 1513175048: ext4_find_entry:1465
[201626.588077] EXT4-fs (sdd): last error at time 1513176494: ext4_discard_preallocations:4058: inode 1310721
[293901.276053] EXT4-fs (sdd): error count since last fsck: 11
[293901.276073] EXT4-fs (sdd): initial error at time 1513175048: ext4_find_entry:1465
[293901.276087] EXT4-fs (sdd): last error at time 1513176494: ext4_discard_preallocations:4058: inode 1310721
[331006.988219] EXT4-fs warning (device sdd): htree_dirblock_to_tree:962: inode #2: lblock 0: comm ls: error -5 reading directory block
[331159.447686] EXT4-fs error (device sdd): ext4_find_entry:1465: inode #2: comm bash: reading directory lblock 0

We also find that after we delete the pod and it is recreated,
the disk becomes usable again.

@andyzhangx
Contributor

@sblyk could you print out kubectl get pv to make sure different StatefulSets use different PVs? It sounds like two StatefulSets are using one PV. Could you also mark which PV is in trouble (read-only)?
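
Something like the following should show the PV-to-claim mapping and which node each pod lands on (namespaces here are just the ones you mentioned):

kubectl get pv
kubectl get pvc --all-namespaces
kubectl get pods -o wide -n dev
kubectl get pods -o wide -n uat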

@andyzhangx
Contributor

@sblyk also, if you find a disk is read-only, you could check in the azure portal whether it is attached to the agent VM; the disk name is in a format like pvc-1c57b4d2-e010-11e7-ba11-0017fa009264.
And please remember azure disk is RWO, which means one azure disk can only be attached to one agent.
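
If you prefer the CLI over the portal, roughly this would list the data disks attached to an agent VM (resource group and VM name are placeholders):

az vm show -g <resource-group> -n <agent-vm-name> \
  --query "storageProfile.dataDisks[].{lun:lun, name:name, caching:caching}" -o table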

@sblyk
Author

sblyk commented Dec 18, 2017

@andyzhangx yes, different StatefulSets use different PVs. These StatefulSets are in different namespaces.
Here are some logs:
kubectl get pv

pvc-124ea6fd-e013-11e7-ba11-0017fa009264   50Gi       RWO           Delete          Bound     sit/admin-persistent-storage-admin-0                         sit                      4d
pvc-1c57b4d2-e010-11e7-ba11-0017fa009264   50Gi       RWO           Delete          Bound     dev/admin-persistent-storage-admin-0                         dev                      4d
pvc-22260c6a-de95-11e7-ba11-0017fa009264   50Gi       RWO           Delete          Bound     staging/admin-persistent-storage-admin-0                     staging                  6d
pvc-a8f4094f-de87-11e7-ba11-0017fa009264   50Gi       RWO           Delete          Bound     uat/admin-persistent-storage-admin-0                         uat                      6d

@andyzhangx
Contributor

@sblyk when you found disk 1c5fdb37-e010-11e7-ba11-0017fa009264 is unreachable, could you log on to the portal, find the VM where your pod is located, and check whether that disk is in the VM disk list? Also check that disk's status in the storage account. I could not repro in my env.

@sblyk
Author

sblyk commented Dec 18, 2017

@andyzhangx
the disk is in the VM disk list:
[screenshot]
but it is unreachable

root@k8s-agentpool-21518299-2:/home/azureuser# cd /var/lib/kubelet/pods/1c5fdb37-e010-11e7-ba11-0017fa009264/volumes/kubernetes.io~azure-disk/pvc-1c57b4d2-e010-11e7-ba11-0017fa009264
root@k8s-agentpool-21518299-2:/var/lib/kubelet/pods/1c5fdb37-e010-11e7-ba11-0017fa009264/volumes/kubernetes.io~azure-disk/pvc-1c57b4d2-e010-11e7-ba11-0017fa009264# ls
ls: reading directory '.': Input/output error

@andyzhangx
Contributor

andyzhangx commented Dec 18, 2017

@sblyk thanks for the info. We are close to getting the root cause. Please keep this env for debugging, thanks!
The symlink is there, but that does not mean the disk is attached to your agent VM.
Would you provide me info according to the instructions below? Thanks.
https://github.com/andyzhangx/Demo/blob/master/linux/azuredisk/azuredisk-attachment-debugging.md
My working mail is xiazhang@microsoft.com; you can email me directly if that is more convenient for you. I will support this case as my P0 priority.

@sblyk
Author

sblyk commented Dec 19, 2017

@andyzhangx
Thanks. Here is the information:
System: Ubuntu 16.04.3 LTS (GNU/Linux 4.11.0-1016-azure x86_64)
VM disk list:
[screenshot]
df -aTh

/dev/sdd       -               -     -     -    - /var/lib/kubelet/plugins/kubernetes.io/azure-disk/mounts/b3068172079
/dev/sdd       -               -     -     -    - /var/lib/kubelet/plugins/kubernetes.io/azure-disk/mounts/b3068172079
/dev/sdd       -               -     -     -    - /var/lib/kubelet/pods/1c5fdb37-e010-11e7-ba11-0017fa009264/volumes/kubernetes.io~azure-disk/pvc-1c57b4d2-e010-11e7-ba11-0017fa009264
/dev/sdd       -               -     -     -    - /var/lib/kubelet/pods/1c5fdb37-e010-11e7-ba11-0017fa009264/volumes/kubernetes.io~azure-disk/pvc-1c57b4d2-e010-11e7-ba11-0017fa009264
shm            -               -     -     -    - /var/lib/docker/containers/2ae510aa1f8b51013f363459e3c6ad70443488aef19dbc5afce82e8c5492010a/shm
nsfs           -               -     -     -    - /run/docker/netns/d61c4523c7d4
/dev/sdf       -               -     -     -    - /var/lib/kubelet/plugins/kubernetes.io/azure-disk/mounts/b91411834
/dev/sdf       -               -     -     -    - /var/lib/kubelet/plugins/kubernetes.io/azure-disk/mounts/b91411834
/dev/sdf       -               -     -     -    - /var/lib/kubelet/pods/26938711-e010-11e7-ba11-0017fa009264/volumes/kubernetes.io~azure-disk/pvc-26886203-e010-11e7-ba11-0017fa009264
/dev/sdf       -               -     -     -    - /var/lib/kubelet/pods/26938711-e010-11e7-ba11-0017fa009264/volumes/kubernetes.io~azure-disk/pvc-26886203-e010-11e7-ba11-0017fa009264
shm            -               -     -     -    - /var/lib/docker/containers/c92dbcb02c9b1d77e02240bf5a3e1d4df6a9402190de07518a3148bf8e784e7c/shm
nsfs           -               -     -     -    - /run/docker/netns/2a6cdf638fb3
/dev/sde       -               -     -     -    - /var/lib/kubelet/plugins/kubernetes.io/azure-disk/mounts/b2031366359
/dev/sde       -               -     -     -    - /var/lib/kubelet/plugins/kubernetes.io/azure-disk/mounts/b2031366359
/dev/sde       -               -     -     -    - /var/lib/kubelet/pods/20405568-e010-11e7-ba11-0017fa009264/volumes/kubernetes.io~azure-disk/pvc-2030085a-e010-11e7-ba11-0017fa009264
/dev/sde       -               -     -     -    - /var/lib/kubelet/pods/20405568-e010-11e7-ba11-0017fa009264/volumes/kubernetes.io~azure-disk/pvc-2030085a-e010-11e7-ba11-0017fa009264
shm            -               -     -     -    - /var/lib/docker/containers/924a73d737a95959af1c201f4234d465361c64b5831ddb5bcae8beefe8ee3705/shm
nsfs           -               -     -     -    - /run/docker/netns/354545197f0f
/dev/sdh       -               -     -     -    - /var/lib/kubelet/plugins/kubernetes.io/azure-disk/mounts/b247037408
/dev/sdh       -               -     -     -    - /var/lib/kubelet/plugins/kubernetes.io/azure-disk/mounts/b247037408
/dev/sdh       -               -     -     -    - /var/lib/kubelet/pods/2896ec12-e010-11e7-ba11-0017fa009264/volumes/kubernetes.io~azure-disk/pvc-288e6fa8-e010-11e7-ba11-0017fa009264
/dev/sdh       -               -     -     -    - /var/lib/kubelet/pods/2896ec12-e010-11e7-ba11-0017fa009264/volumes/kubernetes.io~azure-disk/pvc-288e6fa8-e010-11e7-ba11-0017fa009264
shm            -               -     -     -    - /var/lib/docker/containers/0e49bafa119ff52d62415b62f97753757dc6268cfb6d135678f752482b0df645/shm
nsfs           -               -     -     -    - /run/docker/netns/f9e7d3f36780
tmpfs          -               -     -     -    - /var/lib/kubelet/pods/7280fe71-e018-11e7-ba11-0017fa009264/volumes/kubernetes.io~secret/default-token-cnxwr
tmpfs          -               -     -     -    - /var/lib/kubelet/pods/7280fe71-e018-11e7-ba11-0017fa009264/volumes/kubernetes.io~secret/default-token-cnxwr
/dev/sdg       -               -     -     -    - /var/lib/kubelet/plugins/kubernetes.io/azure-disk/mounts/b2002896536
/dev/sdg       -               -     -     -    - /var/lib/kubelet/plugins/kubernetes.io/azure-disk/mounts/b2002896536
/dev/sdg       -               -     -     -    - /var/lib/kubelet/pods/7280fe71-e018-11e7-ba11-0017fa009264/volumes/kubernetes.io~azure-disk/pvc-b369fe92-de87-11e7-ba11-0017fa009264
/dev/sdg       -               -     -     -    - /var/lib/kubelet/pods/7280fe71-e018-11e7-ba11-0017fa009264/volumes/kubernetes.io~azure-disk/pvc-b369fe92-de87-11e7-ba11-0017fa009264

We can only use sdg; this is the error message when we use the other directories:
ls: reading directory '.': Input/output error

# sudo ls -lt /dev/disk/azure/
total 0
drwxr-xr-x 2 root root 140 Dec 13 15:16 scsi1
lrwxrwxrwx 1 root root  10 Dec 13 09:49 root-part1 -> ../../sda1
lrwxrwxrwx 1 root root  10 Dec 13 09:49 resource-part1 -> ../../sdb1
lrwxrwxrwx 1 root root   9 Dec 13 09:49 resource -> ../../sdb
lrwxrwxrwx 1 root root   9 Dec 13 09:49 root -> ../../sda
# ls -lt /sys/bus/scsi/devices
total 0
lrwxrwxrwx 1 root root 0 Dec 19 01:40 host2 -> ../../../devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A03:00/device:07/VMBUS:01/f8b3781a-1e82-4818-a1c3-63d806ec15bb/host2
lrwxrwxrwx 1 root root 0 Dec 19 01:40 3:0:0:0 -> ../../../devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A03:00/device:07/VMBUS:01/f8b3781b-1e82-4818-a1c3-63d806ec15bb/host3/target3:0:0/3:0:0:0
lrwxrwxrwx 1 root root 0 Dec 19 01:40 1:0:1:0 -> ../../../devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A03:00/device:07/VMBUS:01/00000000-0001-8899-0000-000000000000/host1/target1:0:1/1:0:1:0
lrwxrwxrwx 1 root root 0 Dec 19 01:40 host4 -> ../../../devices/pci0000:00/0000:00:07.1/ata1/host4
lrwxrwxrwx 1 root root 0 Dec 19 01:40 target0:0:0 -> ../../../devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A03:00/device:07/VMBUS:01/00000000-0000-8899-0000-000000000000/host0/target0:0:0
lrwxrwxrwx 1 root root 0 Dec 19 01:40 0:0:0:0 -> ../../../devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A03:00/device:07/VMBUS:01/00000000-0000-8899-0000-000000000000/host0/target0:0:0/0:0:0:0
lrwxrwxrwx 1 root root 0 Dec 19 01:40 3:0:0:2 -> ../../../devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A03:00/device:07/VMBUS:01/f8b3781b-1e82-4818-a1c3-63d806ec15bb/host3/target3:0:0/3:0:0:2
lrwxrwxrwx 1 root root 0 Dec 19 01:40 3:0:0:4 -> ../../../devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A03:00/device:07/VMBUS:01/f8b3781b-1e82-4818-a1c3-63d806ec15bb/host3/target3:0:0/3:0:0:4
lrwxrwxrwx 1 root root 0 Dec 19 01:40 target3:0:0 -> ../../../devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A03:00/device:07/VMBUS:01/f8b3781b-1e82-4818-a1c3-63d806ec15bb/host3/target3:0:0
lrwxrwxrwx 1 root root 0 Dec 19 01:40 target1:0:1 -> ../../../devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A03:00/device:07/VMBUS:01/00000000-0001-8899-0000-000000000000/host1/target1:0:1
lrwxrwxrwx 1 root root 0 Dec 19 01:40 host1 -> ../../../devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A03:00/device:07/VMBUS:01/00000000-0001-8899-0000-000000000000/host1
lrwxrwxrwx 1 root root 0 Dec 19 01:40 host3 -> ../../../devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A03:00/device:07/VMBUS:01/f8b3781b-1e82-4818-a1c3-63d806ec15bb/host3
lrwxrwxrwx 1 root root 0 Dec 19 01:40 host5 -> ../../../devices/pci0000:00/0000:00:07.1/ata2/host5
lrwxrwxrwx 1 root root 0 Dec 19 01:40 3:0:0:3 -> ../../../devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A03:00/device:07/VMBUS:01/f8b3781b-1e82-4818-a1c3-63d806ec15bb/host3/target3:0:0/3:0:0:3
lrwxrwxrwx 1 root root 0 Dec 19 01:40 3:0:0:5 -> ../../../devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A03:00/device:07/VMBUS:01/f8b3781b-1e82-4818-a1c3-63d806ec15bb/host3/target3:0:0/3:0:0:5
lrwxrwxrwx 1 root root 0 Dec 19 01:40 host0 -> ../../../devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A03:00/device:07/VMBUS:01/00000000-0000-8899-0000-000000000000/host0
# cat /sys/bus/scsi/devices/0\:0\:0\:0/vendor
Msft
# cat /sys/bus/scsi/devices/0\:0\:0\:0/model
Virtual Disk
# ls -lt /sys/bus/scsi/devices/0\:0\:0\:0/block/
total 0
drwxr-xr-x 9 root root 0 Dec 19 01:58 sda
# cat /sys/bus/scsi/devices/1\:0\:1\:0/vendor
Msft
# cat /sys/bus/scsi/devices/1\:0\:1\:0/model
Virtual Disk
# ls -lt /sys/bus/scsi/devices/1\:0\:1\:0/block/
total 0
drwxr-xr-x 9 root root 0 Dec 19 01:58 sdb
# cat /sys/bus/scsi/devices/3\:0\:0\:0/vendor
Msft
# cat /sys/bus/scsi/devices/3\:0\:0\:0/model
Virtual Disk
# ls -lt /sys/bus/scsi/devices/3\:0\:0\:0/block/
total 0
drwxr-xr-x 8 root root 0 Dec 19 01:57 sdg
# ls -lt /sys/bus/scsi/devices/3\:0\:0\:2/block/
total 0
drwxr-xr-x 8 root root 0 Dec 19 01:58 sdi
# ls -lt /sys/bus/scsi/devices/3\:0\:0\:3/block/
total 0
drwxr-xr-x 8 root root 0 Dec 19 01:58 sdk
# ls -lt /sys/bus/scsi/devices/3\:0\:0\:4/block/
total 0
drwxr-xr-x 8 root root 0 Dec 19 01:58 sdm
# ls -lt /sys/bus/scsi/devices/3\:0\:0\:5/block/
total 0
drwxr-xr-x 8 root root 0 Dec 19 01:58 sdc

@andyzhangx
Contributor

andyzhangx commented Dec 19, 2017

@sblyk could you also run "fdisk -l" on your agent VM? Thanks.
And what's your agent VM size?

@sblyk
Author

sblyk commented Dec 19, 2017

@andyzhangx

# fdisk -l
Disk /dev/sdb: 200 GiB, 214748364800 bytes, 419430400 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: dos
Disk identifier: 0x6d7c2eac

Device     Boot Start       End   Sectors  Size Id Type
/dev/sdb1        2048 419428351 419426304  200G  7 HPFS/NTFS/exFAT


Disk /dev/sda: 30 GiB, 32212254720 bytes, 62914560 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x3276fc6f

Device     Boot Start      End  Sectors Size Id Type
/dev/sda1  *     2048 62914526 62912479  30G 83 Linux


Disk /dev/sdi: 50 GiB, 53687091200 bytes, 104857600 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes


Disk /dev/sdk: 50 GiB, 53687091200 bytes, 104857600 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes


Disk /dev/sdm: 50 GiB, 53687091200 bytes, 104857600 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes


Disk /dev/sdc: 50 GiB, 53687091200 bytes, 104857600 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes


Disk /dev/sdg: 50 GiB, 53687091200 bytes, 104857600 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

@andyzhangx
Contributor

Could you format the disk /dev/sdk in the VM? You could follow the example here:
https://ubuntuforums.org/showthread.php?t=267869
Just to make sure whether those disks are really unreachable. Thanks.

@andyzhangx
Contributor

# create a partition on the disk (interactive), then format it
# NOTE: mkfs wipes any existing data on that partition
sudo fdisk /dev/sdk
sudo mkfs.ext4 /dev/sdk1

and then try to use /dev/sdk1 with the following commands:

sudo mkdir /media/sdk1
sudo mount /dev/sdk1 /media/sdk1

@andyzhangx
Contributor

Also, could you collect the kubelet logs on the agent VM? Thanks.

id=`docker ps -a | grep "hyperkube kubelet" | awk -F ' ' '{print $1}'`
docker logs $id > $id.log 2>&1
vi $id.log
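
Once you have the log, grepping for the volume plugin messages is usually enough to spot mount failures (just a suggestion; adjust the patterns as needed):

grep -iE "azure|MountVolume|WaitForAttach" $id.log | tail -n 100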

@sblyk
Author

sblyk commented Dec 19, 2017

@andyzhangx the disk sdk was mounted successfully

~$ cd sdk1
~/sdk1$ ls
e682a2d77826.log  test.txt

kubelet logs on agent VM:
e682a2d77826.zip

@andyzhangx
Contributor

@sblyk one finding:
all the unreachable disks are from storage account pvc300170136002:

https://pvc300170136002.blob.core.chinacloudapi.cn/300170136/k8s-cn-devtest-dynamic-pvc-1c57b4d2-e010-11e7-ba11-0017fa009264.vhd

The usable disk (sdg) is from storage account appdatacnuat:

https://appdatacnuat.blob.core.chinacloudapi.cn/vhds/k8s-cn-devtest-dynamic-pvc-b369fe92-de87-11e7-ba11-0017fa009264.vhd

@andyzhangx
Contributor

One possibility is that storage account pvc300170136002 contains too many azure disks, so there is IOPS throttling. Could you check how many azure disks in total are in pvc300170136002? The reason you are using pvc300170136002 at all is that you did not specify a storage account in your azure disk storage class.
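
To count them, something like this should work against the blob container that holds the dynamically provisioned VHDs (container name taken from the blob URL above; the account key is a placeholder, and for AzureChinaCloud you would run az cloud set --name AzureChinaCloud first):

az storage blob list --account-name pvc300170136002 --container-name 300170136 \
  --account-key <key> --query "length([?ends_with(name, '.vhd')])" -o tsv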

@sblyk
Author

sblyk commented Dec 19, 2017

@andyzhangx at first we created some StatefulSets like this:
statefulset A,B,C,D,E in namespace dev use storageAccount dev
statefulset A,B,C,D,E in namespace sit use storageAccount sit
statefulset A,B,C,D,E in namespace uat use storageAccount uat
One storageAccount has 5 disks,
but the problem occurred.
Then, as a test, we changed the StorageClass of dev from

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: dev
provisioner: kubernetes.io/azure-disk
parameters:
  skuName: Standard_LRS
  location: chinaeast
  storageAccount: appdatacndev

to

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: dev
provisioner: kubernetes.io/azure-disk
parameters:
  storageaccounttype: Standard_LRS
  kind: Shared

@andyzhangx
Contributor

@sblyk your df -aTh output shows the following disks:
/dev/sdd, /dev/sde, /dev/sdf, /dev/sdg, /dev/sdh
while /sys/bus/scsi/devices shows different disks:
sdg, sdi, sdk, sdm, sdc
Could you make sure it's the same VM? If so, that's the issue here.
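
A quick way to see the mismatch on the node is to compare the stale mounts with the current LUN-to-device mapping (just a sketch):

# current LUN -> device symlinks maintained by the Azure udev rules
ls -l /dev/disk/azure/scsi1/
# block devices the kernel sees right now
lsblk -o NAME,SIZE,MOUNTPOINT
# mounts still referencing the old device names
grep azure-disk /proc/mounts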

@sblyk
Author

sblyk commented Dec 19, 2017

@andyzhangx yes I'm sure it's the same VM

@andyzhangx
Contributor

@sblyk could you run the following commands to rescan the SCSI hosts (you need to run them as root):

echo "- - -" > /sys/class/scsi_host/host0/scan
echo "- - -" > /sys/class/scsi_host/host1/scan
echo "- - -" > /sys/class/scsi_host/host2/scan
echo "- - -" > /sys/class/scsi_host/host3/scan
echo "- - -" > /sys/class/scsi_host/host4/scan
echo "- - -" > /sys/class/scsi_host/host5/scan

And then check the following device:

ls -lt /sys/bus/scsi/devices/3\:0\:0\:4/block/

@sblyk
Author

sblyk commented Dec 19, 2017

@andyzhangx this is the output

# echo "- - -" > /sys/class/scsi_host/host0/scan
# echo "- - -" > /sys/class/scsi_host/host1/scan
# echo "- - -" > /sys/class/scsi_host/host2/scan
# echo "- - -" > /sys/class/scsi_host/host3/scan
# echo "- - -" > /sys/class/scsi_host/host4/scan
# echo "- - -" > /sys/class/scsi_host/host5/scan
# ls -lt /sys/bus/scsi/devices/3\:0\:0\:4/block/
total 0
drwxr-xr-x 8 root root 0 Dec 19 01:58 sdm

@andyzhangx
Contributor

@sblyk one workaround is to reboot the agent that has this issue; I just don't know why your disk /dev/sdd would change to another dev name. Did you detach/attach disks manually?

@sblyk
Author

sblyk commented Dec 19, 2017

@andyzhangx not only this agent; we have 6 agents, and all have this issue.
The disks in storageAccount pvc300170136002 were created for testing, so we did not detach/attach disks manually.

@andyzhangx
Contributor

@khenidak have you seen a case before where /dev/sd* changed to another dev name and k8s just did not detect it? In this case, /dev/sdd, /dev/sde, /dev/sdf, /dev/sdg, /dev/sdh changed to sdg, sdi, sdk, sdm, sdc.

@andyzhangx
Contributor

@sblyk could you reboot the agent to check whether the issue still exists? Another question: I found that LUN 1 is not used on this agent VM; is it the same on the other VMs? Thanks.

@sblyk
Author

sblyk commented Dec 19, 2017

@andyzhangx
I've restarted the VM, and this method is effective,
but this problem seems to happen when creating a new StatefulSet (mounting the disk) and deleting a pod (unmounting and mounting it on another VM).
The same happens on another agent VM:

[screenshot]

@andyzhangx
Contributor

@sblyk thanks for the check. Could you share the complete StatefulSet config? I would like to do a deeper investigation tomorrow. And please note that an azure disk can only be attached to one VM, so there can be problems when migrating a pod from one VM to another, since disk detach and attach take a few minutes. I would suggest you use azure file instead; it can be mounted on multiple VMs. You can find an azure file example here:
https://github.com/andyzhangx/Demo/tree/master/linux/azurefile

BTW, sudo systemctl restart kubelet (no reboot needed) would also resolve your issue as a workaround.

@andyzhangx
Contributor

@sblyk I have identified that this is a bug in azure disk: it uses /dev/sd*, and /dev/sd* can change after detaching/attaching azure disks. Could you change this issue's title to:
unable to use azure disk in StatefulSet since /dev/sd* changed after detach/attach disk

@andyzhangx
Contributor

@sblyk again, thanks for reporting this; I am now working on a fix in k8s upstream.

@sblyk sblyk changed the title Unable to use pv on AzureChinaCloud unable to use azure disk in StatefulSet since /dev/sd* changed after detach/attach disk Dec 21, 2017
k8s-github-robot pushed a commit to kubernetes/kubernetes that referenced this issue Jan 4, 2018
Automatic merge from submit-queue (batch tested with PRs 56382, 57549). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md

fix azure disk not available issue when device name changed

**What this PR does / why we need it**:
There is a possibility that the device name (`/dev/sd*`) changes when attaching/detaching a data disk on an Azure VM, see [Troubleshoot Linux VM device name change](https://docs.microsoft.com/en-us/azure/virtual-machines/linux/troubleshoot-device-names-problems).
We did hit this issue, see the customer [case](Azure/acs-engine#1918).
This PR uses `/dev/disk/by-id` instead of `/dev/sd*` for azure disk; `/dev/disk/by-id` does not change even when the device name changes.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #57444

**Special notes for your reviewer**:
In a customer [case](Azure/acs-engine#1918), the customer is unable to use an azure disk in a StatefulSet since /dev/sd* changed after detaching/attaching a disk.
We are using `/dev/sd*` (code is [here](https://github.com/kubernetes/kubernetes/blob/master/pkg/volume/azure_dd/azure_common_linux.go#L140)) to bind-mount the k8s path, while `/dev/sd*` can change when the VM is attaching/detaching data disks, see [Troubleshoot Linux VM device name change](https://docs.microsoft.com/en-us/azure/virtual-machines/linux/troubleshoot-device-names-problems).
I have also checked the related AWS and GCE code; they use `/dev/disk/by-id/` rather than `/dev/sd*`, see [aws code](https://github.com/kubernetes/kubernetes/blob/master/pkg/volume/aws_ebs/aws_util.go#L228) and [gce code](https://github.com/kubernetes/kubernetes/blob/master/pkg/volume/gce_pd/gce_util.go#L278).

**Release note**:

```
fix azure disk not available when device name changed
```
/sig azure
/assign @rootfs 
@karataliu @brendandburns @khenidak
@andyzhangx
Contributor

@sblyk finally I fixed this issue; I wrote a doc describing the details:
https://github.com/andyzhangx/Demo/blob/master/issues/README.md#2-disk-unavailable-after-attachdetach-a-data-disk-on-a-node

Fix or workaround:

  • add cachingmode: None in the azure disk storage class (the default is ReadWrite), e.g.
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: hdd
provisioner: kubernetes.io/azure-disk
parameters:
  skuname: Standard_LRS
  kind: Managed
  cachingmode: None
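
To confirm the setting actually took effect on newly provisioned disks, roughly (PV name, resource group, and VM name are placeholders):

# the provisioner records the caching mode on the PV object
kubectl get pv <pv-name> -o yaml | grep -i cachingmode
# the attached data disk on the VM side should now show caching "None"
az vm show -g <resource-group> -n <agent-vm-name> --query "storageProfile.dataDisks[].caching" -o tsv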

k8s-github-robot pushed a commit to kubernetes/kubernetes that referenced this issue Feb 25, 2018
Automatic merge from submit-queue (batch tested with PRs 60346, 60135, 60289, 59643, 52640). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md

fix device name change issue for azure disk

**What this PR does / why we need it**:
fix the device name change issue for azure disk: the default host cache setting changed from None to ReadWrite in v1.7, while the default host cache setting in the azure portal is `None`

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #60344, #57444
It also fixes the following issues:
Azure/acs-engine#1918
Azure/AKS#201

**Special notes for your reviewer**:
From v1.7, the default host cache setting changed from None to ReadWrite. This can lead to device name changes after attaching multiple disks on an azure VM, and finally to the disk being inaccessible from the pod.
For example:
a StatefulSet with 8 replicas (each with an azure disk) on one node will always fail; according to my observation, adding the 6th data disk always causes a dev name change, and some pods cannot access their data disks after that.

I have verified this fix on v1.8.4.
Without this PR, on one node (dev name changes):
```
azureuser@k8s-agentpool2-40588258-0:~$ tree /dev/disk/azure
...
└── scsi1
    ├── lun0 -> ../../../sdk
    ├── lun1 -> ../../../sdj
    ├── lun2 -> ../../../sde
    ├── lun3 -> ../../../sdf
    ├── lun4 -> ../../../sdg
    ├── lun5 -> ../../../sdh
    └── lun6 -> ../../../sdi
```

With this PR, on one node (no dev name change):
```
azureuser@k8s-agentpool2-40588258-1:~$ tree /dev/disk/azure
...
└── scsi1
    ├── lun0 -> ../../../sdc
    ├── lun1 -> ../../../sdd
    ├── lun2 -> ../../../sde
    ├── lun3 -> ../../../sdf
    ├── lun5 -> ../../../sdh
    └── lun6 -> ../../../sdi
```

In the following, `myvm-0` and `myvm-1` are crashing due to dev name changes; after the controller manager replacement, the myvm2-x pods work well.

```
Every 2.0s: kubectl get po                                                                                                                                                   Sat Feb 24 04:16:26 2018

NAME      READY     STATUS             RESTARTS   AGE
myvm-0    0/1       CrashLoopBackOff   13         41m
myvm-1    0/1       CrashLoopBackOff   11         38m
myvm-2    1/1       Running            0          35m
myvm-3    1/1       Running            0          33m
myvm-4    1/1       Running            0          31m
myvm-5    1/1       Running            0          29m
myvm-6    1/1       Running            0          26m

myvm2-0   1/1       Running            0          17m
myvm2-1   1/1       Running            0          14m
myvm2-2   1/1       Running            0          12m
myvm2-3   1/1       Running            0          10m
myvm2-4   1/1       Running            0          8m
myvm2-5   1/1       Running            0          5m
myvm2-6   1/1       Running            0          3m
```

**Release note**:

```
fix device name change issue for azure disk
```
/assign @karataliu 
/sig azure
@feiskyer could you mark it for the v1.10 milestone?
@brendandburns @khenidak @rootfs @jdumars FYI

Since it's a critical bug, I will cherry-pick this fix to v1.7-v1.9. Note that v1.6 does not have this issue since the default cachingmode is `None`.
@sblyk
Author

sblyk commented Mar 1, 2018

@andyzhangx Thank you! I tested it yesterday:
adding cachingmode: None in the azure disk storage class.
Now it works well.

@andyzhangx
Contributor

@sblyk would you close this issue then?

@fauzan-n

@sblyk finally I fixed this issue; I wrote a doc describing the details:
https://github.com/andyzhangx/Demo/blob/master/issues/README.md#2-disk-unavailable-after-attachdetach-a-data-disk-on-a-node

Fix or workaround:

  • add cachingmode: None in the azure disk storage class (the default is ReadWrite), e.g.
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: hdd
provisioner: kubernetes.io/azure-disk
parameters:
  skuname: Standard_LRS
  kind: Managed
  cachingmode: None

Is this change safe for the disk? I mean, what happens to the data after changing the cachingmode?

@andyzhangx
Contributor

@foosome
NOTE: the Azure platform has fixed the host cache issue; the suggested host cache setting for data disks is now ReadOnly. More details about the azure disk cache setting are in the issue details.

What's your issue here?

@fauzan-n

@foosome
NOTE: the Azure platform has fixed the host cache issue; the suggested host cache setting for data disks is now ReadOnly. More details about the azure disk cache setting are in the issue details.

What's your issue here?

I'm using Alibaba Cloud.

Solved by creating a StorageClass with the cachingmode parameter added.

Thank you @andyzhangx!

@andyzhangx
Contributor

@foosome but cachingmode is for Azure cloud? Does Alibaba Cloud have the same issue?

@fauzan-n

fauzan-n commented Mar 2, 2019

@foosome but cachingmode is for Azure cloud? Does Alibaba Cloud have the same issue?

Yes, with the default storageclass I got the same issue.

The caching mode is defined in the StorageClass manifest, so it's in the Kubernetes layer. I think azure and Alibaba Cloud have this same issue because they use the same default StorageClass in Kubernetes.

