Skip to content

Commit

Permalink
Expose zfs_vdev_open_timeout_ms as a tunable
Browse files Browse the repository at this point in the history
Some of our customers have been occasionally hitting zfs import failures
in Linux because udevd doesn't create the by-id symbolic links in time
for zpool import to use them. The main issue is that the
systemd-udev-settle.service that zfs-import-cache.service and other
services depend on is racy. There is also an openzfs issue filed (see
#10891) outlining the problem and
potential solutions.

With the proper solutions being significant in terms of complexity and
the priority of the issue being low for the time being, this patch
exposes `zfs_vdev_open_timeout_ms` as a tunable so people that are
experiencing this issue often can increase it as a workaround.

Signed-off-by: Serapheim Dimitropoulos <serapheim@delphix.com>
  • Loading branch information
sdimitro committed Nov 3, 2022
1 parent da3d266 commit e5fcb7b
Show file tree
Hide file tree
Showing 2 changed files with 11 additions and 1 deletion.
7 changes: 7 additions & 0 deletions man/man4/zfs.4
Original file line number Diff line number Diff line change
Expand Up @@ -1249,6 +1249,13 @@ Ideally, this will be at least the sum of each queue's
.Sy max_active .
.No See Sx ZFS I/O SCHEDULER .
.
.It Sy zfs_vdev_open_timeout_ms Ns = Ns Sy 1000 Pq uint
Timeout value to wait before determining a device is missing
during import.
This is helpful for transient missing paths due
to links being briefly removed and recreated in response to
udev events.
.
.It Sy zfs_vdev_rebuild_max_active Ns = Ns Sy 3 Pq uint
Maximum sequential resilver I/O operations active to each device.
.No See Sx ZFS I/O SCHEDULER .
Expand Down
5 changes: 4 additions & 1 deletion module/os/linux/zfs/vdev_disk.c
Original file line number Diff line number Diff line change
Expand Up @@ -56,7 +56,7 @@ static void *zfs_vdev_holder = VDEV_HOLDER;
* device is missing. The missing path may be transient since the links
* can be briefly removed and recreated in response to udev events.
*/
static unsigned zfs_vdev_open_timeout_ms = 1000;
static uint_t zfs_vdev_open_timeout_ms = 1000;

/*
* Size of the "reserved" partition, in blocks.
Expand Down Expand Up @@ -1042,3 +1042,6 @@ param_set_max_auto_ashift(const char *buf, zfs_kernel_param_t *kp)

return (0);
}

ZFS_MODULE_PARAM(zfs_vdev, zfs_vdev_, open_timeout_ms, UINT, ZMOD_RW,
"Timeout before determining that a device is missing");

0 comments on commit e5fcb7b

Please sign in to comment.