ZIL IO hung due to deadlock on spa config lock #15078

Closed
prakashsurya opened this issue Jul 18, 2023 · 1 comment
Labels
Type: Defect (Incorrect behavior, e.g. crash, hang)

Comments

prakashsurya commented Jul 18, 2023

I believe we've uncovered a bug resulting from the changes made in f63811f.

We have a system that's stuck with the following stacks:

0xffff8911c9862000 UNINTERRUPTIBLE       1
                  __schedule+0x2c8
                  __schedule+0x2c8
                  schedule+0x69
                  cv_wait_common+0x16e
                  __cv_wait+0x15
                  spa_config_enter_impl.constprop.0+0x11a
                  spa_config_enter+0x12
                  spa_vdev_state_enter+0x88
                  zfs_ioc_pool_reopen+0x63
                  zfsdev_ioctl_common+0x295
                  zfsdev_ioctl+0x57
                  __x64_sys_ioctl+0x92
                  __x64_sys_ioctl+0x92
                  __x64_sys_ioctl+0x92
                  __x64_sys_ioctl+0x92
                  do_syscall_64+0x59
                  do_syscall_64+0x59
                  entry_SYSCALL_64+0x99

i.e. this thread is attempting to get the lock as writer.

0xffff8912c5266000 UNINTERRUPTIBLE       1
                  __schedule+0x2c8
                  __schedule+0x2c8
                  schedule+0x69
                  cv_wait_common+0x16e
                  __cv_wait+0x15
                  spa_config_enter_impl.constprop.0+0x1de
                  spa_config_enter+0x12
                  spa_sync_rewrite_vdev_config+0x5b
                  spa_sync+0x37b
                  txg_sync_thread+0x27a
                  thread_generic_wrapper+0x6c
                  kthread+0x127
                  ret_from_fork+0x1f
                  0x0+0x0

spa_sync is also trying to get this lock.

and then we have ZIL threads also trying to get the lock, e.g.:

0xffff89120363a000 UNINTERRUPTIBLE      20
                  __schedule+0x2c8
                  __schedule+0x2c8
                  schedule+0x69
                  cv_wait_common+0x16e
                  __cv_wait+0x15
                  spa_config_enter_impl.constprop.0+0x1de
                  spa_config_enter+0x12
                  zil_lwb_write_issue+0x19f
                  zil_commit_writer+0xe9
                  zil_commit_impl+0x62
                  zil_commit+0x79
                  zfs_write+0xbad
                  zpl_iter_write+0xe7
                  do_iter_readv_writev+0x14f
                  do_iter_readv_writev+0x14f
                  do_iter_write+0x8c
                  do_iter_write+0x8c
                  vfs_iter_write+0x19
                  nfsd_vfs_write+0x149
                  nfsd4_write+0x130
                  nfsd4_proc_compound+0x415
                  nfsd_dispatch+0x16c
                  svc_process_common+0x3d5
                  svc_process+0xbc
                  nfsd+0xed
                  kthread+0x127
                  ret_from_fork+0x1f
                  0x0+0x0
0xffff8912c0276000 UNINTERRUPTIBLE       1
                  __schedule+0x2c8
                  __schedule+0x2c8
                  schedule+0x69
                  cv_wait_common+0x16e
                  __cv_wait+0x15
                  spa_config_enter_impl.constprop.0+0x1de
                  spa_config_enter+0x12
                  zil_lwb_write_issue+0x19f
                  zil_commit_waiter_timeout+0x1e8
                  zil_commit_waiter+0x215
                  zil_commit_impl+0x6d
                  zil_commit+0x79
                  zfs_fsync+0x89
                  zpl_fsync+0xe6
                  vfs_fsync_range+0x46
                  do_fsync+0x3d
                  do_fsync+0x3d
                  __x64_sys_fdatasync+0x17
                  __x64_sys_fdatasync+0x17
                  __x64_sys_fdatasync+0x17
                  do_syscall_64+0x59
                  do_syscall_64+0x59
                  entry_SYSCALL_64+0x99

Once in this state, the system remains there indefinitely; it is deadlocked.

What we found is this: we believe the config lock is held (as reader) by ZIL IOs that have not yet completed, and there are a lot of ZIL IOs pending:

sdb> zio -r | member io_done ! grep zil_lwb | sort | uniq -c
     32 (zio_done_func_t *)zil_lwb_flush_vdevs_done+0x0 = 0xffffffffc03a8620
    183 (zio_done_func_t *)zil_lwb_write_done+0x0 = 0xffffffffc03a5d60

With trees of ZIOs like this:

     0xffff8912d76b3a20        WRITE OPEN             -                             -
      0xffff8911c55e09b0       WRITE OPEN             -                             -
       0xffff89111508ab98      WRITE ISSUE_ASYNC      -                             -
        0xffff891115089838     WRITE ISSUE_ASYNC      -                             -
         0xffff89121cc884d8    WRITE ISSUE_ASYNC      -                             -
          0xffff89121cc88e88   WRITE ISSUE_ASYNC      -                             -
           0xffff89121cc8a1e8  WRITE ISSUE_ASYNC      -                             -
            0xffff8912cb470e88 WRITE ISSUE_ASYNC      -                             -
             0xffff8912cb471d10 WRITE OPEN             -                             -
    0xffff8912d76b21e8         WRITE OPEN             -                             -
     0xffff8912d76b3a20        WRITE OPEN             -                             -
      0xffff8911c55e09b0       WRITE OPEN             -                             -
       0xffff89111508ab98      WRITE ISSUE_ASYNC      -                             -
        0xffff891115089838     WRITE ISSUE_ASYNC      -                             -
         0xffff89121cc884d8    WRITE ISSUE_ASYNC      -                             -
          0xffff89121cc88e88   WRITE ISSUE_ASYNC      -                             -
           0xffff89121cc8a1e8  WRITE ISSUE_ASYNC      -                             -
            0xffff8912cb470e88 WRITE ISSUE_ASYNC      -                             -
             0xffff8912cb471d10 WRITE OPEN             -                             -
   0xffff8912dbbf5838          WRITE OPEN             -                             -
    0xffff8912d76b21e8         WRITE OPEN             -                             -
     0xffff8912d76b3a20        WRITE OPEN             -                             -
      0xffff8911c55e09b0       WRITE OPEN             -                             -
       0xffff89111508ab98      WRITE ISSUE_ASYNC      -                             -
        0xffff891115089838     WRITE ISSUE_ASYNC      -                             -
         0xffff89121cc884d8    WRITE ISSUE_ASYNC      -                             -
          0xffff89121cc88e88   WRITE ISSUE_ASYNC      -                             -
           0xffff89121cc8a1e8  WRITE ISSUE_ASYNC      -                             -
            0xffff8912cb470e88 WRITE ISSUE_ASYNC      -                             -
             0xffff8912cb471d10 WRITE OPEN             -                             -

It's hard to see from this output, but we verified these ZIOs to be for ZIL blocks.

So, at this point, the question we tried to answer is: why aren't these ZIOs completing? From what we can tell, that's the cause of the deadlock. We have ZIL blocks that took the config lock as reader (acquired in zil_lwb_write_issue), got issued, but never dropped that lock because they never completed (the lock would have been dropped in zil_lwb_flush_vdevs_done). All other stuck threads are stuck because these ZIL IOs haven't dropped the lock.
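
To make that lock lifetime concrete, the acquire/release pairing looks roughly like this (a simplified sketch of the relevant calls in module/zfs/zil.c, not the verbatim code):

    /*
     * zil_lwb_write_issue() takes the config lock as reader on behalf
     * of the LWB (tagged with the lwb pointer) before issuing its ZIOs:
     */
    spa_config_enter(zilog->zl_spa, SCL_STATE, lwb, RW_READER);
    zio_nowait(lwb->lwb_write_zio);
    zio_nowait(lwb->lwb_root_zio);

    /*
     * zil_lwb_flush_vdevs_done() is the completion callback of the
     * LWB's root ZIO, and is what eventually drops that reader hold:
     */
    spa_config_exit(zilog->zl_spa, SCL_STATE, lwb);

So the reader hold spans the entire life of the LWB's ZIO tree: if those ZIOs never complete, the hold is never dropped.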

I believe what is happening is this: prior to f63811f, we would have issued all LWBs in the order they were allocated, because we called zil_lwb_write_issue while holding the zl_issuer_lock. Meaning, from a ZIO parent/child perspective, the child ZIOs would always get issued prior to their parents.

So, by the time a parent ZIO tries to get the config lock in zil_lwb_write_issue, its child ZIO would necessarily have already taken its own reader hold on the config lock; i.e. the child ZIO would always get a reader lock before the parent gets its own.

I think, due to the changes in f63811f, this isn't necessarily true anymore. Specifically, we now drop the zl_issuer_lock before we call zil_lwb_write_issue, such that it's possible for the thread to call schedule in between dropping the zl_issuer_lock and actually issuing the LWBs.

While that thread is sleeping, it's also possible for another thread to acquire the zl_issuer_lock and call zil_lwb_write_issue on its own LWBs, all while the first thread is still sleeping and hasn't yet issued the original LWBs.

Meaning, the second thread's LWBs will be parents of the first thread's LWBs (from the perspective of the LWB root ZIOs), and these parent LWBs would get issued (and thus grab the config lock as reader) prior to the child LWBs being issued.

Further, if those child LWBs are then unable to get the config lock as reader (as is our case), then the child ZIOs will never complete, and they'll prevent the parent ZIOs from completing as well.

Since that's a little convoluted, I want to be clear. Here's what is happening:

  1. Thread A calls zil_commit_writer, gets the zl_issuer_lock, builds up its ilwbs list, drops the zl_issuer_lock, then goes to sleep before processing and issuing LWBs from the ilwbs list.
  2. Thread B calls zil_commit_writer, gets the zl_issuer_lock, builds up its own ilwbs list, drops the zl_issuer_lock, and calls zil_lwb_write_issue on all of its ilwbs.
  3. Thread C (unrelated to the ZIL) tries to get the config lock as writer (e.g. zfs_ioc_pool_reopen is called), but is blocked due to the ZIOs from (2) holding the lock as reader, so it waits.
  4. Thread A wakes up, calls zil_lwb_write_issue, and tries to get the config lock as reader, but is blocked because the thread from (3) is waiting for the lock as writer, so it waits.

At this point, the system is deadlocked: the ZIOs from (4) are children of the ZIOs from (2); the ZIOs from (4) are blocked on the thread from (3); the thread from (3) is blocked by the ZIOs from (2); and the ZIOs from (2) are blocked waiting on the ZIOs from (4). So we have a cycle, and we're deadlocked.
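
Part of what makes this cycle possible is a property of the config lock itself: a waiting writer blocks new readers. Here's a simplified sketch of spa_config_enter_impl() (paraphrased from module/zfs/spa_misc.c; the real code tracks holds with a zfs_refcount_t and has an MMP exception, both simplified away here):

    mutex_enter(&scl->scl_lock);
    if (rw == RW_READER) {
        /*
         * A new reader must wait not only while a writer holds the
         * lock (scl_writer), but also while a writer is merely
         * *waiting* for it (scl_write_wanted).  That anti-starvation
         * rule is why the child LWB's reader acquisition in (4)
         * blocks behind the zfs_ioc_pool_reopen writer in (3).
         */
        while (scl->scl_writer || scl->scl_write_wanted)
            cv_wait(&scl->scl_cv, &scl->scl_lock);
    } else {
        /* A writer waits for all existing holds to drain. */
        scl->scl_write_wanted++;
        while (scl->scl_count != 0)
            cv_wait(&scl->scl_cv, &scl->scl_lock);
        scl->scl_write_wanted--;
        scl->scl_writer = curthread;
    }
    scl->scl_count++;
    mutex_exit(&scl->scl_lock);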

I've verified this to be true. For example, we can take this tree of ZIOs from earlier:

   0xffff8912dbbf5838          WRITE OPEN             -                             -
    0xffff8912d76b21e8         WRITE OPEN             -                             -
     0xffff8912d76b3a20        WRITE OPEN             -                             -
      0xffff8911c55e09b0       WRITE OPEN             -                             -
       0xffff89111508ab98      WRITE ISSUE_ASYNC      -                             -
        0xffff891115089838     WRITE ISSUE_ASYNC      -                             -
         0xffff89121cc884d8    WRITE ISSUE_ASYNC      -                             -
          0xffff89121cc88e88   WRITE ISSUE_ASYNC      -                             -
           0xffff89121cc8a1e8  WRITE ISSUE_ASYNC      -                             -
            0xffff8912cb470e88 WRITE ISSUE_ASYNC      -                             -
             0xffff8912cb471d10 WRITE OPEN             -                             -

In this case, 0xffff8912cb471d10 is a child of 0xffff8912cb470e88. Since 0xffff8912cb470e88 is in the ZIO pipeline, we know it must hold the config lock as reader (e.g. it's a parent ZIO from (2) in the prior example). 0xffff8912cb471d10 is a child ZIO that would have been created in (1) of the prior example, but hasn't been issued yet. We can verify it's stuck as I described in (4) by getting the lwb_t from that zio_t, and then mapping that pointer to a thread on the system:

sdb> echo 0xffff8912cb471d10 | zio | member io_private
(void *)0xffff891202b2a620
Thread 675 (pid 1586):
#0  0xffffffffa3a3a568 in context_switch (rf=0xffff9b7cc559b7c0, next=0xffff891204ca4000, prev=0xffff8912057c2000, rq=0xffff8912f9730ac0) at linux/kernel/sched/core.c:4994
#1  0xffffffffa3a3a568 in __schedule (sched_mode=sched_mode@entry=0) at linux/kernel/sched/core.c:6341
#2  0xffffffffa3a3ab69 in schedule () at linux/kernel/sched/core.c:6424
#3  0xffffffffc012d78e in cv_wait_common (cvp=cvp@entry=0xffff891245c261c0, mp=mp@entry=0xffff891245c26180, state=state@entry=2, io=io@entry=0) at zfs/module/os/linux/spl/spl-condvar.c:146
#4  0xffffffffc012d8d5 in __cv_wait (cvp=cvp@entry=0xffff891245c261c0, mp=mp@entry=0xffff891245c26180) at zfs/module/os/linux/spl/spl-condvar.c:171
#5  0xffffffffc031d3de in spa_config_enter_impl (spa=0xffff891245c24000, locks=locks@entry=2, rw=rw@entry=RW_READER, mmp_flag=mmp_flag@entry=0, tag=0xffff891202b2a620) at zfs/module/zfs/spa_misc.c:613
#6  0xffffffffc031d4d2 in spa_config_enter (spa=<optimized out>, locks=locks@entry=2, tag=tag@entry=0xffff891202b2a620, rw=rw@entry=RW_READER) at zfs/module/zfs/spa_misc.c:643
#7  0xffffffffc03a6a7f in zil_lwb_write_issue (zilog=zilog@entry=0xffff8911107f7800, lwb=0xffff891202b2a620) at zfs/module/zfs/zil.c:1947
#8  0xffffffffc03ac5e9 in zil_commit_writer (zilog=zilog@entry=0xffff8911107f7800, zcw=zcw@entry=0xffff8912c5e9d720) at zfs/module/zfs/zil.c:3009
#9  0xffffffffc03ad112 in zil_commit_impl (zilog=zilog@entry=0xffff8911107f7800, foid=foid@entry=139) at zfs/module/zfs/zil.c:3540
#10 0xffffffffc03ad1e9 in zil_commit (zilog=zilog@entry=0xffff8911107f7800, foid=139) at zfs/module/zfs/zil.c:3503
#11 0xffffffffc03a1aed in zfs_write (zp=zp@entry=0xffff8912db672c60, uio=uio@entry=0xffff9b7cc559bb98, ioflag=1085442, cr=cr@entry=0xffff8912756e0d80) at zfs/module/zfs/zfs_vnops.c:786
#12 0xffffffffc0406447 in zpl_iter_write (kiocb=0xffff9b7cc559bc48, from=<optimized out>) at zfs/module/os/linux/zfs/zpl_file.c:392
#13 0xffffffffa317ca9f in call_write_iter (file=<unavailable>, iter=<unavailable>, kio=<unavailable>) at linux/include/linux/fs.h:2097
#14 0xffffffffa317ca9f in do_iter_readv_writev (filp=<unavailable>, filp@entry=0xffff89125443f300, iter=<unavailable>, iter@entry=0xffff9b7cc559bd28, ppos=ppos@entry=0xffff9b7cc559bd20, type=<unavailable>, type@entry=1, flags=flags@entry=4) at linux/fs/read_write.c:731
#15 0xffffffffa317de3c in do_iter_write (flags=4, pos=0xffff9b7cc559bd20, iter=0xffff9b7cc559bd28, file=0xffff89125443f300) at linux/fs/read_write.c:857
#16 0xffffffffa317de3c in do_iter_write (file=0xffff89125443f300, iter=0xffff9b7cc559bd28, pos=0xffff9b7cc559bd20, flags=4) at linux/fs/read_write.c:838
#17 0xffffffffa317df99 in vfs_iter_write (file=file@entry=0xffff89125443f300, iter=iter@entry=0xffff9b7cc559bd28, ppos=ppos@entry=0xffff9b7cc559bd20, flags=flags@entry=4) at linux/fs/read_write.c:898
#18 0xffffffffc08d0cd9 in nfsd_vfs_write (rqstp=rqstp@entry=0xffff891204fd0000, fhp=fhp@entry=0xffff8912057d0030, nf=<optimized out>, offset=71827456, vec=vec@entry=0xffff891204fd0b88, vlen=<optimized out>, cnt=0xffff9b7cc559bdb0, stable=2, verf=0xffff8912057c9210) at linux/fs/nfsd/vfs.c:1009
#19 0xffffffffc08e1620 in nfsd4_write (rqstp=0xffff891204fd0000, cstate=0xffff8912057d0030, u=0xffff8912057c91a0) at linux/fs/nfsd/nfs4proc.c:1044
#20 0xffffffffc08e3f05 in nfsd4_proc_compound (rqstp=0xffff891204fd0000) at linux/fs/nfsd/nfs4proc.c:2558
#21 0xffffffffc08ca9cc in nfsd_dispatch (rqstp=0xffff891204fd0000, statp=0xffff891218d9d014) at linux/fs/nfsd/nfssvc.c:1037
#22 0xffffffffc07df255 in svc_process_common (rqstp=rqstp@entry=0xffff891204fd0000, argv=argv@entry=0xffff891204fd01a8, resv=resv@entry=0xffff891204fd0288) at linux/net/sunrpc/svc.c:1402
#23 0xffffffffc07df65c in svc_process (rqstp=rqstp@entry=0xffff891204fd0000) at linux/net/sunrpc/svc.c:1542
#24 0xffffffffc08ca28d in nfsd (vrqstp=0xffff891204fd0000) at linux/fs/nfsd/nfssvc.c:969
#25 0xffffffffa2ee6f57 in kthread (_create=0xffff891203ca2600) at linux/kernel/kthread.c:319
#26 0xffffffffa2e04c3f in ret_from_fork () at linux/arch/x86/entry/entry_64.S:298

Here we can see the lwb pointer matches the frame:

#7  0xffffffffc03a6a7f in zil_lwb_write_issue (zilog=zilog@entry=0xffff8911107f7800, lwb=0xffff891202b2a620) at zfs/module/zfs/zil.c:1947

and we can see this thread is stuck in zil_lwb_write_issue trying to get the config lock as reader. It's stuck because this other thread is trying to get the config lock as writer:

Thread 719 (pid 7956):
#0  0xffffffffa3a3a568 in context_switch (rf=0xffff9b7cc3a63be8, next=0xffff891219658000, prev=0xffff8911c9862000, rq=0xffff8912f9630ac0) at linux/kernel/sched/core.c:4994
#1  0xffffffffa3a3a568 in __schedule (sched_mode=sched_mode@entry=0) at linux/kernel/sched/core.c:6341
#2  0xffffffffa3a3ab69 in schedule () at linux/kernel/sched/core.c:6424
#3  0xffffffffc012d78e in cv_wait_common (cvp=cvp@entry=0xffff891245c261c0, mp=mp@entry=0xffff891245c26180, state=state@entry=2, io=io@entry=0) at zfs/module/os/linux/spl/spl-condvar.c:146
#4  0xffffffffc012d8d5 in __cv_wait (cvp=cvp@entry=0xffff891245c261c0, mp=mp@entry=0xffff891245c26180) at zfs/module/os/linux/spl/spl-condvar.c:171
#5  0xffffffffc031d31a in spa_config_enter_impl (spa=spa@entry=0xffff891245c24000, locks=locks@entry=22, rw=rw@entry=RW_WRITER, mmp_flag=mmp_flag@entry=0, tag=0xffff891245c24000) at zfs/module/zfs/spa_misc.c:629
#6  0xffffffffc031d4d2 in spa_config_enter (spa=spa@entry=0xffff891245c24000, locks=locks@entry=22, tag=tag@entry=0xffff891245c24000, rw=rw@entry=RW_WRITER) at zfs/module/zfs/spa_misc.c:643
#7  0xffffffffc031d568 in spa_vdev_state_enter (spa=0xffff891245c24000, oplocks=oplocks@entry=0) at zfs/module/zfs/spa_misc.c:1472
#8  0xffffffffc038e453 in zfs_ioc_pool_reopen (pool=0xffff8912d391c000 "domain0", innvl=<optimized out>, outnvl=<optimized out>) at zfs/module/zfs/zfs_ioctl.c:5871
#9  0xffffffffc0396705 in zfsdev_ioctl_common (vecnum=vecnum@entry=61, zc=zc@entry=0xffff8912d391c000, flag=flag@entry=0) at zfs/module/zfs/zfs_ioctl.c:7800
#10 0xffffffffc03e4c27 in zfsdev_ioctl (filp=<optimized out>, cmd=23101, arg=140289147176688) at zfs/module/os/linux/zfs/zfs_ioctl_os.c:144
#11 0xffffffffa3198682 in vfs_ioctl (arg=140289147176688, cmd=<optimized out>, filp=0xffff891208b9e800) at linux/fs/ioctl.c:51
#12 0xffffffffa3198682 in __do_sys_ioctl (arg=140289147176688, cmd=23101, fd=6) at linux/fs/ioctl.c:874
#13 0xffffffffa3198682 in __se_sys_ioctl (arg=140289147176688, cmd=23101, fd=6) at linux/fs/ioctl.c:860
#14 0xffffffffa3198682 in __x64_sys_ioctl (regs=<optimized out>) at linux/fs/ioctl.c:860
#15 0xffffffffa3a2c709 in do_syscall_x64 (nr=<optimized out>, regs=0xffff9b7cc3a63f58) at linux/arch/x86/entry/common.c:50
#16 0xffffffffa3a2c709 in do_syscall_64 (regs=0xffff9b7cc3a63f58, nr=<optimized out>) at linux/arch/x86/entry/common.c:80
#17 0xffffffffa3c00099 in entry_SYSCALL_64 () at linux/arch/x86/entry/entry_64.S:118

FWIW, the unrelated (to the ZIL) thread attempting to get the config lock as writer is necessary to trigger the deadlock. That might not be a common case for most users, but it is common for Delphix, which might help explain why we've seen this but perhaps others have not.

Further, I think what this boils down to is: due to how we obtain the config lock as reader for each LWB that we issue, and the parent/child relationship of LWB ZIOs, we need to issue the LWBs in parent/child order, i.e. always issue the child ZIOs before the parents, as we did prior to f63811f, so that we cannot get into a situation where a parent ZIO holds the config lock and a child ZIO can't get it. Or, perhaps, change how we acquire the config lock when issuing the ZIOs.

@amotin does this make sense to you?

prakashsurya added the Type: Defect label Jul 18, 2023
amotin added a commit to amotin/zfs that referenced this issue Jul 18, 2023
When we have some LWBs closed and their ZIOs ready to be issued, we
cannot afford to sleep on the config lock if somebody else tries to
lock it as writer, or it will cause a deadlock.

To solve it, move spa_config_enter() from zil_lwb_write_issue() to
zil_lwb_write_close() under zl_issuer_lock to enforce lock ordering
with other threads.  Now if we can't immediately lock config, issue
all previously closed LWBs so that they can drop their config
locks after completion, and only then allow sleeping on our lock.

Signed-off-by:  Alexander Motin <mav@FreeBSD.org>
Sponsored by:   iXsystems, Inc.
Issue openzfs#15078
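
For reference, here's a rough sketch of the ordering the commit message describes (illustrative, not the actual patch from #15080; spa_config_tryenter() and list_remove_head() are existing OpenZFS interfaces, but the surrounding code is paraphrased):

    /*
     * At the end of zil_lwb_write_close(), still under zl_issuer_lock:
     * try to take the config lock as reader without blocking.
     */
    if (!spa_config_tryenter(spa, SCL_STATE, lwb, RW_READER)) {
        /*
         * A writer holds or is waiting for the config lock.  Before
         * blocking on it, issue all previously closed LWBs so their
         * ZIOs can complete and drop their reader holds, allowing
         * the writer (and then us) to make progress.
         */
        lwb_t *tlwb;
        while ((tlwb = list_remove_head(ilwbs)) != NULL)
            zil_lwb_write_issue(zilog, tlwb);
        spa_config_enter(spa, SCL_STATE, lwb, RW_READER);
    }

Because the tryenter/enter now happens under the zl_issuer_lock, LWB reader holds are again taken in issue order, restoring the child-before-parent lock ordering.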

amotin commented Jul 18, 2023

@prakashsurya Thank you for the deep analysis. It makes sense to me. Please take a look at #15080. I think it should fix the problem with the fewest complications.

amotin added a commit to amotin/zfs that referenced this issue Jul 20, 2023
amotin added a commit to amotin/zfs that referenced this issue Jul 24, 2023
amotin added a commit to amotin/zfs that referenced this issue Jul 25, 2023
behlendorf pushed a commit that referenced this issue Jul 25, 2023
When we have some LWBs closed and their ZIOs ready to be issued, we
cannot afford to sleep on the config lock if somebody else tries to
lock it as writer, or it will cause a deadlock.

To solve it, move spa_config_enter() from zil_lwb_write_issue() to
zil_lwb_write_close() under zl_issuer_lock to enforce lock ordering
with other threads.  Now if we can't immediately lock config, issue
all previously closed LWBs so that they can drop their config
locks after completion, and only then allow sleeping on our lock.

Reviewed-by: Mark Maybee <mark.maybee@delphix.com>
Reviewed-by: Prakash Surya <prakash.surya@delphix.com>
Reviewed-by: George Wilson <george.wilson@delphix.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored by: iXsystems, Inc.
Closes #15078
Closes #15080
lundman pushed a commit to openzfsonwindows/openzfs that referenced this issue Dec 12, 2023