Date:      Mon, 2 Jan 2012 08:15:00 GMT
From:      Boris Lytochkin <lytboris@yandex-team.ru>
To:        freebsd-gnats-submit@FreeBSD.org
Subject:   kern/163770: LOR between zfs&syncer + vnlru leading to ZFS deadlock
Message-ID:  <201201020815.q028F07t062585@red.freebsd.org>
Resent-Message-ID: <201201020820.q028K8E0044943@freefall.freebsd.org>


>Number:         163770
>Category:       kern
>Synopsis:       LOR between zfs&syncer + vnlru leading to ZFS deadlock
>Confidential:   no
>Severity:       critical
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Mon Jan 02 08:20:08 UTC 2012
>Closed-Date:
>Last-Modified:
>Originator:     Boris Lytochkin
>Release:        RELENG_8
>Organization:
Yandex
>Environment:
FreeBSD skylla.yandex.net 8.2-STABLE FreeBSD 8.2-STABLE #7: Tue Dec 27 19:54:33 MSK 2011     lytboris@skylla.yandex.net:/usr/obj/usr/src/sys/CACTI  amd64
>Description:
Deadlocks occur periodically. Most of them can be triggered by setting kern.maxvnodes low and running /etc/periodic/security/100.chksetuid (which performs a large find).
The LOR itself, as seen a couple of minutes after the server boots into multiuser:
--- syscall (5, FreeBSD ELF64, open), rip = 0x800f8666c, rsp = 0x7fffffffe8d8, rbp = 0x1b0 ---
lock order reversal:
 1st 0xffffff00508c1098 syncer (syncer) @ /usr/src/sys/kern/vfs_subr.c:1737
 2nd 0xffffff108f8abba8 zfs (zfs) @ /usr/src/sys/kern/vfs_subr.c:2137
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
kdb_backtrace() at kdb_backtrace+0x37
_witness_debugger() at _witness_debugger+0x2c
witness_checkorder() at witness_checkorder+0x651
__lockmgr_args() at __lockmgr_args+0xb98
vop_stdlock() at vop_stdlock+0x39
VOP_LOCK1_APV() at VOP_LOCK1_APV+0x52
_vn_lock() at _vn_lock+0x47
vget() at vget+0x56
vfs_msync() at vfs_msync+0xa5
sync_fsync() at sync_fsync+0x12a
VOP_FSYNC_APV() at VOP_FSYNC_APV+0x4a
sync_vnode() at sync_vnode+0x157
sched_sync() at sched_sync+0x1b1
fork_exit() at fork_exit+0x11d
fork_trampoline() at fork_trampoline+0xe
--- trap 0, rip = 0, rsp = 0xffffff9b0b285d00, rbp = 0 ---
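
For readers unfamiliar with this kind of report, the check that produces a "lock order reversal" message can be sketched in user space. This is only an illustration, not the kernel's witness code: the names lo_reset, lo_acquire, lo_release and the LC_* lock classes are hypothetical, and a single global "held" list stands in for the per-thread tracking a real checker needs. It also assumes that the opposite order (zfs, then syncer) was recorded earlier, which the trace above does not show.

```c
#include <assert.h>
#include <string.h>

#define MAX_CLASSES 8
#define MAX_HELD    8

/* Hypothetical lock classes, named after the two in the report. */
enum { LC_SYNCER = 0, LC_ZFS = 1 };

/* order_seen[a][b] != 0: class b was once acquired while a was held. */
static int order_seen[MAX_CLASSES][MAX_CLASSES];
static int held[MAX_HELD];  /* classes currently held, oldest first */
static int nheld;

/* Forget all recorded orders and held locks. */
void lo_reset(void)
{
    memset(order_seen, 0, sizeof(order_seen));
    nheld = 0;
}

/*
 * Record the acquisition of class cls.
 * Returns 1 if it reverses an order seen before, 0 otherwise.
 */
int lo_acquire(int cls)
{
    int reversal = 0;

    assert(cls >= 0 && cls < MAX_CLASSES);
    assert(nheld < MAX_HELD);
    for (int i = 0; i < nheld; i++) {
        int h = held[i];

        if (order_seen[cls][h])     /* cls -> h was seen; now h -> cls */
            reversal = 1;
        order_seen[h][cls] = 1;     /* remember h -> cls */
    }
    held[nheld++] = cls;
    return reversal;
}

/* Drop the most recently acquired instance of class cls, if any. */
void lo_release(int cls)
{
    for (int i = nheld - 1; i >= 0; i--) {
        if (held[i] == cls) {
            for (int j = i; j < nheld - 1; j++)
                held[j] = held[j + 1];
            nheld--;
            return;
        }
    }
}
```

Replaying "zfs, then syncer" followed later by "syncer, then zfs" makes the second lo_acquire(LC_ZFS) return 1, which corresponds to the 1st/2nd pair printed in the report. A reversal flagged this way is only a potential deadlock; whether it is the cause of the hang described here is not established by the sketch.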

When the deadlock occurs, syncer is stuck in the ZIL commit (zil_commit):
procstat:   20 100179 vnlru            -                mi_switch sleepq_timedwait _sleep zfs_zget zfs_get_data zil_commit zfs_freebsd_write VOP_WRITE_APV vnode_pager_generic_putpages VOP_PUTPAGES_APV vnode_pager_putpages vm_pageout_flush vm_object_page_collect_flush vm_object_page_clean vm_object_terminate vnode_destroy_vobject zfs_freebsd_reclaim VOP_RECLAIM_APV
procstat:   21 100180 syncer           -                mi_switch sleepq_wait _cv_wait zil_commit zfs_sync sync_fsync VOP_FSYNC_APV sync_vnode sched_sync fork_exit fork_trampoline
debug.procstat:63429 101808 rrdtool          -                mi_switch sleepq_catch_signals sleepq_wait_sig _sleep pipe_read dofileread kern_readv read syscallenter syscall Xfast_syscall
procstat:63430 101562 rrdtool          -                mi_switch sleepq_catch_signals sleepq_wait_sig _sleep pipe_read dofileread kern_readv read syscallenter syscall Xfast_syscall

ps:    0    20     0   0  46  0     0    16 zcolli DL    ??    1:36.89 [vnlru]
ps:    0    21     0   0  51  0     0    16 zilog- DL    ??  151:27.35 [syncer]
ps:   80 63429 63419   0  48  0 13280  2032 piperd I     ??    2:36.48 /usr/local/bin/rrdtool -
ps:   80 63430 63419   0  76  0 13280  1712 piperd I     ??    0:00.01 /usr/local/bin/rrdtool -

At that moment, the following locks are held:
db> show alllocks
Process 69468 (sshd) thread 0xffffff0f49769460 (102462)
exclusive sx so_rcv_sx (so_rcv_sx) r = 0 (0xffffff131329a0f8) locked @ /usr/src/sys/kern/uipc_sockbuf.c:148
Process 63419 (php) thread 0xffffff115a574000 (102314)
shared lockmgr zfs (zfs) r = 0 (0xffffff0e7822e448) locked @ /usr/src/sys/kern/vfs_subr.c:2137
Process 21 (syncer) thread 0xffffff0027d96000 (100180)
exclusive lockmgr syncer (syncer) r = 0 (0xffffff0045d94098) locked @ /usr/src/sys/kern/vfs_subr.c:1737
Process 20 (vnlru) thread 0xffffff0027d96460 (100179)
exclusive lockmgr zfs (zfs) r = 0 (0xffffff140637f620) locked @ /usr/src/sys/kern/vfs_subr.c:2249

Excerpts from vfs_subr.c around the lines named above (each excerpt is preceded by its starting line number):
1732:
                vdrop(vp);
                mtx_lock(&sync_mtx);
                return (*bo == LIST_FIRST(slp));
        }
        vn_lock(vp, LK_EXCLUSIVE | LK_RETRY);
        (void) VOP_FSYNC(vp, MNT_LAZY, td);
        VOP_UNLOCK(vp, 0);
        vn_finished_write(mp);
        BO_LOCK(*bo);
        if (((*bo)->bo_flag & BO_ONWORKLST) != 0) {
2132:

        if ((flags & LK_INTERLOCK) == 0)
                VI_LOCK(vp);
        vholdl(vp);
        if ((error = vn_lock(vp, flags | LK_INTERLOCK)) != 0) {
                vdrop(vp);
                CTR2(KTR_VFS, "%s: impossible to lock vnode %p", __func__,
                    vp);
                return (error);
        }
2244:
         */
        vp->v_iflag |= VI_OWEINACT;
        switch (func) {
        case VPUTX_VRELE:
                error = vn_lock(vp, LK_EXCLUSIVE | LK_INTERLOCK);
                VI_LOCK(vp);
                break;
        case VPUTX_VPUT:
                if (VOP_ISLOCKED(vp) != LK_EXCLUSIVE) {
                        error = VOP_LOCK(vp, LK_UPGRADE | LK_INTERLOCK |


The deadlock is per filesystem: other ZFS filesystems in the same pool as the deadlocked one keep working.
>How-To-Repeat:
Create a ZFS filesystem with a large number (about 300k) of RRD files and update them periodically.
>Fix:
It seems that setting kern.maxvnodes to a higher value may work around this hang; at least 100.chksetuid can then be run without triggering the deadlock.

>Release-Note:
>Audit-Trail:
>Unformatted:


