Date: Tue, 09 Oct 2012 11:29:58 +0300 From: Andriy Gapon <avg@FreeBSD.org> To: "Justin T. Gibbs" <gibbs@scsiguy.com>, Pawel Jakub Dawidek <pjd@FreeBSD.org>, Konstantin Belousov <kib@FreeBSD.org> Cc: fs@FreeBSD.org Subject: Re: ZFS: Deadlock during vnode recycling Message-ID: <5073E086.8090508@FreeBSD.org> Resent-Message-ID: <5073EA39.7080403@FreeBSD.org> In-Reply-To: <5071A071.1020800@FreeBSD.org> References: <76CBA055-021F-458D-8978-E9A973D9B783@scsiguy.com> <506EB43B.8050204@FreeBSD.org> <5071A071.1020800@FreeBSD.org>
next in thread | previous in thread | raw e-mail | index | archive | help
on 07/10/2012 18:32 Andriy Gapon said the following: > In fact here is a real patch that I would like to propose: > http://people.freebsd.org/~avg/zfs-getnewvnode_reserve.diff > > The patch incorporates the kib's patch for extending VFS API that helps to avoid > entering the vnode inactive/reclaim path from getnewvnode. > > The patch should fix the problem reported in this thread and the problem from > "panic: _sx_xlock_hard: recursed on non-recursive sx zfsvfs->z_hold_mtx ..." > thread that runs in parallel. > > Reviews and testing are welcome. > > Here is a draft of a commit message that also provides some additional details: > > zfs: overhaul zfs_freebsd_reclaim and zfs_zget... > > now that we do not need to fear recursion from getnewvnode into > zfs_inactive and zfs_freebsd_reclaim. > This removes the need for the delayed destruction of znodes via taskqueue, > thus making znode/vnode state machine a bit simpler. > Also, try to make zfs_zget saner with respected to doomed vnodes to avoid > a deadlock when zfs_zget is called from zfs_freebsd_recycle. > > To do: pass locking flags parameter to zfs_zget, so that the zfs-vfs glue > code doesn't have to re-lock a vnode but could ask for proper locking > from the very start. > > > The patch also drops some redundant interlock acquisitions, since both > vop_inactive and vop_reclaim are called with exclusive vnode lock held. Turns out that it is a bad idea to get vnode lock in zfs_zget (e.g. via vn_lock or vget). If we come to zfs_zget from zil_commit it may happen that we may try to grab a lock of a vnode that is already locked by a different thread and that thread might be waiting for us to exit zil_commit. Hence a deadlock between the threads. Here is an updated patch: http://people.freebsd.org/~avg/zfs-getnewvnode_reserve.1.diff It reverts part of zfs_zget logic closer to the original code (no vget call). -- Andriy Gapon
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?5073E086.8090508>