Skip site navigation (1)Skip section navigation (2)
Date:      Tue, 09 Oct 2012 11:29:58 +0300
From:      Andriy Gapon <avg@FreeBSD.org>
To:        "Justin T. Gibbs" <gibbs@scsiguy.com>, Pawel Jakub Dawidek <pjd@FreeBSD.org>, Konstantin Belousov <kib@FreeBSD.org>
Cc:        fs@FreeBSD.org
Subject:   Re: ZFS: Deadlock during vnode recycling
Message-ID:  <5073E086.8090508@FreeBSD.org>
Resent-Message-ID: <5073EA39.7080403@FreeBSD.org>
In-Reply-To: <5071A071.1020800@FreeBSD.org>
References:  <76CBA055-021F-458D-8978-E9A973D9B783@scsiguy.com> <506EB43B.8050204@FreeBSD.org> <5071A071.1020800@FreeBSD.org>

next in thread | previous in thread | raw e-mail | index | archive | help
on 07/10/2012 18:32 Andriy Gapon said the following:
> In fact here is a real patch that I would like to propose:
> http://people.freebsd.org/~avg/zfs-getnewvnode_reserve.diff
> 
> The patch incorporates the kib's patch for extending VFS API that helps to avoid
> entering the vnode inactive/reclaim path from getnewvnode.
> 
> The patch should fix the problem reported in this thread and the problem from
> "panic: _sx_xlock_hard: recursed on non-recursive sx zfsvfs->z_hold_mtx ..."
> thread that runs in parallel.
> 
> Reviews and testing are welcome.
> 
> Here is a draft of a commit message that also provides some additional details:
> 
> zfs: overhaul zfs_freebsd_reclaim and zfs_zget...
> 
> now that we do not need to fear recursion from getnewvnode into
> zfs_inactive and zfs_freebsd_reclaim.
> This removes the need for the delayed destruction of znodes via taskqueue,
> thus making znode/vnode state machine a bit simpler.
> Also, try to make zfs_zget saner with respected to doomed vnodes to avoid
> a deadlock when zfs_zget is called from zfs_freebsd_recycle.
> 
> To do: pass locking flags parameter to zfs_zget, so that the zfs-vfs glue
> code doesn't have to re-lock a vnode but could ask for proper locking
> from the very start.
> 
> 
> The patch also drops some redundant interlock acquisitions, since both
> vop_inactive and vop_reclaim are called with exclusive vnode lock held.

Turns out that it is a bad idea to get vnode lock in zfs_zget (e.g. via vn_lock
or vget).  If we come to zfs_zget from zil_commit it may happen that we may try
to grab a lock of a vnode that is already locked by a different thread
and that thread might be waiting for us to exit zil_commit.
Hence a deadlock between the threads.

Here is an updated patch:
http://people.freebsd.org/~avg/zfs-getnewvnode_reserve.1.diff
It reverts part of zfs_zget logic closer to the original code (no vget call).

-- 
Andriy Gapon



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?5073E086.8090508>