Date: Wed, 18 Jan 2012 14:07:27 +0400 From: Eygene Ryabinkin <rea@freebsd.org> To: Kostik Belousov <kostikbel@gmail.com> Cc: fs@freebsd.org Subject: Re: kern/164261: [patch] fix panic with NFS served from NULLFS Message-ID: <3YuyVoVrStnLgh/yaJvBsNy18nw@HbohoBmewgxm0atwUoKO7zhAAgw> In-Reply-To: <20120118041439.GU31224@deviant.kiev.zoral.com.ua> References: <20120117202853.19F65DA81C@void.codelabs.ru> <20120118041439.GU31224@deviant.kiev.zoral.com.ua>
next in thread | previous in thread | raw e-mail | index | archive | help
--4Ckj6UjgE2iN1+kY Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable Konstantin, good day. Wed, Jan 18, 2012 at 06:14:39AM +0200, Kostik Belousov wrote: > On Wed, Jan 18, 2012 at 12:28:53AM +0400, Eygene Ryabinkin wrote: > > Patches > > http://codelabs.ru/fbsd/prs/2012-jan-nullfs-LK_SHARED/0001-NULLFS-pro= perly-destroy-node-hash.patch > This one is probably fine, assuming that the hashes are properly cleared > on unmount. Feel free to commit. Will do, thanks! > > and > > http://codelabs.ru/fbsd/prs/2012-jan-nullfs-LK_SHARED/0002-NULLFS-fix= -panics-when-lowervp-is-locked-with-LK_SHA.patch > > will fix the problem (in reality, the first patch is just some > > nitpicking). > And I do not even read this. >=20 > The issue that the backtrace is pointing to seems to be the misuse of vre= le(), > after the vnode lock is switched to null vnode v_lock. Since the vnode th= at > is being thrown out is exclusively locked, cleanup path shall do vput() > instead of vrele(). The short story: at this vrele, vp had returned to its own v_lock as v_vnlo= ck, so it is unlocked here. The long story with some thoughts and questions. If vput() path will end up in null_reclaim(), this seems to be unhelpful: - VOP_RECLAIM() expects exclusively locked vnode, since it was instructed to by vnode_if.src and thus vnode_if.c has ASSERT_VOP_ELOCKED(a->a_vp) in VOP_RECLAIM_APV(); for nullfs vop_islocked is vop_stdislocked() and it checks the lock status of v_vnlock, so anything that comes to null_re= claim will be exclusively locked with *v_vnlock; - null_reclaim() has the following code, {{{ struct vnode *vp =3D ap->a_vp; [...] lockmgr(&vp->v_lock, LK_EXCLUSIVE, NULL) [...]] }}} And when vp->v_lock is equal to *v_vnlock, this will lead to the lockmgr panic, because the thread tries to exclusively lock the object that was already locked by itself and has no recursion rights. If anyone sees flaws in this explanations, please, point me to them. I had recompiled the kernel with your vrele -> vput change inside null_reclaim and with DEBUG_VFS_LOCKS: it resulted in the lock violation =66rom insmntque, http://codelabs.ru/fbsd/prs/2012-jan-nullfs-LK_SHARED/DEBUG_VFS_LOCKS-pan= ic.txt So, it hadn't got to the point where null_reclaim() will come to the game. The problem is that insmntque1() wants the passed vnode to be exclusively locked, but nfsrvd_readdirplus() uses LK_SHARED. By the way, the ASSERT_VOP_ELOCKED() was introduced in r182364 by you: why do you insist on exclusive lock for MP-safe fs? The current interplay of NFS and NULLFS makes me think that either some of filesystems isn't real= ly MP-safe, or the requirement for exclusive locking can be relaxed. But to move on, I had removed ASSERT_VOP_ELOCKED() from insmntque1 and did the test once again. Another panic, now from vputx(), http://codelabs.ru/fbsd/prs/2012-jan-nullfs-LK_SHARED/DEBUG_VFS_LOCKS-no-= MPsafe-check-panic.txt and it hits a really good assertion: we had switched to the native lock (vp->v_lock) and it is not locked at all. So, I think that vrele() is the appropriate call for the race loser return path inside null_nodeget(). > Despite above, vrele(lowervp) call is be fine, despite lowervp also > being locked exclusively, because usecount for the vnode must be > 1, > due to null_hashins() successfully found the ovp in the hash. >=20 > Try the change below. > > diff --git a/sys/fs/nullfs/null_subr.c b/sys/fs/nullfs/null_subr.c > index 319e404..6ba7508 100644 > --- a/sys/fs/nullfs/null_subr.c > +++ b/sys/fs/nullfs/null_subr.c > @@ -252,7 +252,7 @@ null_nodeget(mp, lowervp, vpp) > vrele(lowervp); > vp->v_vnlock =3D &vp->v_lock; > xp->null_lowervp =3D NULL; > - vrele(vp); > + vput(vp); > return (0); > } > *vpp =3D vp; It will also need to remove {{{ panic("null_reclaim: reclaiming a node with no lowervp"); }}} =66rom null_reclaim(), since this function is not ready for null vnodes released by race losers. And my intention to add the code (the second patch) that will not pass such half-baked vnodes into existence was precisely to make null_reclaim to be simple and to add all internal complexity of getting new null vnodes to the null_nodeget() and null_subr.c. --=20 Eygene Ryabinkin ,,,^..^,,, [ Life's unfair - but root password helps! | codelabs.ru ] [ 82FE 06BC D497 C0DE 49EC 4FF0 16AF 9EAE 8152 ECFB | freebsd.org ] --4Ckj6UjgE2iN1+kY Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.18 (FreeBSD) iF4EABEIAAYFAk8Wmd8ACgkQFq+eroFS7PtTFgD/eaU4z/EUV6XNH5/SzCXzt3h4 Bk3//7H6XToBFdpZoLcA/1JgJA3Ki0FKgIQaCTA93iYQnXFKkX8+OzP1p9ob2GC3 =I/5l -----END PGP SIGNATURE----- --4Ckj6UjgE2iN1+kY--
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?3YuyVoVrStnLgh/yaJvBsNy18nw>