From: John Baldwin
To: freebsd-stable@freebsd.org
Cc: John Hickey
Date: Mon, 31 Jan 2011 12:00:29 -0500
Subject: Re: nfsd hung on ufs vnode lock
Message-Id: <201101311200.29897.jhb@freebsd.org>

On Friday, January 28, 2011 8:10:41 pm John Hickey wrote:
> There was a
> previous thread about this, but it doesn't look like there was any resolution:
>
> http://lists.freebsd.org/pipermail/freebsd-stable/2010-May/056986.html
>
> I run a fileserver for an Emulab (www.emulab.net) system. As such, the
> exports table is constantly modified as experiments are swapped in and
> out. We also get a lot of researchers using NFS for strange things. In
> this case, the exclusive lock was for a cache directory shared by about
> 36 machines running Ubuntu 8.04 and mounting with NFSv2. Eventually, all
> our nfsd processes get stuck since the exclusive lock for the directory
> is never released. I could use any and all pointers on getting this fixed.
>
> What I am running:
>
> jjh@users: ~$ uname -a
> FreeBSD users.isi.deterlab.net 7.3-RELEASE-p2 FreeBSD 7.3-RELEASE-p2 #9: Tue Sep 14 16:24:57 PDT 2010 root@users.isi.deterlab.net:/usr/obj/usr/src/sys/USERS7 i386
>
> Here are the sleepchains for my system (note that 0xd1f72678 appears twice):
>
> 0xce089cf0: tag syncer, type VNON
>     usecount 1, writecount 0, refcount 2 mountedhere 0
>     flags ()
>     lock type syncer: EXCL (count 1) by thread 0xcdb4b000 (pid 46)
>
> 0xd1f72678: tag ufs, type VDIR
>     usecount 2, writecount 0, refcount 67 mountedhere 0
>     flags ()
>     v_object 0xd1e90e80 ref 0 pages 1
>     lock type ufs: EXCL (count 1) by thread 0xce1146c0 (pid 866) with 62 pending
>     ino 143173560, on dev mfid0s1f

From the stack trace, this vnode is the directory vnode that is the parent of
the new file being created.

> (kgdb) bt
> #0 sched_switch (td=0xce1146c0, newtd=Variable "newtd" is not available.
> ) at /usr/src/sys/kern/sched_ule.c:1936
> #1 0xc080a4a6 in mi_switch (flags=Variable "flags" is not available.
> ) at /usr/src/sys/kern/kern_synch.c:444
> #2 0xc0837aab in sleepq_switch (wchan=Variable "wchan" is not available.
> ) at /usr/src/sys/kern/subr_sleepqueue.c:497
> #3 0xc08380f6 in sleepq_wait (wchan=0xd4176394) at /usr/src/sys/kern/subr_sleepqueue.c:580
> #4 0xc080a92a in _sleep (ident=0xd4176394, lock=0xc0ceb498, priority=80, wmesg=0xc0bb656e "ufs", timo=0) at /usr/src/sys/kern/kern_synch.c:230
> #5 0xc07ea9fa in acquire (lkpp=0xcd7375a0, extflags=Variable "extflags" is not available.
> ) at /usr/src/sys/kern/kern_lock.c:151
> #6 0xc07eb2ec in _lockmgr (lkp=0xd4176394, flags=8194, interlkp=0xd41763c4, td=0xce1146c0, file=0xc0bc20c8 "/usr/src/sys/kern/vfs_subr.c", line=2062)
>     at /usr/src/sys/kern/kern_lock.c:384
> #7 0xc0a24765 in ffs_lock (ap=0xcd737608) at /usr/src/sys/ufs/ffs/ffs_vnops.c:377
> #8 0xc0b26876 in VOP_LOCK1_APV (vop=0xc0ca4740, a=0xcd737608) at vnode_if.c:1618
> #9 0xc0896d76 in _vn_lock (vp=0xd417633c, flags=8194, td=0xce1146c0, file=0xc0bc20c8 "/usr/src/sys/kern/vfs_subr.c", line=2062) at vnode_if.h:851

Note that this vnode (vp) doesn't show up in your list above. You can try
using my gdb scripts at www.freebsd.org/~jhb/gdb (you want gdb6* and do
'source gdb6'). You can then do 'vprint vp' at this frame and should see
lock details about who holds this lock. However, I would not expect the
vnode lock for a new i-node to be already held. There's a chance, though,
that you are tripping over the bug fixed by these two changes:

Author: jhb
Date: Fri Jul 16 20:23:24 2010
New Revision: 210173
URL: http://svn.freebsd.org/changeset/base/210173

Log:
  When the MNTK_EXTENDED_SHARED mount option was added, some filesystems were
  changed to defer the setting of VN_LOCK_ASHARE() (which clears LK_NOSHARE
  in the vnode lock's flags) until after they had determined if the vnode was
  a FIFO. This occurs after the vnode has been inserted into a VFS hash or
  some similar table, so it is possible for another thread to find this vnode
  via vget() on an i-node number and block on the vnode lock.
  If the lockmgr
  interlock (vnode interlock for vnode locks) is not held when clearing the
  LK_NOSHARE flag, then the lk_flags field can be clobbered. As a result
  the thread blocked on the vnode lock may never get woken up. Fix this by
  holding the vnode interlock while modifying the lock flags in this case.

  The softupdates code also toggles LK_NOSHARE in one function to close a
  race with snapshots. Fix this code to grab the interlock while fiddling
  with lk_flags.

Author: jhb
Date: Fri Aug 20 20:58:57 2010
New Revision: 211533
URL: http://svn.freebsd.org/changeset/base/211533

Log:
  Revert 210173 as it did not properly fix the bug. It assumed that the
  VI_LOCK() for a given vnode was used as the internal interlock for that
  vnode's v_lock lockmgr lock. This is not the case. Instead, add dedicated
  routines to toggle the LK_NOSHARE and LK_CANRECURSE flags. These routines
  lock the lockmgr lock's internal interlock to synchronize the updates to
  the flags member with other threads attempting to acquire the lock. The
  VN_LOCK_A*() macros now invoke these routines, and the softupdates code
  uses these routines to temporarily enable recursion on buffer locks.

  Reviewed by: kib

-- 
John Baldwin
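
To make the idea in the two commit logs concrete, here is a minimal
user-space sketch in C. It is not the kernel code. It assumes a pthread
mutex standing in for the lockmgr lock's internal interlock, and the toy_*
function names and TOY_* flag values are hypothetical; they do not match
the real LK_* definitions or the routines added in r211533. The only point
it illustrates is that every read-modify-write of the flags word happens
with the interlock held.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical flag bits; the real LK_* values are defined by the kernel. */
#define TOY_NOSHARE    0x1u  /* shared requests are treated as exclusive */
#define TOY_CANRECURSE 0x2u  /* the owner may re-acquire the lock */
#define TOY_WAITERS    0x4u  /* some thread is asleep waiting for the lock */

struct toy_lock {
    pthread_mutex_t interlock; /* stands in for the lock's internal interlock */
    unsigned int flags;        /* stands in for lk_flags */
};

void toy_lock_init(struct toy_lock *lk)
{
    pthread_mutex_init(&lk->interlock, NULL);
    lk->flags = TOY_NOSHARE;   /* start with sharing disabled */
}

/*
 * Single choke point for flag changes. The load, modify, and store all
 * happen under the interlock, so an update from one thread cannot
 * overwrite a bit that another thread set in between.
 */
void toy_flags_update(struct toy_lock *lk, unsigned int set, unsigned int clear)
{
    assert((set & clear) == 0);
    pthread_mutex_lock(&lk->interlock);
    lk->flags = (lk->flags | set) & ~clear;
    pthread_mutex_unlock(&lk->interlock);
}

/* Dedicated toggles, in the spirit of the routines described in r211533. */
void toy_allow_share(struct toy_lock *lk)      { toy_flags_update(lk, 0, TOY_NOSHARE); }
void toy_allow_recurse(struct toy_lock *lk)    { toy_flags_update(lk, TOY_CANRECURSE, 0); }
void toy_disallow_recurse(struct toy_lock *lk) { toy_flags_update(lk, 0, TOY_CANRECURSE); }

/* What a blocking acquirer would do before going to sleep. */
void toy_note_waiter(struct toy_lock *lk)      { toy_flags_update(lk, TOY_WAITERS, 0); }

/* Read the flags word under the interlock. */
unsigned int toy_flags(struct toy_lock *lk)
{
    pthread_mutex_lock(&lk->interlock);
    unsigned int f = lk->flags;
    pthread_mutex_unlock(&lk->interlock);
    return f;
}
```

In this sketch, if one thread calls toy_note_waiter() while another calls
toy_allow_share(), the waiter bit survives regardless of ordering. With an
unlocked "flags &= ~TOY_NOSHARE", a stale value could be written back and
the waiter bit lost, which is the kind of clobbering the first commit log
describes as leaving a thread asleep forever.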