Date: Wed, 2 Apr 2003 15:49:48 -0500 (EST)
From: Robert Watson
To: current@FreeBSD.org
Subject: Something in NFS server calling vrele() not vput()?

Unfortunately, I don't have too much information here. The scenario is as follows:

cboss: NFS file/build server
crash2: NFS diskless client

I built world on cboss; I then did installworld on crash2. I intended to installworld to a DESTDIR on a local disk on crash2, but I failed to mount it first, so the installworld had both the source and target in NFS. When I realized this had happened, I proceeded to rm -Rf the DESTDIR tree. Shortly thereafter, rm -Rf hung on a vnode lock, and other processes started to stack up going up the directory tree.

I have a little debugging information below that may be relevant: show lockedvnods shows two directories where the locks are held by rm (0xc40a25a0) and a later ls (0xc48bd2d0).
The last three entries are worrying because the refcount on each of these vnodes is 0, and the VI_FREE flag is set. Earlier in the debugging session, the VI_FREE flag wasn't set, so presumably the vnode was being freed following a removal (not unlikely with installs, renames, and removals). Interestingly, the last three entries in the locked vnode list were apparently grabbed by the nfs daemon. Unfortunately, we lost the pid entry in the lock structure, so I can't tell whether the thread pointer is stale and the struct thread has been reused. Given that the nfsd thread pool is pretty much static, I suspect that the locks were indeed grabbed by NFS, so some NFS operation may be calling vrele() instead of vput() (or the like). Alternatively, perhaps there's a race somewhere between ufs_inactive() and an NFS operation?

Any other thoughts would be welcome; unfortunately, no core dump is available.

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
robert@fledge.watson.org      Network Associates Laboratories

db> show lockedvnods
Locked vnodes
0xc5a2bc8c: tag ufs, type VDIR, usecount 3, writecount 0, refcount 1, flags (VV_OBJBUF), lock type ufs: EXCL (count 1) by thread 0xc48bd2d0
        ino 619588, on dev ad0s1g (4, 18)
0xc4fbd6d8: tag ufs, type VDIR, usecount 4, writecount 0, refcount 1, flags (VV_OBJBUF), lock type ufs: EXCL (count 1) by thread 0xc40a25a0 with 1 pending
        ino 1129851, on dev ad0s1g (4, 18)
0xc4d456d8: tag ufs, type VREG, usecount 1, writecount 0, refcount 0, flags (VV_OBJBUF), lock type ufs: EXCL (count 1) by thread 0xc3f78c30 with 1 pending
        ino 1129935, on dev ad0s1g (4, 18)
0xc5046248: tag ufs, type VREG, usecount 0, writecount 0, refcount 0, flags (VI_FREE|VV_OBJBUF), lock type ufs: EXCL (count 1) by thread 0xc3f78c30
        ino 1215822, on dev ad0s1g (4, 18)
0xc41fe920: tag ufs, type VREG, usecount 0, writecount 0, refcount 0, flags (VI_FREE|VV_OBJBUF), lock type ufs: EXCL (count 1) by thread 0xc3f78c30
        ino 1216038, on dev ad0s1g (4, 18)
db> cont
-> cboss#
db> trace 13805
mi_switch(c40a25a0,50,c5c4536c,dba91c40,0) at mi_switch+0x181
msleep(c4d45794,c058a2a0,50,c04fabf4,0) at msleep+0x43c
acquire(dba919d0,1000040,600,689bd73c,c40a25a0) at acquire+0xa0
lockmgr(c4d45794,1010002,c4d456d8,c40a25a0,dba919ec) at lockmgr+0x3f7
vop_stdlock(dba91a14,dba919f8,c0440778,dba91a14,dba91a38) at vop_stdlock+0x2c
vop_defaultop(dba91a14,dba91a38,c036629e,dba91a14,c4603e10) at vop_defaultop+0x18
ufs_vnoperate(dba91a14,c4603e10,dba91a5c,c043cb34,2) at ufs_vnoperate+0x18
vn_lock(c4d456d8,10002,c40a25a0,c034ca3a,c5bd7de2) at vn_lock+0x11e
vget(c4d456d8,2,c40a25a0,1064428,c40a25a0) at vget+0x100
vfs_cache_lookup(dba91b54,dba91b80,c0352122,dba91b54,20002) at vfs_cache_lookup+0x1ed
ufs_vnoperate(dba91b54,20002,c40a25a0,dba91b0c,c40a25a0) at ufs_vnoperate+0x18
lookup(dba91c18,c46f1800,400,dba91c34,c40a25a0) at lookup+0x302
namei(dba91c18,80bd948,60,0,c40a25a0) at namei+0x20b
lstat(c40a25a0,dba91d10,8,c40a25a0,2) at lstat+0x52
syscall(2f,2f,2f,80bda00,80b7040) at syscall+0x2aa
Xint0x80_syscall() at Xint0x80_syscall+0x1d
--- syscall (190, FreeBSD ELF32, lstat), eip = 0x804b45f, esp = 0xbfbff53c, ebp = 0xbfbff5c8 ---

(kgdb) inspect ((struct thread *)0xc3f78c30)->td_proc.p_pid
$1 = 404
(kgdb) inspect ((struct thread *)0xc3f78c30)->td_proc.p_comm
$2 = "nfsd\0er", '\0'
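
For anyone unfamiliar with the vrele()/vput() distinction behind the hypothesis above, here is a minimal userland sketch in C. It is not kernel code and not a claim about where the bug actually is. The names toy_vnode, toy_vget, toy_vrele, toy_vput and toy_leaked_lock are hypothetical stand-ins invented for illustration; the only behaviour assumed from the real interfaces is that vrele() drops a reference on a vnode the caller does not have locked, while vput() unlocks the vnode and then drops the reference.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a vnode: just a use count and an exclusive lock owner. */
struct toy_vnode {
	int usecount;           /* references held by callers */
	const void *lock_owner; /* thread holding the lock, NULL if unlocked */
};

/*
 * Take a reference and the exclusive lock, loosely modelled on a locking
 * vget(). Returns false where a real thread would sleep waiting for the lock.
 */
static bool
toy_vget(struct toy_vnode *vp, const void *td)
{
	if (vp->lock_owner != NULL)
		return (false);
	vp->usecount++;
	vp->lock_owner = td;
	return (true);
}

/* Drop a reference only. Correct only if the caller does not hold the lock. */
static void
toy_vrele(struct toy_vnode *vp)
{
	assert(vp->usecount > 0);
	vp->usecount--;
}

/* Unlock, then drop the reference. The right call for a locked vnode. */
static void
toy_vput(struct toy_vnode *vp)
{
	assert(vp->usecount > 0);
	vp->lock_owner = NULL;
	vp->usecount--;
}

/* True for the suspicious pattern in the listing: no users, still locked. */
static bool
toy_leaked_lock(const struct toy_vnode *vp)
{
	return (vp->usecount == 0 && vp->lock_owner != NULL);
}
```

In this model, a thread that does toy_vget() followed by toy_vrele() leaves the vnode with usecount 0 and the lock still owned, so toy_leaked_lock() returns true and every later toy_vget() on it fails. That is the same shape as the usecount 0, EXCL-locked entries owned by thread 0xc3f78c30 above. Pairing toy_vget() with toy_vput() leaves the vnode unlocked and usable.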