From owner-freebsd-bugs Sat May 24 01:12:05 1997
Return-Path:
Received: (from root@localhost)
          by hub.freebsd.org (8.8.5/8.8.5) id BAA13158
          for bugs-outgoing; Sat, 24 May 1997 01:12:05 -0700 (PDT)
Received: from spinner.dialix.com.au (spinner.dialix.com.au [192.203.228.67])
          by hub.freebsd.org (8.8.5/8.8.5) with ESMTP id BAA13127
          for ; Sat, 24 May 1997 01:12:00 -0700 (PDT)
Received: from spinner.dialix.com.au (localhost.dialix.com.au [127.0.0.1])
          by spinner.dialix.com.au with ESMTP id QAA20320;
          Sat, 24 May 1997 16:11:23 +0800 (WST)
Message-Id: <199705240811.QAA20320@spinner.dialix.com.au>
X-Mailer: exmh version 2.0gamma 1/27/96
To: Doug Rabson
cc: Tor Egge, freebsd-bugs@hub.freebsd.org
Subject: Re: kern/3581: trap 12 in lockstatus()
In-reply-to: Your message of "Sat, 24 May 1997 08:53:15 +0100."
Date: Sat, 24 May 1997 16:11:22 +0800
From: Peter Wemm
Sender: owner-bugs@FreeBSD.ORG
X-Loop: FreeBSD.org
Precedence: bulk

Doug Rabson wrote:
> On Sat, 24 May 1997, Tor Egge wrote:
> > Some more info:
> >
> > skarven:~$ ps axl -M /var/crash/vmcore.1 | grep 28620
> > 28620  9144 151711   0 -14  0   972    0 vn_loc DEs  p0-  0:00.00 (bash )
> > 28620  9159 151711 257  28  0   384    0 -      Z    p0-  0:00.00 (desc lient-
> >
> > Here the vnode for /dev/ttyp0 had two references, one from the
> > controlling tty, and one from the system file table.
> >
> > process 9144 and 9159 got SIGHUP and started to exit. Since process
> > 9144 was the session leader, it called vop_revoke which called vgone
> > which called vclean. When vclean was called, vp->v_usecount was
> > 2. vclean increased vp->usecount to 3. Then vclean blocked in
> > vinvalbuf (in order to read inodes from disks, to update time stamps).
> >
> > Process 9159 then continued to close its file descriptors, reducing
> > the reference counts in the system file table to 0 for the
> > stdin/stdout/stderr entry which referenced the vnode for
> > /dev/ttyp0. Thus vn_close was called, and v_usecount was reduced from
> > 3 to 2.
> > Process 9159 then became a zombie.
> >
> > Then process 9144 continued, called VOP_CLOSE, and a special hack
> > removed the controlling terminal and reduced v_usecount from 2 to 1.
> > When vrele was called, v_usecount became 0, and vn_lock was called
> > with a deadlock as a result.
>
> There is also the VOP_ISLOCKED race to deal with which, I think, is more
> common. I have seen it a couple of times. One solution would be to
> change *all* calls to VOP_ISLOCKED to vn_islocked which would check VXLOCK
> before calling the filesystem. Another would be to change all VFS'
> VOP_ISLOCKED to check VXLOCK.

I'm up to my eyeballs in a sweep over the code to implement poll. One of
the things that I've noticed is that many of the fs _lock routines simply
call the vop_nolock or whatever routines. There are some seriously ugly
and/or evil bits of code in there. :-(

> --
> Doug Rabson                     Mail:  dfr@nlsystems.com
> Nonlinear Systems Ltd.          Phone: +44 181 951 1891
>                                 Fax:   +44 181 381 1039

Cheers,
-Peter
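
For reference, the vn_islocked idea quoted above (check VXLOCK before
calling the filesystem) could look roughly like the userland sketch below.
This is not kernel code. The struct layout, the v_islocked function
pointer standing in for the per-filesystem VOP_ISLOCKED, the flag value,
and the fs_islocked test double are all made up for illustration. Returning
"locked" when VXLOCK is set is also only an assumption about what such a
wrapper might report.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userland model, not the real FreeBSD definitions. */
#define VXLOCK 0x0100UL  /* flag value chosen for this sketch: vnode is being cleaned */

struct vnode {
    unsigned long v_flag;               /* vnode flags, VXLOCK among them */
    int (*v_islocked)(struct vnode *);  /* stand-in for the fs VOP_ISLOCKED entry */
};

/* Counts how often the "filesystem" routine was actually reached. */
int fs_islocked_calls;

/* Hypothetical filesystem routine: reports the vnode as unlocked. */
int fs_islocked(struct vnode *vp)
{
    (void)vp;
    fs_islocked_calls++;
    return 0;
}

/*
 * Sketch of the proposed wrapper: look at VXLOCK first, and only call
 * into the filesystem when the vnode is not being cleaned out.
 */
int vn_islocked(struct vnode *vp)
{
    if (vp->v_flag & VXLOCK)
        return 1;                  /* assumption: treat a vnode under vclean as locked */
    return vp->v_islocked(vp);     /* safe to ask the filesystem */
}
```

With VXLOCK clear, the call falls through to the filesystem routine. Once
VXLOCK is set the wrapper answers on its own, so it does not matter if
the per-filesystem entry has already been torn down underneath it.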