From owner-freebsd-bugs Sat May 24 00:53:20 1997
Return-Path:
Received: (from root@localhost)
          by hub.freebsd.org (8.8.5/8.8.5) id AAA12396
          for bugs-outgoing; Sat, 24 May 1997 00:53:20 -0700 (PDT)
Received: from nlsystems.com (nlsys.demon.co.uk [158.152.125.33])
          by hub.freebsd.org (8.8.5/8.8.5) with ESMTP id AAA12391
          for ; Sat, 24 May 1997 00:53:17 -0700 (PDT)
Received: from herring.nlsystems.com (herring.nlsystems.com [10.0.0.2])
          by nlsystems.com (8.8.5/8.8.5) with SMTP id IAA06972;
          Sat, 24 May 1997 08:53:15 +0100 (BST)
Date: Sat, 24 May 1997 08:53:15 +0100 (BST)
From: Doug Rabson
To: Tor Egge
cc: freebsd-bugs@hub.freebsd.org
Subject: Re: kern/3581: trap 12 in lockstatus()
In-Reply-To: <199705240254.EAA17882@pat.idt.unit.no>
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-bugs@FreeBSD.ORG
X-Loop: FreeBSD.org
Precedence: bulk

On Sat, 24 May 1997, Tor Egge wrote:

>
> Some more info:
>
> skarven:~$ ps axl -M /var/crash/vmcore.1 | grep 28620
> 28620 9144 151711 0 -14 0 972 0 vn_loc DEs p0- 0:00.00 (bash)
> 28620 9159 151711 257 28 0 384 0 - Z p0- 0:00.00 (desclient-
>
> Here the vnode for /dev/ttyp0 had two references, one from the
> controlling tty, and one from the system file table.
>
> process 9144 and 9159 got SIGHUP and started to exit. Since process
> 9144 was the session leader, it called vop_revoke which called vgone
> which called vclean. When vclean was called, vp->v_usecount was
> 2. vclean increased vp->usecount to 3. Then vclean blocked in
> vinvalbuf (in order to read inodes from disks, to update time stamps).
>
> Process 9159 then continued to close its file descriptors, reducing
> the reference counts in the system file table to 0 for the
> stdin/stdout/stderr entry which referenced the vnode for
> /dev/ttyp0. Thus vn_close was called, and v_usecount was reduced from
> 3 to 2. Process 9159 then became a zombie.
>
> Then process 9144 continued, called VOP_CLOSE, and a special hack
> removed the controlling terminal and reduced v_usecount from 2 to 1.
> When vrele was called, v_usecount became 0, and vn_lock was called
> with a deadlock as a result.

There is also the VOP_ISLOCKED race to deal with, which I think is more
common. I have seen it a couple of times. One solution would be to
change *all* calls to VOP_ISLOCKED into calls to a vn_islocked wrapper,
which would check VXLOCK before calling the filesystem. Another would be
to change every filesystem's VOP_ISLOCKED to check VXLOCK itself.

--
Doug Rabson                             Mail:  dfr@nlsystems.com
Nonlinear Systems Ltd.                  Phone: +44 181 951 1891
                                        Fax:   +44 181 381 1039
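
For illustration, the vn_islocked idea suggested above might look
roughly like the following. This is only a userspace sketch, not kernel
code: struct vnode, the v_islocked function pointer, mockfs_islocked and
struct mock_inode are mock stand-ins invented here, the VXLOCK value is
illustrative, and reporting "locked" while VXLOCK is set is an assumed
policy that the message above does not specify.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative flag value; the real definition lives in the kernel headers. */
#define VXLOCK 0x0100           /* vnode is being cleaned (vclean in progress) */

/* Mock vnode: only the fields this sketch needs. */
struct vnode {
    unsigned long v_flag;                   /* vnode flags, including VXLOCK */
    int (*v_islocked)(struct vnode *vp);    /* stand-in for the fs's VOP_ISLOCKED */
    void *v_data;                           /* per-filesystem private data */
};

/* Mock per-filesystem data (hypothetical). */
struct mock_inode {
    int i_locked;
};

/*
 * Mock filesystem implementation. It dereferences v_data, which this
 * sketch assumes may already be gone while the vnode is being cleaned.
 */
static int
mockfs_islocked(struct vnode *vp)
{
    struct mock_inode *ip = vp->v_data;

    return ip->i_locked;
}

/* Stand-in for the VOP_ISLOCKED dispatch into the filesystem. */
static int
VOP_ISLOCKED(struct vnode *vp)
{
    return vp->v_islocked(vp);
}

/*
 * Hypothetical wrapper: check VXLOCK first and only call into the
 * filesystem when the vnode is not being cleaned. Returning 1 here
 * is an assumption made for this sketch.
 */
static int
vn_islocked(struct vnode *vp)
{
    assert(vp != NULL);
    if (vp->v_flag & VXLOCK)
        return 1;
    return VOP_ISLOCKED(vp);
}
```

With the wrapper, the check lives in one place but every caller has to
be converted. Putting the check into each filesystem's VOP_ISLOCKED
leaves callers alone at the cost of repeating the test per filesystem.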