From owner-freebsd-current Mon Jun 28 6:49:59 1999
Delivered-To: freebsd-current@freebsd.org
Received: from overcee.netplex.com.au (overcee.netplex.com.au [202.12.86.7])
    by hub.freebsd.org (Postfix) with ESMTP id 6D69214DA6
    for ; Mon, 28 Jun 1999 06:49:50 -0700 (PDT)
    (envelope-from peter@netplex.com.au)
Received: from netplex.com.au (localhost [127.0.0.1])
    by overcee.netplex.com.au (Postfix) with ESMTP id ADBA882;
    Mon, 28 Jun 1999 21:49:48 +0800 (WST)
    (envelope-from peter@netplex.com.au)
X-Mailer: exmh version 2.0.2 2/24/98
To: Greg Lehey
Cc: Kirk McKusick , Matthew Dillon , Alan Cox , Julian Elischer ,
    Mike Smith , "John S. Dyson" , dg@root.com, dyson@iquest.net,
    current@freebsd.org
Subject: Re: Found the startup panic - ccd ( patch included )
In-reply-to: Your message of "Mon, 28 Jun 1999 18:32:06 +0930."
    <19990628183206.T43194@freebie.lemis.com>
Date: Mon, 28 Jun 1999 21:49:48 +0800
From: Peter Wemm
Message-Id: <19990628134948.ADBA882@overcee.netplex.com.au>
Sender: owner-freebsd-current@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

Greg Lehey wrote:
> On Monday, 28 June 1999 at 16:36:31 +0800, Peter Wemm wrote:
> > Kirk McKusick wrote:
> > [..]
> >> Greg Lehey has sent me a panic with the buffer locking in the NFS code.
> >> I am too tired to attack it tonight, but will look at it in the morning.
> >
> > I might have a look if I get a chance..
>
> I've been collecting them :-) The first one looks like this:
>
> Debugger (msg=0xc025e7fb "panic") at ../../i386/i386/db_interface.c:326
> 326 }
> #0 Debugger (msg=0xc025e7fb "panic") at ../../i386/i386/db_interface.c:326
> #1 0xc0153474 in panic (fmt=0xc02676a0 "nfs_strategy: buffer %p not locked") at ../../kern/kern_shutdown.c:450
[..]
> It happens when I do just about any NFS write; I've been reproducing
> it with 'make depend' in the NFS-mounted kernel build directory. I'll
> try to get a dump and send you both a message about where you can find
> it.

All things being equal, this should be fixed; there was a negated
assertion in nfs_strategy. :-)

> The other one is even more simple to reproduce:
>
> $ dd if=/dev/da0d of=/dev/null bs=120b
>
> Debugger (msg=0xc025e7fb "panic") at ../../i386/i386/db_interface.c:326
> 326 }
> #0 Debugger (msg=0xc025e7fb "panic") at ../../i386/i386/db_interface.c:326
> #1 0xc0153474 in panic (fmt=0xc025d9a0 "lockmgr: locking against myself") at ../../kern/kern_shutdown.c:450
> #2 0xc014eafb in debuglockmgr (lkp=0xc2368c4c, flags=0x10022, interlkp=0xc02c4564, p=0xc68aa3c0,
>     name=0xc0261452 "lockmgr", file=0xc026145a "../../sys/buf.h", line=0x11b) at ../../kern/kern_lock.c:341
> #3 0xc0173fc2 in getblk (vp=0xc6dcd780, blkno=0x20, size=0x2000, slpflag=0x0, slptimeo=0x0) at ../../sys/buf.h:283
> #4 0xc0172531 in breadn (vp=0xc6dcd780, blkno=0x20, size=0x2000, rablkno=0xc6deee44, rabsize=0xc6deee48, cnt=0x1,
>     cred=0x0, bpp=0xc6deee4c) at ../../kern/vfs_bio.c:433
> #5 0xc01864d5 in spec_read (ap=0xc6deeeb4) at ../../miscfs/specfs/spec_vnops.c:308
> #6 0xc01f8378 in ufsspec_read (ap=0xc6deeeb4) at ../../ufs/ufs/ufs_vnops.c:1826
> #7 0xc01f8931 in ufs_vnoperatespec (ap=0xc6deeeb4) at ../../ufs/ufs/ufs_vnops.c:2327
> #8 0xc0180684 in vn_read (fp=0xc0f56d80, uio=0xc6deeefc, cred=0xc0f52200, flags=0x0) at vnode_if.h:303
> #9 0xc015f157 in dofileread (p=0xc68aa3c0, fp=0xc0f56d80, fd=0x3, buf=0x805d000, nbyte=0xf000,
>     offset=0xffffffffffffffff, flags=0x0) at ../../kern/sys_generic.c:179
> #10 0xc015f067 in read (p=0xc68aa3c0, uap=0xc6deef80) at ../../kern/sys_generic.c:111
> #11 0xc022aee6 in syscall (frame={tf_fs = 0x2f, tf_es = 0x2f, tf_ds = 0x2f, tf_edi = 0xbfbfcfdc, tf_esi = 0xbfbfcfc8,
>     tf_ebp = 0xbfbfcf90, tf_isp = 0xc6deefd4, tf_ebx = 0xbfbfcfc8, tf_edx = 0x4, tf_ecx = 0x80580a0, tf_eax = 0x3,
>     tf_trapno = 0x16, tf_err = 0x2, tf_eip = 0x8049eec, tf_cs = 0x1f, tf_eflags = 0x246, tf_esp = 0xbfbfcf80,
>     tf_ss = 0x2f}) at ../../i386/i386/trap.c:1056
> #12 0xc021c960 in Xint0x80_syscall ()
> #13 0x8048c6d in ?? ()
> #14 0x80480e9 in ?? ()
>
> For some reason, breadn() doesn't seem to work any more. It works
> fine with bread(), but I haven't localized how the code differs.

Hmm...

> I've applied Kirk's patch to lockmgr, which seemed to relate to the
> breadn() problem, but it then found a different set of flags to pass
> to lockmgr (LK_SLEEPFAIL instead of LK_NOWAIT) and still cause the
> panic.

There are also some serious problems with the swap pager.. For starters,
BUF_KERNPROC() internally checks that B_ASYNC is set, because that's when
biodone() frees the lock. However, B_CALL may or may not free the lock
too; for example, swp_pager_async_iodone() in swap_pager.c. Also, the
swap IO clustering/chaining etc. is troubled, since it turns B_ASYNC on
and off and does its own freeing and waiting.

I haven't got my brain around it yet, but I don't think it's going to be
too hard to fix. Incidentally, I do find Matt's code and comments quite
nice to work with compared to some obscure areas.

> Greg

Cheers,
-Peter
--
Peter Wemm - peter@FreeBSD.org; peter@yahoo-inc.com; peter@netplex.com.au
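
The lock handoff discussed above (a buffer lock taken by a process, handed to the kernel for async I/O, and released at completion) can be sketched as a toy model. This is only an illustration under stated assumptions, not the kernel's code: every name prefixed `toy_` or `TOY_` is hypothetical, the flag values are made up, and the completion logic is simplified to just the behaviour described in the message (an async buffer is unlocked at completion, while a buffer with a completion callback leaves that decision to the callback).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag values for this sketch only; not the kernel's. */
enum { TOY_B_ASYNC = 0x1, TOY_B_CALL = 0x2 };

#define TOY_UNLOCKED (-1)  /* nobody holds the lock */
#define TOY_KERNPROC (-2)  /* held on behalf of the kernel, not a process */

struct toy_buf {
    int flags;
    int owner;                         /* pid, TOY_KERNPROC or TOY_UNLOCKED */
    void (*iodone)(struct toy_buf *);  /* completion callback (TOY_B_CALL) */
};

/* Non-recursive acquire: fails if the lock is already held by anyone,
 * including the caller (the "locking against myself" situation). */
static bool toy_lock(struct toy_buf *bp, int pid)
{
    if (bp->owner != TOY_UNLOCKED)
        return false;
    bp->owner = pid;
    return true;
}

static void toy_unlock(struct toy_buf *bp)
{
    assert(bp->owner != TOY_UNLOCKED);  /* unlocking a free lock is a bug */
    bp->owner = TOY_UNLOCKED;
}

/* Hand the lock from the process to the kernel. Refused unless the
 * buffer is async, since only then will completion release the lock. */
static bool toy_kernproc(struct toy_buf *bp, int pid)
{
    if (bp->owner != pid || !(bp->flags & TOY_B_ASYNC))
        return false;
    bp->owner = TOY_KERNPROC;
    return true;
}

/* I/O completion. With a callback, the callback decides whether to
 * unlock; without one, only async buffers are unlocked here. */
static void toy_biodone(struct toy_buf *bp)
{
    if (bp->flags & TOY_B_CALL) {
        if (bp->iodone != NULL)
            bp->iodone(bp);
        return;
    }
    if (bp->flags & TOY_B_ASYNC)
        toy_unlock(bp);
}
```

In this model a buffer that has only `TOY_B_CALL` set cannot be handed over with `toy_kernproc()`, even if its callback would go on to release the lock. That is the kind of mismatch between the handoff check and the actual release path that the swap pager paragraph points at.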