Date: Tue, 27 Sep 2005 22:32:01 +0100 (BST)
From: Robert Watson <rwatson@FreeBSD.org>
To: Rob Watt <rob.watt@gmail.com>
Cc: Rob Watt <rob@hudson-trading.com>, mikep@hudson-trading.com, freebsd-amd64@freebsd.org, freebsd-hackers@freebsd.org, Jason Carroll <jason@hudson-trading.com>
Subject: Re: freebsd-5.4-stable panics
Message-ID: <20050927222624.R34322@fledge.watson.org>
In-Reply-To: <cf6c78405092714227722d534@mail.gmail.com>
References: <da4a53d805092310237d732554@mail.gmail.com> <20050925115912.H11229@fledge.watson.org> <20050927140535.G50334@daemon.mistermishap.net> <20050927203128.S61419@fledge.watson.org> <cf6c78405092714227722d534@mail.gmail.com>

On Tue, 27 Sep 2005, Rob Watt wrote:

> this is the piece of code that was referenced by the ip:
>
> (gdb) l *0xffffffff803b88ca
> 0xffffffff803b88ca is in nfsrv_lookup (/usr/src/sys/nfsserver/nfs_serv.c:670).
> 665         NFSD_UNLOCK();
> 666         mtx_lock(&Giant);       /* VFS */
> 667         if (dirp)
> 668                 vrele(dirp);
> 669         NDFREE(&nd, NDF_ONLY_PNBUF);
> 670         if (ndp->ni_startdir)
> 671                 vrele(ndp->ni_startdir);
> 672         if (ndp->ni_vp)
> 673                 vput(ndp->ni_vp);
> 674         mtx_unlock(&Giant);     /* VFS */
>
> we are not running nfsd (although we do use nfs and nfsiod), and none of
> our processes should have been accessing nfs. Our processes are run from
> an nfs mount but do not access any nfs mounted files.

That code is in the NFS server lookup code, so it should only be called as a
result of a lookup by a remote client. If the NFS server is not in use on
the machine, this is most likely a quirk of gdb and instruction pointers, a
run-time kernel/compile-time kernel mismatch, or something really nasty.
ndp should really never be NULL there, as it's used frequently prior to
that point. Let's hope for one of the first two options.

>> Do you have a testbed or set of test hosts set up so you can
>> non-disruptively test change sets, btw?
>
> yes we have 3 dual dual-core machines and 1 dual single-core machine
> that we can use to test with.

Great. As mentioned, I'll be offline for about the next 48 hours, but back
after then. If we can get a nice clean crash out of this, that would really
be best. If it's top panicking, it could well be due to a bug in the
process monitoring code, in kern_proc. We've run into bugs there a few
times in the past, generally associated with threading or races in process
creation/teardown, in which partially initialized (or torn down) processes
are accessed by another thread and are in an unexpected state.

Thanks,

Robert N M Watson