From owner-freebsd-hackers@FreeBSD.ORG  Tue Sep 27 21:32:02 2005
Date: Tue, 27 Sep 2005 22:32:01 +0100 (BST)
From: Robert Watson <rwatson@FreeBSD.org>
X-X-Sender: robert@fledge.watson.org
To: Rob Watt <rob.watt@gmail.com>
In-Reply-To: <cf6c78405092714227722d534@mail.gmail.com>
Message-ID: <20050927222624.R34322@fledge.watson.org>
References: <da4a53d805092310237d732554@mail.gmail.com> 
	<20050925115912.H11229@fledge.watson.org>
	<20050927140535.G50334@daemon.mistermishap.net>
	<20050927203128.S61419@fledge.watson.org>
	<cf6c78405092714227722d534@mail.gmail.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: Rob Watt <rob@hudson-trading.com>, mikep@hudson-trading.com,
	freebsd-amd64@freebsd.org, freebsd-hackers@freebsd.org,
	Jason Carroll <jason@hudson-trading.com>
Subject: Re: freebsd-5.4-stable panics

On Tue, 27 Sep 2005, Rob Watt wrote:

> this is the piece of code that was referenced by the ip:
>
> (gdb) l *0xffffffff803b88ca
> 0xffffffff803b88ca is in nfsrv_lookup (/usr/src/sys/nfsserver/nfs_serv.c:670).
> 665             NFSD_UNLOCK();
> 666             mtx_lock(&Giant);       /* VFS */
> 667             if (dirp)
> 668                     vrele(dirp);
> 669             NDFREE(&nd, NDF_ONLY_PNBUF);
> 670             if (ndp->ni_startdir)
> 671                     vrele(ndp->ni_startdir);
> 672             if (ndp->ni_vp)
> 673                     vput(ndp->ni_vp);
> 674             mtx_unlock(&Giant);     /* VFS */
>
> we are not running nfsd (although we do use nfs and nfsiod), and none of 
> our processes should have been accessing nfs. Our processes are run from 
> an nfs mount but do not access any nfs mounted files.

That code is in the NFS server lookup path, so it should only be called as 
a result of a lookup by a remote client.  If the NFS server is not in use 
on the machine, this is most likely a quirk of gdb and instruction 
pointers, a run-time kernel/compile-time kernel mismatch, or something 
really nasty.  ndp should really never be NULL there, as it's used 
frequently prior to that point.  Let's hope for one of the first two 
options.

>> Do you have a testbed or set of test hosts set up so you can 
>> non-disruptively test change sets, btw?
>
> yes we have 3 dual dual-core machines and 1 dual single-core machine 
> that we can use to test with.

Great.  As mentioned, I'll be offline for about the next 48 hours, but back 
after that.  If we can get a nice clean crash out of this, that would 
really be best.  If it's top that is triggering the panic, it could well 
be due to a bug in the process monitoring code, in kern_proc.  We've run 
into bugs there a few times in the past, generally associated with 
threading or races in process creation/teardown, in which a partially 
initialized (or partially torn down) process is accessed by another thread 
while it is in an unexpected state.
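
To illustrate the class of bug I mean, here is a minimal userland sketch. 
It is not the kern_proc code: the structure, state names, and functions are 
all hypothetical, and a pthread mutex stands in for the kernel's process 
locking.  The point is only that a monitoring loop has to check, under the 
entry's lock, that an entry is fully set up before touching its fields.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical lifecycle states for a process table entry. */
enum proc_state { PROC_NEW, PROC_NORMAL, PROC_ZOMBIE };

/* Hypothetical entry; name is only valid while state == PROC_NORMAL. */
struct proc_entry {
	pthread_mutex_t	 lock;
	enum proc_state	 state;
	int		 pid;
	const char	*name;
};

/* Creation, step 1: the entry exists but is not yet safe to inspect. */
void
proc_entry_init(struct proc_entry *p, int pid)
{
	pthread_mutex_init(&p->lock, NULL);
	p->state = PROC_NEW;
	p->pid = pid;
	p->name = NULL;
}

/* Creation, step 2: fill in the fields, then mark the entry usable. */
void
proc_entry_publish(struct proc_entry *p, const char *name)
{
	pthread_mutex_lock(&p->lock);
	p->name = name;
	p->state = PROC_NORMAL;
	pthread_mutex_unlock(&p->lock);
}

/* Teardown: mark the entry unusable before releasing its fields. */
void
proc_entry_teardown(struct proc_entry *p)
{
	pthread_mutex_lock(&p->lock);
	p->state = PROC_ZOMBIE;
	p->name = NULL;
	pthread_mutex_unlock(&p->lock);
}

/*
 * Monitoring side: count the entries that are safe to report, skipping
 * anything that is still being created or is already being torn down.
 */
size_t
count_reportable(struct proc_entry *table, size_t n)
{
	size_t count = 0;

	for (size_t i = 0; i < n; i++) {
		pthread_mutex_lock(&table[i].lock);
		if (table[i].state == PROC_NORMAL) {
			/* Holds only because state was checked under the lock. */
			assert(table[i].name != NULL);
			count++;
		}
		pthread_mutex_unlock(&table[i].lock);
	}
	return (count);
}
```

A monitor that skips the state check, or makes it without holding the lock, 
can read name while it is still NULL or after teardown has cleared it, which 
is the sort of unexpected state described above.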

Thanks,

Robert N M Watson