From owner-freebsd-current  Wed Feb 19 20:44:41 2003
Message-ID: <3E545CD8.184323A9@mindspring.com>
Date: Wed, 19 Feb 2003 20:43:04 -0800
From: Terry Lambert <tlambert2@mindspring.com>
To: Craig Boston <craig@xfoil.gank.org>
Cc: Lars Eggert <larse@ISI.EDU>, current@freebsd.org,
	Poul-Henning Kamp <phk@critter.freebsd.dk>
Subject: Re: panic starting gnome
References: <3E52BB14.2040309@isi.edu> <3E532F61.653A09B0@mindspring.com>
		 <3E5408B0.9030300@isi.edu> <1045713737.612.22.camel@localhost>
List-ID: <freebsd-current.FreeBSD.ORG>

Craig Boston wrote:
> Well, I haven't had much luck tracking down the exact cause.  For some
> reason I haven't been able to figure out, all of my crash dumps jump
> directly from vn_open_cred (line 185 of vfs_vnops.c) to calltrap().  The
> namei call doesn't show up in the stack at all, almost like the function
> is being inlined.  I'm only using -O, which shouldn't inline anything
> not explicitly declared as such.

Nope.  The problem is a NULL pointer dereference: the code is apparently
reaching into the proc structure through a NULL proc pointer.

> Anyway, using a cvsup binary search I've managed to narrow it down
> some.  The problem did not exist before midnight UTC on 2003-04-15.  It
> does exist on midnight UTC 2003-04-16.  I've been digging through the
> commit logs for that day, but it seems it was a busy day for the VFS
> code with lots of commits.  Since it always happens after an fdfree(),
> I'm leaning toward a large (number of files) commit by alfred@ having to
> do with a lock order reversal and adding a mutex associated with freeing
> filedesc structures.  Just a guess, though.

FWIW, I arrived at the same place, given Lars' debugging information,
though it was only my most likely suspect.  There are some changes
that went in for KSE, as well, but I'm pretty sure they were after
last Wednesday.
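
The date-based binary search Craig describes can be sketched roughly as
follows.  This is only an illustration of the idea, not what cvsup does:
first_bad_date() and fake_is_broken() are hypothetical names, the dates in
the example are made up, and the predicate stands in for "check out the
tree as of midnight UTC on that date, rebuild, and try to reproduce the
panic".

```python
from datetime import date, timedelta


def first_bad_date(good, bad, is_broken):
    """Return the earliest date in (good, bad] on which is_broken() is True.

    Assumes is_broken(good) is False, is_broken(bad) is True, and that the
    bug stays present on every date after it was introduced.
    """
    while (bad - good).days > 1:
        # Probe the midpoint and keep the half that still brackets the change.
        mid = good + timedelta(days=(bad - good).days // 2)
        if is_broken(mid):
            bad = mid
        else:
            good = mid
    return bad


# Hypothetical stand-in for the real "check out, rebuild, reproduce" step.
def fake_is_broken(day, introduced=date(2003, 1, 20)):
    return day >= introduced


if __name__ == "__main__":
    print(first_bad_date(date(2003, 1, 1), date(2003, 2, 1), fake_is_broken))
```

Each probe halves the window, so a month of commits takes about five
rebuilds to narrow down to a single day.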


> Reproducing the problem seems to be as simple as killing any process
> that has an open, locked file on an NFS volume.  A simple
> 
> gconfd-1 &
> sleep 5; killall -9 gconfd-1
> 
> does it every time for me.  I assume this would also happen if a process
> calls exit() without closing all of its fds first; probably why
> starting GNOME or booting diskless is enough to tickle it.

Yes, this is most likely.
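
For anyone without gconfd-1 handy, the same situation (a process killed
while it holds an open, locked file) can probably be set up with a small
stand-in along these lines.  This is a sketch under assumptions: the
report does not say what kind of lock gconfd-1 takes, so the fcntl-style
advisory lock here is a guess, and kill_lock_holder() is a hypothetical
helper name.  The demo at the bottom uses a local temporary directory;
to mimic the report, pass a path on an NFS mount instead.

```python
import os
import signal
import subprocess
import sys
import tempfile

# Child program: open a file, take an exclusive advisory lock, and then
# sleep without ever closing the descriptor or releasing the lock.
HOLDER = r"""
import fcntl, sys, time
f = open(sys.argv[1], "w")
fcntl.lockf(f, fcntl.LOCK_EX)
print("locked", flush=True)
time.sleep(600)
"""


def kill_lock_holder(path):
    """Start a process that locks `path`, SIGKILL it, return its exit status."""
    child = subprocess.Popen([sys.executable, "-c", HOLDER, path],
                             stdout=subprocess.PIPE, text=True)
    child.stdout.readline()            # wait until the child reports the lock
    child.send_signal(signal.SIGKILL)  # same effect as killall -9
    status = child.wait()
    child.stdout.close()
    return status


if __name__ == "__main__":
    # Assumption: a local temp directory, which will NOT exercise NFS.
    demo_dir = tempfile.mkdtemp()
    print(kill_lock_holder(os.path.join(demo_dir, "lockfile")))
```

The child never gets a chance to unlock or close anything, so the kernel
has to clean up the descriptor table on its behalf, which is the path
under suspicion here.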

-- Terry
