From owner-freebsd-stable@FreeBSD.ORG  Sat Dec  1 18:41:19 2007
Return-Path: <owner-freebsd-stable@FreeBSD.ORG>
Delivered-To: freebsd-stable@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 3212D16A417;
	Sat,  1 Dec 2007 18:41:19 +0000 (UTC)
	(envelope-from gpalmer@freebsd.org)
Received: from noop.in-addr.com (unknown [IPv6:2001:5c0:8fff:fffe::214d])
	by mx1.freebsd.org (Postfix) with ESMTP id E06DC13C465;
	Sat,  1 Dec 2007 18:41:18 +0000 (UTC)
	(envelope-from gpalmer@freebsd.org)
Received: from gjp by noop.in-addr.com with local (Exim 4.54 (FreeBSD))
	id 1IyXCv-0007rH-VS; Sat, 01 Dec 2007 13:36:53 -0500
Date: Sat, 1 Dec 2007 13:36:53 -0500
From: Gary Palmer <gpalmer@freebsd.org>
To: Jeremy Chadwick <koitsu@FreeBSD.org>
Message-ID: <20071201183653.GA19555@in-addr.com>
References: <20071107191611.GA1400@eos.sc1.parodius.com>
	<20071107232328.GA1678@eos.sc1.parodius.com>
	<20071126022136.GA1564@eos.sc1.parodius.com>
	<20071201112856.GA79861@eos.sc1.parodius.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20071201112856.GA79861@eos.sc1.parodius.com>
Cc: freebsd-stable@freebsd.org
Subject: Re: RELENG_6 kernel panic + savecore(8) problem
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Production branch of FreeBSD source code <freebsd-stable.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-stable>, 
	<mailto:freebsd-stable-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-stable>
List-Post: <mailto:freebsd-stable@freebsd.org>
List-Help: <mailto:freebsd-stable-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-stable>,
	<mailto:freebsd-stable-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 01 Dec 2007 18:41:19 -0000

On Sat, Dec 01, 2007 at 03:28:56AM -0800, Jeremy Chadwick wrote:
> On Sun, Nov 25, 2007 at 06:21:36PM -0800, Jeremy Chadwick wrote:
> > Tracing pid 3 tid 100001 td 0xc7c6ed80
> > kdb_enter(c06e475e,c073ade0,c06efb55,e6876bc8,100,...) at kdb_enter+0x30
> > panic(c06efb55,ce04b280,100,c07156c0,0,...) at panic+0xce
> > handle_written_inodeblock(c858d200,dbda0a70,c07388e4,c06a3e4a,e6876c30,...) at handle_written_inodeblock+0x5df
> > softdep_disk_write_complete(dbda0a70,c0652591,c80e65ac,e6876c94,c04e16c4,...) at softdep_disk_write_complete+0xf1
> > bufdone(dbda0a70,0,e6876ca8,c04e3e06,c80e65ac,...) at bufdone+0x7e
> > g_vfs_done(c80e65ac,0,0,c7d28200,c80a418c) at g_vfs_done+0xc6
> > biodone(c80e65ac,c0738808,24c,c06dff1c,64,...) at biodone+0xb2
> > g_io_schedule_up(c7c6ed80,4c,c7c6d218,c04e1bbc,e6876d24,...) at g_io_schedule_up+0x89
> > g_up_procbody(0,e6876d38,0,0,0,...) at g_up_procbody+0x7a
> > fork_exit(c04e1bbc,0,e6876d38) at fork_exit+0x7a
> > fork_trampoline() at fork_trampoline+0x8
> 
> To anyone who's familiar with the functions in the above backtrace:
> 
> Could the above panic be caused by exhaustion of memory allocated to the
> dirhash code (UFS_DIRHASH)?  I can provide details if needed, but
> thought I'd ask something somewhat vague for starters.  :-)

The panic message that you cut from the above text is

panic: handle_written_inodeblock: live inodedep

In version 1.181.2.17 of ffs_softdep.c (the current copy I have) that panic
happens at line 4664 when it attempts to free an inodedep structure
and fails because the structure is still needed for some reason.  From
the comments in the softdep.h file:

 * The "inodedep" structure tracks the set of dependencies associated
 * with an inode. 

So it's a softupdates-related panic, triggered on completion of an I/O
to an inode.  I can't see how dirhash could have caused this.

To see why savecore(8) isn't saving your cores you might want to check
syslog.  savecore(8) should log to syslog at LOG_ERR priority in the
DAEMON facility.  Changing 

savecore_flags

in /etc/rc.conf to be "-vv" might show what the problem is if the
box panics and fails to save core again (it might also make boot
a lot messier on the console).
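
As a rough sketch of the two suggestions above: the rc.conf line is the
setting itself, while the savecore_log helper is a hypothetical name of
my own, and /var/log/messages is only an assumed default (check
/etc/syslog.conf to see where daemon.err messages actually land on your
box).

```shell
# In /etc/rc.conf (plain sh syntax): run savecore(8) with extra
# verbosity at boot; -v given twice asks for more debugging output.
savecore_flags="-vv"

# Hypothetical helper: list savecore messages found in a syslog file.
# The /var/log/messages default is an assumption, not a guarantee.
savecore_log() {
    grep 'savecore' "${1:-/var/log/messages}"
}
```

After the next panic and reboot, running savecore_log with no argument
(or with the path to another log file) should show whatever savecore
reported, if it logged anything at all.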

Regards,

Gary

P.S. I'm no softupdates expert, so I don't know what circumstances
caused the panic in the first place.