Date: Sat, 1 Dec 2007 13:36:53 -0500
From: Gary Palmer
To: Jeremy Chadwick
Cc: freebsd-stable@freebsd.org
Subject: Re: RELENG_6 kernel panic + savecore(8) problem
Message-ID: <20071201183653.GA19555@in-addr.com>
In-Reply-To: <20071201112856.GA79861@eos.sc1.parodius.com>
References: <20071107191611.GA1400@eos.sc1.parodius.com> <20071107232328.GA1678@eos.sc1.parodius.com> <20071126022136.GA1564@eos.sc1.parodius.com> <20071201112856.GA79861@eos.sc1.parodius.com>

On Sat, Dec 01, 2007 at 03:28:56AM -0800, Jeremy Chadwick wrote:
> On Sun, Nov 25, 2007 at 06:21:36PM -0800, Jeremy Chadwick wrote:
> > Tracing pid 3 tid 100001 td 0xc7c6ed80
> > kdb_enter(c06e475e,c073ade0,c06efb55,e6876bc8,100,...) at kdb_enter+0x30
> > panic(c06efb55,ce04b280,100,c07156c0,0,...) at panic+0xce
> > handle_written_inodeblock(c858d200,dbda0a70,c07388e4,c06a3e4a,e6876c30,...) at handle_written_inodeblock+0x5df
> > softdep_disk_write_complete(dbda0a70,c0652591,c80e65ac,e6876c94,c04e16c4,...) at softdep_disk_write_complete+0xf1
> > bufdone(dbda0a70,0,e6876ca8,c04e3e06,c80e65ac,...) at bufdone+0x7e
> > g_vfs_done(c80e65ac,0,0,c7d28200,c80a418c) at g_vfs_done+0xc6
> > biodone(c80e65ac,c0738808,24c,c06dff1c,64,...) at biodone+0xb2
> > g_io_schedule_up(c7c6ed80,4c,c7c6d218,c04e1bbc,e6876d24,...) at g_io_schedule_up+0x89
> > g_up_procbody(0,e6876d38,0,0,0,...) at g_up_procbody+0x7a
> > fork_exit(c04e1bbc,0,e6876d38) at fork_exit+0x7a
> > fork_trampoline() at fork_trampoline+0x8
>
> To anyone who's familiar with the functions in the above backtrace:
>
> Could the above panic be caused by exhaustion of memory allocated to the
> dirhash code (UFS_DIRHASH)?  I can provide details if needed, but
> thought I'd ask something somewhat vague for starters.  :-)

The panic message that you cut from the above text is

panic: handle_written_inodeblock: live inodedep

In version 1.181.2.17 of ffs_softdep.c (the current copy I have) that
panic happens at line 4664, when it attempts to free an inodedep
structure and fails because the structure is still needed for some
reason.

From the comments in the softdep.h file:

 * The "inodedep" structure tracks the set of dependencies associated
 * with an inode.

So it's a softupdates-related panic relating to an I/O to an inode that
has completed.  I can't see how dirhash could have caused this.

To see why savecore(8) isn't saving your cores you might want to check
syslog.  savecore(8) should log to syslog at LOG_ERR priority in the
DAEMON facility.  Changing savecore_flags in /etc/rc.conf to "-vv" might
show what the problem is if the box panics and fails to save a core
again (it might also make boot a lot messier on the console).

Regards,

Gary

P.S. I'm no softupdates expert, so I don't know what circumstances
caused the panic in the first place.
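
For illustration, the rc.conf change suggested above might look like the
fragment below.  This is only a sketch: the variable name and the "-vv"
value are taken from the message itself, and nothing else about the
machine's dump configuration is assumed.

```shell
# /etc/rc.conf fragment (sketch, not a complete configuration)
# Ask savecore(8) to be more verbose when it runs at boot, so that a
# failure to save the core is reported in more detail.
savecore_flags="-vv"
```

After the next panic and reboot, any savecore(8) errors should then show
up in syslog under the daemon facility at error priority, as described
above.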