From: John Baldwin
To: Riccardo Torrini
Cc: scottl@freebsd.org, siedar@nplay.pl, freebsd-stable@freebsd.org
Date: Tue, 12 May 2009 14:26:43 -0400
Message-Id: <200905121426.43467.jhb@freebsd.org>
In-Reply-To: <20090512161025.GO21112@tiger.fi.esaote.it>
References: <20090507155012.GW21112@tiger.fi.esaote.it> <200905121144.21406.jhb@freebsd.org> <20090512161025.GO21112@tiger.fi.esaote.it>
Subject: Re: kern/130330: [mpt] [panic] Panic and reboot machine MPT ...
List-Id: Production branch of FreeBSD source code

On Tuesday 12 May 2009 12:10:25 pm Riccardo Torrini wrote:
> On Tue, May 12, 2009 at 11:44:20AM -0400, John Baldwin wrote:
>
> > If you can get a stack trace, that would be most helpful.
> > My guess is that the recovery thread is holding the mpt lock
> > and calling some CAM routine which attempts to relock it via
> > cam_periph_lock().  A stack trace would be most telling in
> > that case.
>
> Rebooted, inserted 2nd disk (copied by hand, sorry for delay)
>
> mpt0: External Bus Reset Detected
> mpt0:vol0(mpt:0:0:0): Phisycal Disk Status Changed
> mpt0:vol0(mpt:0:0:0): Phisycal Disk Status Changed  (yes, two times)
> Kernel page fault with the following non-sleepable lock held:
> exclusive sleep mutex mpt r = 0 (0xc4001004) locked @ \
>     /usr/src/sys/cam/cam_xpt.c:7153
> KBD: enter: witness_warn
> [ thread pid 19 tid 100018 ]
> Stopped at kdb_enter_why+0x3a: movl $0,kbd_why
>
> db> bt
> Tracing pid 19 tid 100018 td 0xc3fb8880
> [...]
> --- trap 0xc, eip = 0xc0438f4e, esp = 0xc43b2b98, ebp = 0xc43b2bb0 ---
> xpt_done(c404f400,c0719000,5,5,0,...) at xpt_done+0x1b
> xpt_scan_bus(c3f39a80,c4045400,c06cfa7a,c072f824,c4011914,...) \
>     at xpt_scan_bus+0x39f
> camisr_runqueue(c4001004,0,c06cfa7a,1bf1,0,...) \
>     at camisr_runqueue+0x38a
> camisr(0,0,c06e99fb,4b6,c3f39a68,...) at camisr+0x10d
> ithread_loop()
> fork_exit()
> fork_trampoline()
>
>
> Still at db> prompt =)

Hmm, this is a different panic. :(  You could perhaps try bzero()'ing the ccb
before calling xpt_setup_ccb() in mpt_raid_thread() but the old code didn't do
that either (it just used M_WAITOK w/o M_ZERO).

-- 
John Baldwin
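
The zero-before-setup idea suggested above can be sketched in userland C.
This is only an illustration, not the mpt driver code: struct fake_ccb_hdr,
fake_setup_ccb() and fake_ccb_alloc_zeroed() are hypothetical stand-ins
invented for this sketch, and it simply assumes a setup routine that fills
in some header fields and leaves the rest untouched. In the kernel the
equivalent would presumably be bzero() after malloc(), or passing M_ZERO
to malloc().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a CCB header; NOT the real struct ccb_hdr. */
struct fake_ccb_hdr {
	void		*path;			/* set by the setup routine */
	unsigned int	 priority;		/* set by the setup routine */
	void		(*cbfcnp)(void *);	/* completion callback, not set by setup */
	unsigned int	 flags;			/* not set by setup in this sketch */
};

/*
 * Hypothetical stand-in for a setup routine: it initializes only some
 * fields, so anything else keeps whatever the allocator left behind.
 */
void
fake_setup_ccb(struct fake_ccb_hdr *hdr, void *path, unsigned int priority)
{
	assert(hdr != NULL);
	hdr->path = path;
	hdr->priority = priority;
}

/*
 * Allocate a header, clear it, then run the setup routine.  The memset()
 * plays the role of bzero() here: fields the setup routine does not touch
 * start out as zero / NULL instead of stale heap contents.
 */
struct fake_ccb_hdr *
fake_ccb_alloc_zeroed(void *path, unsigned int priority)
{
	struct fake_ccb_hdr *hdr;

	hdr = malloc(sizeof(*hdr));
	if (hdr == NULL)
		return (NULL);
	memset(hdr, 0, sizeof(*hdr));
	fake_setup_ccb(hdr, path, priority);
	return (hdr);
}
```

Whether stale ccb contents are actually behind the xpt_done() fault in the
trace is not established in this thread; the sketch only shows what the
suggested change would guarantee about the untouched fields.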