From owner-freebsd-scsi@FreeBSD.ORG Tue Feb 10 06:49:11 2009
Date: Tue, 10 Feb 2009 01:49:09 -0500 (EST)
From: Charles Sprickman
To: Scott Long
Cc: freebsd-scsi@freebsd.org
In-Reply-To: <49911C68.6030203@samsco.org>
Subject: Re: 7.1 Panic on degraded disk w/mpt

On Mon, 9 Feb 2009, Scott Long wrote:

> Charles Sprickman wrote:
>> (posted on -stable already, no takers - added info: full dmesg, crash info
>> from panic when array finished rebuilding, some comments on dmesg)
>>
>> Howdy,
>>
>> I dug around and can't find a PR on this, and the only other report I saw
>> was in this mailing list post that has no replies:
>>
>> http://www.nabble.com/7.1-BETA2-panic-on-mpt-degrade-td20183173.html
>>
>> The hardware is a Dell PowerEdge 860 with the Dell/LSI SAS5 controller:
>>
>> mpt0: port 0xec00-0xecff mem
>> 0xfe9fc000-0xfe9fffff,0xfe9e0000-0xfe9effff irq 16 at device 8.0 on pci2
>> mpt0: MPI Version=1.5.13.0
>>
>> The panic is repeatable by forcing the array into a degraded state. When
>> the array finishes rebuilding, the box also panics.
>>
>> Here's my best shot at getting info out of kgdb (panic on array going to
>> degraded state):
>
> I wonder if the MPT card is temporarily detaching and then reattaching
> the logical drive when the rebuild completes.

IIRC, just before the panic there is a bunch of CAM debug splattered
across the monitor. I can run down to the garage and snap a few pics of
the monitor after detaching a drive.

> The info you posted is inconclusive here. CAM (the FreeBSD SCSI layer)
> has had some problems handling device detaches, but we've been very
> fortunate to have someone examining and fixing this recently.

Yeah, I was looking at the commit log for cam_xpt.c and someone has been
very busy...

> Would it be possible for you to upgrade to the most recent 8-CURRENT
> tree, and re-run your test? If not, I'll see about generating a
> patchset against 7.1.

Can I get away with just updating the kernel, or is there a simple way to
build a live-cd? I don't want to screw with userland, but I'd boot a
kernel if that's not too rough - but I suppose my 7.1 kgdb would not know
what to do with the dump, right?

On the bright side, the controller is not getting so scrambled by the
panic that it can no longer write the crashdump. That's a positive!

I'm going to go panic it again, I'm getting curious about the messages
before the panic...

Thanks,

Charles

> Scott