Date: Mon, 4 Feb 2008 22:00:05 GMT
Message-Id: <200802042200.m14M05wH076326@freefall.freebsd.org>
To: freebsd-bugs@FreeBSD.org
From: Jeremy Chadwick
Subject: Re: kern/108924: [ar] Panics when Intel MatrixRAID RAID1 is degraded

The following reply was made to PR kern/108924; it has been noted by GNATS.

From: Jeremy Chadwick
To: bug-followup@FreeBSD.org, taras@elantech.ru
Cc: sos@freebsd.org, delphij@FreeBSD.org
Subject: Re: kern/108924: [ar] Panics when Intel MatrixRAID RAID1 is degraded
Date: Mon, 4 Feb 2008 13:55:01 -0800

Wow, this is a fairly old problem with no solution in over a year?

Here are some additional details from my testing. This is easily
reproducible. I'll work on getting a kernel with DDB/KDB so one can do
backtraces via serial console; I can provide access to this if need be.
Details:

* FreeBSD 7.0-RC1 (and previous 7.0 releases)
* Supermicro SuperServer 5015M-T (Supermicro PDSMI+ motherboard)
* Built-in Intel ICH7 controller
* Hot-swap backplane (which works when disks are JBOD and not using
  MatrixRAID)

Installed i386 FreeBSD on ar0 without a problem:

ad4: 190782MB at ata2-master SATA150
ad6: 190782MB at ata3-master SATA150
ar0: 190779MB status: READY
ar0: disk0 READY (master) using ad4 at ata2-master
ar0: disk1 READY (mirror) using ad6 at ata3-master

But when I attempted a hard failure of a disk, followed by reattachment
of that disk, FreeBSD eventually made the entire mirror unusable. Here
are the steps I took:

1) Removed ad4 disk
   - Kernel said:
     ar0: WARNING - mirror protection lost. RAID1 array in DEGRADED mode
     subdisk4: detached
     ad4: detached
2) atacontrol list
   - no sign of ad4 on ATA channel 2
3) atacontrol status ar0
   ar0: ATA RAID1 status: DEGRADED
    subdisks:
      0 ---- MISSING
      1 ad6 ONLINE
4) I then decided to copy some data to the array while degraded, just to
   make sure data got re-mirrored after bringing ad4 back online.
5) cp /boot/kernel/kernel /usr/test
6) Plugged ad4 disk back in
   - Disk LED came on for a second, then went off
   - No messages from kernel
7) atacontrol list
   - no sign of ad4 on ATA channel 2
8) atacontrol attach ata2
   atacontrol: ioctl(IOCATAATTACH): File exists
   - LED on ad4 disk suddenly turned on and stayed lit constantly
   - gstat showed no activity on ad4
9) atacontrol status ar0
   - same as previous run
10) atacontrol reinit ata2
    no device present
    - LED on ad4 disk shut off
11) atacontrol status ar0
    - same as previous run
12) atacontrol reinit ata2
    - same as previous run
13) atacontrol detach ata2
14) atacontrol attach ata2
    no device present
    - Kernel said: ata2: [ITHREAD]
15) atacontrol detach ata2
16) atacontrol attach ata2
    no device present
    - Kernel said: ata2: [ITHREAD]
17) atacontrol reinit ata2
    no device present
18) atacontrol list
    - no sign of ad4 on ATA channel 2
19) atacontrol detach ata2
20) atacontrol reinit ata2
    - Kernel immediately panicked, and machine rebooted.
    - Intel RAID BIOS showed disk 0 (ad4) as "Offline Member", but disk
      statistics (size) were available, meaning the disk was visible and
      accessible
    - Array labelled as "Degraded" in BIOS
21) Booted into FreeBSD
    - Kernel started, and said:
      ad4: 190782MB at ata2-master SATA150
      ad6: 190782MB at ata3-master SATA150
    - Kernel immediately panicked; ar0 is never shown.
    - Process which panicked is 0 (swapper)
    - Single-user mode crashes at same point
    - Power-cycling doesn't help

This thread also complains about similar issues:

http://lists.freebsd.org/pipermail/freebsd-questions/2006-February/114274.html

This really needs some focus. I'd be more than happy to purchase and
donate new hardware for testing if required.

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |
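
For anyone retesting this, the DEGRADED state seen in step 3 could be checked from a script instead of by eye. The following is only a minimal sketch in Python. It assumes `atacontrol status ar0` prints the layout quoted in step 3 (one header line with the array status, then one line per subdisk). The function name `parse_ar_status` and the `sample` text are hypothetical and are not part of atacontrol.

```python
import re

def parse_ar_status(text):
    """Parse text shaped like the `atacontrol status ar0` output in step 3.

    Returns (array_status, {subdisk_index: (device_or_None, state)}).
    The layout is assumed from this report, not taken from atacontrol's source.
    """
    header = re.search(r"^(ar\d+): ATA (\S+) status: (\S+)", text, re.MULTILINE)
    if header is None:
        raise ValueError("no array status line found")

    subdisks = {}
    # Subdisk lines look like "0 ---- MISSING" or "1 ad6 ONLINE".
    pattern = r"^[ \t]*(\d+)[ \t]+(\S+)[ \t]+(\S+)[ \t]*$"
    for m in re.finditer(pattern, text, re.MULTILINE):
        index, device, state = int(m.group(1)), m.group(2), m.group(3)
        # "----" is what the report shows in place of a missing device name.
        subdisks[index] = (None if device == "----" else device, state)
    return header.group(3), subdisks

# Sample text reconstructed from step 3 of this report (spacing assumed).
sample = """ar0: ATA RAID1 status: DEGRADED
 subdisks:
   0 ---- MISSING
   1 ad6 ONLINE
"""

status, disks = parse_ar_status(sample)
print(status, disks)
```

On a live system the same function could be fed the captured output of the command, and a status other than READY would flag the array for attention before any attach or reinit attempts.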