From: Alex Kirk <alex@schnarff.com>
To: questions@freebsd.org
Date: Sun, 01 Mar 2009 18:00:06 -0500
Subject: RAID Gone Wild - One Array Split Into Two

First off, I realize that this may be more of a lower-level hardware
question than is appropriate to ask here, but I'm at a real loss, and
have no idea who else to ask...so I apologize in advance if I'm being
a pest.

That said: I've got a FreeBSD 7.0/stable box that is used as the
development server for a live system I administer. It recently crapped
out on me (the dev box), and I realized that its power supply had
kicked the bucket. After going out and replacing the power supply, it
booted right back up, I ssh'd in, and when I ran my first userland
command - "w", FWIW - it froze up solid. I got one more SSH session in
attempting to figure out WTF was going on before it wouldn't even log
me in any more.

After a couple of hard reboots, I decided to attach a monitor to it to
see what was going on. It turns out that the RAID5 array on the system
had really lost its mind - all four devices that were part of the
array were listed as being offline, which of course meant that the
system could no longer boot (as it was booting off of the RAID). The
controller is an integrated Intel Matrix DHC7R, built onto the
motherboard.

I looked around the web a bit to try to figure out how to fix this,
and ran across a couple of forum posts (which I can unfortunately no
longer seem to find) suggesting that this particular controller was
prone to an issue where hard power-downs would sometimes make the
drives go offline, and that I needed to boot from CD to re-initialize
them into their previous state. I tried first with an Ubuntu Linux CD
I had handy - which promptly freaked out and dropped me into an
emergency shell - and then the FreeBSD 7.0 boot-only disc. The latter
was a bit more helpful, because I got this diagnostic:

ar0: WARNING - parity protection lost, RAID5 array in DEGRADED mode
ar0: 715418MB <Intel MatrixRAID RAID5 (stripe 64KB)> status: DEGRADED
ar0: disk0 READY using ad4 at ata2-master
ar0: disk1 READY using ad8 at ata4-master
ar0: disk2 READY using ad6 at ata3-master
ar0: disk3 DOWN no device found for this subdisk
ar1: 715418MB <Intel MatrixRAID RAID5 (stripe 64KB)> status: BROKEN
ar1: disk0 DOWN no device found for this subdisk
ar1: disk1 DOWN no device found for this subdisk
ar1: disk2 DOWN no device found for this subdisk
ar1: disk3 READY using ad10 at ata5-master

Now I can see that my problem is that I've somehow got *two* RAID
devices, both improperly configured, whereas I'd only had one before.

Does anyone have a clue how I can fix this, preferably while retaining
my data? I could wipe the box if necessary, but I'd really prefer not
to, as that would be a huge pain in the butt.

Thanks,
Alex Kirk