From owner-freebsd-questions@FreeBSD.ORG  Mon Mar  2 00:12:32 2009
Return-Path: <owner-freebsd-questions@FreeBSD.ORG>
Delivered-To: questions@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id BB08010656BD
	for <questions@freebsd.org>; Mon,  2 Mar 2009 00:12:32 +0000 (UTC)
	(envelope-from alex@schnarff.com)
Received: from mho-02-bos.mailhop.org (mho-02-bos.mailhop.org [63.208.196.179])
	by mx1.freebsd.org (Postfix) with ESMTP id 7A3818FC28
	for <questions@freebsd.org>; Mon,  2 Mar 2009 00:12:32 +0000 (UTC)
	(envelope-from alex@schnarff.com)
Received: from [65.102.233.117] (helo=www.schnarff.com)
	by mho-02-bos.mailhop.org with esmtpsa (TLSv1:AES256-SHA:256)
	(Exim 4.68) (envelope-from <alex@schnarff.com>) id 1Ldufr-0000AJ-HJ
	for questions@freebsd.org; Sun, 01 Mar 2009 23:02:19 +0000
Received: (qmail 69296 invoked by uid 80); 1 Mar 2009 23:00:06 -0000
Received: from alex-fios (alex-fios [173.79.6.3]) by mail.schnarff.com
	(Horde Framework) with HTTP; Sun, 01 Mar 2009 18:00:06 -0500
X-Mail-Handler: MailHop Outbound by DynDNS
X-Originating-IP: 65.102.233.117
X-Report-Abuse-To: abuse@dyndns.com (see
	http://www.dyndns.com/services/mailhop/outbound_abuse.html for
	abuse reporting information)
X-MHO-User: U2FsdGVkX18iNgmN5efFbKHtJjJDfiZUMmZIqLzjqn8=
Message-ID: <20090301180006.19402mvtopuv9go4@mail.schnarff.com>
Date: Sun, 01 Mar 2009 18:00:06 -0500
From: Alex Kirk <alex@schnarff.com>
To: questions@freebsd.org
MIME-Version: 1.0
Content-Type: text/plain;
	charset=ISO-8859-1;
	DelSp="Yes";
	format="flowed"
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable
User-Agent: Internet Messaging Program (IMP) H3 (4.3) / FreeBSD-7.0
Cc: 
Subject: RAID Gone Wild - One Array Split Into Two
X-BeenThere: freebsd-questions@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: User questions <freebsd-questions.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-questions>, 
	<mailto:freebsd-questions-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-questions>
List-Post: <mailto:freebsd-questions@freebsd.org>
List-Help: <mailto:freebsd-questions-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-questions>, 
	<mailto:freebsd-questions-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 02 Mar 2009 00:12:33 -0000

First off, I realize that this may be more of a lower-level hardware
question than is appropriate to ask here, but I'm at a real loss, and
have no idea who else to ask...so I apologize in advance if I'm being
a pest.

That said: I've got a FreeBSD 7.0-STABLE box that is used as the
development server for a live system I administer. The dev box recently
crapped out on me, and I realized that its power supply had
kicked the bucket. After I went out and replaced the power supply, it
booted right back up and I ssh'd in, but when I ran my first userland
command - "w", FWIW - it froze up solid. I got one more SSH session in
to try to figure out WTF was going on before it wouldn't even log
me in any more.

After a couple of hard reboots, I decided to attach a monitor to it to
see what was going on. It turns out that the RAID5 array on the system
had really lost its mind - all four devices that were part of the
array were listed as being offline, which of course meant that the
system could no longer boot (as it was booting off of the RAID). The
controller is an integrated Intel Matrix DHC7R, built onto the
motherboard.

I looked around the web a bit to try to figure out how to fix this,
and ran across a couple of forum posts (which I can unfortunately no
longer seem to find) suggesting that this particular controller was
prone to an issue where hard power-downs would sometimes make the
drives go offline, and that I needed to boot from CD to re-initialize
them into their previous state. I tried first with an Ubuntu Linux CD
I had handy - which promptly freaked out and dropped me into an
emergency shell - and then the FreeBSD 7.0 boot-only disc. The latter
was a bit more helpful, because I got this diagnostic:

ar0: WARNING - parity protection lost, RAID5 array in DEGRADED mode
ar0: 715418MB <Intel MatrixRAID RAID5 (stripe 64KB)> status: DEGRADED
ar0: disk0 READY using ad4 at ata2-master
ar0: disk1 READY using ad8 at ata4-master
ar0: disk2 READY using ad6 at ata3-master
ar0: disk3 DOWN no device found for this subdisk
ar1: 715418MB <Intel MatrixRAID RAID5 (stripe 64KB)> status: BROKEN
ar1: disk0 DOWN no device found for this subdisk
ar1: disk1 DOWN no device found for this subdisk
ar1: disk2 DOWN no device found for this subdisk
ar1: disk3 READY using ad10 at ata5-master

Now I can see that my problem is that I've somehow got *two* RAID
devices, both improperly configured, whereas I'd only had one before.
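
For what it's worth, the two listings look complementary: every subdisk
slot that is DOWN in one array is READY in the other. Here is a quick
Python sketch that tabulates the diagnostic above. The helper names
(parse, merged_slots) are made up for illustration, and it only reads
the pasted text; it does not touch the disks or the controller
metadata, so it shows how the slots line up and nothing more.

```python
import re
from collections import defaultdict

# Subdisk lines copied from the boot diagnostic quoted above.
DMESG = """\
ar0: disk0 READY using ad4 at ata2-master
ar0: disk1 READY using ad8 at ata4-master
ar0: disk2 READY using ad6 at ata3-master
ar0: disk3 DOWN no device found for this subdisk
ar1: disk0 DOWN no device found for this subdisk
ar1: disk1 DOWN no device found for this subdisk
ar1: disk2 DOWN no device found for this subdisk
ar1: disk3 READY using ad10 at ata5-master
"""

# Matches "arN: diskM READY using adX at port" and "arN: diskM DOWN ...".
LINE = re.compile(r"^(ar\d+): disk(\d+) (READY|DOWN)(?: using (\w+) at (\S+))?")


def parse(text):
    """Return {array: {slot: device or None}} for every subdisk line."""
    arrays = defaultdict(dict)
    for line in text.splitlines():
        m = LINE.match(line)
        if m:
            array, slot, state, dev, _port = m.groups()
            arrays[array][int(slot)] = dev if state == "READY" else None
    return dict(arrays)


def merged_slots(arrays):
    """Combine READY slots across arrays; refuse if two arrays claim one slot."""
    merged = {}
    for slots in arrays.values():
        for slot, dev in slots.items():
            if dev is None:
                continue
            if slot in merged:
                raise ValueError(f"slot {slot} claimed by more than one array")
            merged[slot] = dev
    return merged


arrays = parse(DMESG)
merged = merged_slots(arrays)
for slot in sorted(merged):
    print(f"disk{slot}: {merged[slot]}")
```

No slot is claimed twice, and together the two arrays account for all
four subdisks, which is consistent with one array having been split
rather than with a disk actually being missing.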

Does anyone have a clue how I can fix this, preferably while retaining
my data? I could wipe the box if necessary, but I'd really prefer not
to, as that would be a huge pain in the butt.

Thanks,
Alex Kirk

