Date:      Sun, 01 Mar 2009 18:00:06 -0500
From:      Alex Kirk <alex@schnarff.com>
To:        questions@freebsd.org
Subject:   RAID Gone Wild - One Array Split Into Two
Message-ID:  <20090301180006.19402mvtopuv9go4@mail.schnarff.com>

First off, I realize that this may be more of a lower-level hardware
question than is appropriate to ask here, but I'm at a real loss, and
have no idea who else to ask...so I apologize in advance if I'm being
a pest.

That said: I've got a FreeBSD 7.0/stable box that is used as the
development server for a live system I administer. It recently crapped
out on me (the dev box), and I realized that its power supply had
kicked the bucket. After going out and replacing the power supply, it
booted right back up, I ssh'd in, and when I ran my first userland
command - "w", FWIW - it froze up solid. I got one more SSH session in
attempting to figure out WTF was going on before it wouldn't even log
me in any more.

After a couple of hard reboots, I decided to attach a monitor to it to
see what was going on. It turns out that the RAID5 array on the system
had really lost its mind - all four devices that were part of the
array were listed as being offline, which of course meant that the
system could no longer boot (as it was booting off of the RAID). The
controller is an integrated Intel Matrix DHC7R, built onto the
motherboard.

I looked around the web a bit to try to figure out how to fix this,
and ran across a couple of forum posts (which I can unfortunately no
longer seem to find) suggesting that this particular controller was
prone to an issue where hard power-downs would sometimes make the
drives go offline, and that I needed to boot from CD to re-initialize
them into their previous state. I tried first with an Ubuntu Linux CD
I had handy - which promptly freaked out and dropped me into an
emergency shell - and then the FreeBSD 7.0 boot-only disc. The latter
was a bit more helpful, because I got this diagnostic:

ar0: WARNING - parity protection lost, RAID5 array in DEGRADED mode
ar0: 715418MB <Intel MatrixRAID RAID5 (stripe 64KB)> status: DEGRADED
ar0: disk0 READY using ad4 at ata2-master
ar0: disk1 READY using ad8 at ata4-master
ar0: disk2 READY using ad6 at ata3-master
ar0: disk3 DOWN no device found for this subdisk
ar1: 715418MB <Intel MatrixRAID RAID5 (stripe 64KB)> status: BROKEN
ar1: disk0 DOWN no device found for this subdisk
ar1: disk1 DOWN no device found for this subdisk
ar1: disk2 DOWN no device found for this subdisk
ar1: disk3 READY using ad10 at ata5-master

Now I can see that my problem is that I've somehow got *two* RAID
devices, both improperly configured, whereas I'd only had one before.
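
To make the split easier to see, here is a minimal Python sketch that
tallies the diagnostic above per array. It is only an illustration: the
summarize() helper and the two regexes are assumptions written for this
message, not part of any FreeBSD tool, and the script only reads the
pasted text, it does not touch the disks.

```python
import re
from collections import defaultdict

# Boot messages copied verbatim from the diagnostic above.
LOG = """\
ar0: WARNING - parity protection lost, RAID5 array in DEGRADED mode
ar0: 715418MB <Intel MatrixRAID RAID5 (stripe 64KB)> status: DEGRADED
ar0: disk0 READY using ad4 at ata2-master
ar0: disk1 READY using ad8 at ata4-master
ar0: disk2 READY using ad6 at ata3-master
ar0: disk3 DOWN no device found for this subdisk
ar1: 715418MB <Intel MatrixRAID RAID5 (stripe 64KB)> status: BROKEN
ar1: disk0 DOWN no device found for this subdisk
ar1: disk1 DOWN no device found for this subdisk
ar1: disk2 DOWN no device found for this subdisk
ar1: disk3 READY using ad10 at ata5-master
"""

# Hypothetical patterns for the two kinds of subdisk line seen above.
READY_RE = re.compile(r"^(ar\d+): (disk\d+) READY using (\w+) at (\S+)$")
DOWN_RE = re.compile(r"^(ar\d+): (disk\d+) DOWN\b")


def summarize(log):
    """Return {array: {"ready": {slot: device}, "down": [slot, ...]}}."""
    arrays = defaultdict(lambda: {"ready": {}, "down": []})
    for line in log.splitlines():
        m = READY_RE.match(line)
        if m:
            array, slot, device, _channel = m.groups()
            arrays[array]["ready"][slot] = device
            continue
        m = DOWN_RE.match(line)
        if m:
            arrays[m.group(1)]["down"].append(m.group(2))
        # Status and WARNING lines match neither pattern and are skipped.
    return dict(arrays)


summary = summarize(LOG)
for array, state in sorted(summary.items()):
    print(array, "ready:", state["ready"], "down:", state["down"])
```

Read this way, slots disk0 through disk2 are READY only in ar0 (ad4,
ad8, ad6) and slot disk3 is READY only in ar1 (ad10), so each of the
four slots shows up as READY in exactly one of the two arrays.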

Does anyone have a clue how I can fix this, preferably while retaining
my data? I could wipe the box if necessary, but I'd really prefer not
to, as that would be a huge pain in the butt.

Thanks,
Alex Kirk

