Date:      Tue, 28 Nov 2006 17:39:29 +0100
From:      Palle Girgensohn <girgen@FreeBSD.org>
To:        gayn.winters@bristolsystems.com, hardware@freebsd.org
Subject:   RE: no file system after replacing bad RAID drive
Message-ID:  <E03FB902373D6F28723EDF19@rambutan.pingpong.net>
In-Reply-To: <04dc01c71308$ecc9e930$6501a8c0@workdog>
References:  <04dc01c71308$ecc9e930$6501a8c0@workdog>



--On Tuesday, November 28, 2006 08.18.59 -0800 Gayn Winters
<gayn.winters@bristolsystems.com> wrote:

>> -----Original Message-----
>> From: Palle Girgensohn [mailto:girgen@FreeBSD.org]
>> Sent: Tuesday, November 28, 2006 7:38 AM
>> To: gayn.winters@bristolsystems.com; hardware@freebsd.org
>> Subject: RE: no file system after replacing bad RAID drive
>>
>>
>>
>>
>> --On Tuesday, November 28, 2006 07.16.47 -0800 Gayn Winters
>> <gayn.winters@bristolsystems.com> wrote:
>>
>> >> -----Original Message-----
>> >>
>> >> Hi!
>> >>
>> >> We just got Dell to replace a bad drive in a RAID5 cluster. After
>> >> reboot, megarc says they're all online, but FreeBSD cannot find a
>> >> file system on the logical drive. It worked before replacing (in
>> >> degraded mode).
>> >>
>> >> It is a Dell 2850 with a Perc4/I (really an LSI Logic MegaRAID)
>> >> controller running FreeBSD 6.0
>> >>
>> >> Any ideas how to get the file system back? bsdlabel finds no
>> >> label, and
>> >> since the OS does not find the label during startup, there
>> >> are no devices.
>> >> I tried some megarc and camcontrol commands, but nothing
>> >> seems to help.
>> >>
>> >> Any ideas appreciated. Thanks,
>> >> Palle
>> >
>> > I have not been able to get my PERC4 to rebuild while online.  Have
>> > you tried stopping the boot (control M as I recall) to get into the
>> > PERC4 firmware and rebuilding the RAID from there?  My rebuild
>> > (3x150GB RAID5) takes about 14 hours - totally offline.
>>
>> Well, before replacing the disk, the file system worked OK (only
>> degraded, no redundancy). The Dell guy replaced the disk while the
>> system was shut off. Dell thinks that may have something to do with it,
>> but it sounds strange to me. When the disk was inserted, it seemed like
>> it was rebuilding, since all disks in the cluster were flashing vividly.
>>
>> I tried a rebuild with the system on line, by first setting
>> the new disk
>> off line and then
>>
>> # megarc -physoff -a0 pd'['0:3']'
>> # megarc -doRbld -a0 -RbldArray'[0:3]' -ShowProg
>>
>> it took a couple of hours, maybe four, but not 14. Perhaps it
>> is not OK, I
>> dunno.
>>
>>
>> My problem is that it cannot find a bsd label on the disk,
>> and hence no
>> devices are created:
>>
>> $ ls -l /dev/amrd1*
>> crw-r-----  1 root  operator    0,  66 Nov 28 10:09 /dev/amrd1
>> $
>>
>> fdisk looks OK:
>>
>> # fdisk amrd1
>> ******* Working on device /dev/amrd1 *******
>> parameters extracted from in-core disklabel are:
>> cylinders=35669 heads=255 sectors/track=63 (16065 blks/cyl)
>>
>> Figures below won't work with BIOS for partitions not in cyl 1
>> parameters to be used for BIOS calculations are:
>> cylinders=35669 heads=255 sectors/track=63 (16065 blks/cyl)
>>
>> fdisk: invalid fdisk partition table found
>> Media sector size is 512
>> Warning: BIOS sector numbering starts with sector 1
>> Information from DOS bootblock is:
>> The data for partition 1 is:
>> sysid 165 (0xa5),(FreeBSD/NetBSD/386BSD)
>>     start 63, size 573022422 (279796 Meg), flag 80 (active)
>>         beg: cyl 0/ head 1/ sector 1;
>>         end: cyl 852/ head 254/ sector 63
>> The data for partition 2 is:
>> <UNUSED>
>> The data for partition 3 is:
>> <UNUSED>
>> The data for partition 4 is:
>> <UNUSED>
>>
>> but I cannot look for a bsdlabel, since there is no slice?
>> There used to be
>> a /dev/amrd1s1d that was one single slice for the entire
>> disk. It should be
>> possible to reproduce it, but how and what happens if I do this?
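
As a sanity check on the fdisk figures above, the arithmetic can be replayed in a few lines. This is only a sketch: the constants are copied from the quoted output, the helper names are made up for illustration, and it assumes the "end: cyl" field is the true cylinder truncated to the 10 bits an MBR entry can hold.

```python
# Values copied from the fdisk output quoted above.
CYLINDERS = 35669
HEADS = 255
SECTORS_PER_TRACK = 63
SECTOR_SIZE = 512

SLICE_START = 63          # "start 63"
SLICE_SIZE = 573022422    # "size 573022422"
SLICE_END_CYL = 852       # "end: cyl 852"


def blocks_per_cylinder(heads, sectors_per_track):
    """Sectors in one cylinder (fdisk prints this as blks/cyl)."""
    return heads * sectors_per_track


def size_in_meg(sectors, sector_size=SECTOR_SIZE):
    """Whole mebibytes covered by a run of sectors."""
    return sectors * sector_size // (1024 * 1024)


def wrapped_cylinder(cylinder):
    """Assumption: the CHS field keeps only the low 10 bits."""
    return cylinder % 1024


blks = blocks_per_cylinder(HEADS, SECTORS_PER_TRACK)
disk_sectors = CYLINDERS * blks
last_sector = SLICE_START + SLICE_SIZE - 1
last_cylinder = last_sector // blks

print("blks/cyl:", blks)
print("sectors on disk:", disk_sectors)
print("slice ends at sector:", last_sector)
print("slice size in Meg:", size_in_meg(SLICE_SIZE))
print("end cylinder (wrapped):", wrapped_cylinder(last_cylinder))
```

The figures agree with each other: 16065 blks/cyl, a slice that starts at sector 63 and runs to the last sector of the disk, 279796 Meg, and an end cylinder that wraps to 852. That only shows the entry is self-consistent. It does not show whether that entry is really what is stored in sector 0.
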
>>
>> If I enter sysinstall, it is all empty, no slice and naturally no
>> label, so I need to add a partition slice in the fdisk submenu, and
>> then I can create a bsd label. Problem is, I want the data back... :-/
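
Before writing a new slice table, it may be worth looking at sector 0 read-only to see what is actually there. The sketch below decodes a classic MBR layout (four 16-byte entries at offset 446, signature bytes 0x55 0xAA at offset 510). The function names are hypothetical, the sample sector is built in memory from the fdisk figures quoted earlier, and whether a missing signature is what triggers the "invalid fdisk partition table found" message here is an assumption, not something confirmed in this thread.

```python
import struct

SECTOR_SIZE = 512
TABLE_OFFSET = 446       # first of four 16-byte slice entries
SIGNATURE_OFFSET = 510   # a valid MBR ends with 0x55 0xAA
# flag, begin CHS, sysid, end CHS, start LBA, size (little-endian)
ENTRY = struct.Struct("<B3sB3sII")


def decode_chs(raw):
    """Unpack a 3-byte CHS field into (cylinder, head, sector)."""
    head = raw[0]
    sector = raw[1] & 0x3F
    cylinder = ((raw[1] & 0xC0) << 2) | raw[2]
    return cylinder, head, sector


def encode_chs(cylinder, head, sector):
    """Inverse of decode_chs, used only to build the sample sector."""
    return bytes([head,
                  (sector & 0x3F) | ((cylinder >> 2) & 0xC0),
                  cylinder & 0xFF])


def parse_mbr(sector):
    """Return (signature_ok, list of used slice entries)."""
    if len(sector) != SECTOR_SIZE:
        raise ValueError("expected one 512-byte sector")
    signature_ok = sector[SIGNATURE_OFFSET:SIGNATURE_OFFSET + 2] == b"\x55\xaa"
    slices = []
    for index in range(4):
        offset = TABLE_OFFSET + index * ENTRY.size
        flag, beg, sysid, end, start, size = ENTRY.unpack_from(sector, offset)
        if sysid == 0:
            continue  # unused entry
        slices.append({"slice": index + 1, "sysid": sysid, "flag": flag,
                       "beg": decode_chs(beg), "end": decode_chs(end),
                       "start": start, "size": size})
    return signature_ok, slices


def example_sector(with_signature=True):
    """Synthetic sector 0 holding the single slice from the fdisk output."""
    sector = bytearray(SECTOR_SIZE)
    ENTRY.pack_into(sector, TABLE_OFFSET, 0x80, encode_chs(0, 1, 1),
                    0xA5, encode_chs(852, 254, 63), 63, 573022422)
    if with_signature:
        sector[SIGNATURE_OFFSET:SIGNATURE_OFFSET + 2] = b"\x55\xaa"
    return bytes(sector)


ok, slices = parse_mbr(example_sector())
print("signature ok:", ok)
for entry in slices:
    print(entry)
```

On the real machine the 512 bytes would come from the raw device instead of example_sector(), for example by opening /dev/amrd1 in binary read mode as root and reading one sector. Nothing in the sketch writes to the disk.
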
>>
>> /Palle
>>
>
> Hi Palle,
>
> You might be missing my key point.  What you are trying to do is to use
> some OS tool.  In my experience with Dell's PERC4 cards (which are not
> exactly LSI Logic cards - in fact LSI Logic will not support these cards
> and advises to only use Dell firmware with them) you have to get into
> the PERC4 firmware setup to do a rebuild.  Watch carefully what is
> displayed on the screen when you reboot your system.  You need to get
> into the PERC4 firmware BEFORE the opportunity to get into the BIOS
> setup (because the last thing the BIOS does is call INT 19h to run the
> OS.)  Watch the screen.  I think the key combination to get into the
> PERC4 firmware is Control M, but it may be Alt M.  I can't remember for
> sure and I don't feel like rebooting the server of mine that uses the
> PERC4.  It says on the screen as the system boots.  As usual you have to
> be pretty quick pressing the keys to get into this setup.
>
> Once you are into the PERC4 firmware, you will get a nice display of
> your physical drives.  Most likely, it will still say "degraded", and it
> will indicate which physical drive is causing the problem. (This better
> be the drive you replaced, or you replaced the wrong drive!)  You can
> rebuild the new drive from here.  This rebuild is before the BIOS does
> its setup thing.  Don't let the system try to boot any OS - not even
> from a recovery disk, because then you have missed the PERC4 firmware
> setup!
>
> In my experience, the time it takes to do a rebuild increases as your
> (or my) drives fill up.  This makes sense because the redundancy
> calculation is trivial if all the blocks on the other n-1 drives are
> zeroes.  My drives are getting rather full.  During the rebuild, you
> will get a percent complete display.
>
> Good luck!
>
> -gayn
>
> Bristol Systems Inc.
> 714/532-6776
> www.bristolsystems.com


OK, I guess I'll have to try this. megarc (see ports, sysutils/megarc)
is a utility from LSI, but as you say, it does run from the OS. The odd
thing is, even with the logical drive in a degraded state, it should
really work, since I have two out of three disks in a RAID 5 cluster.
It worked fine before I replaced the bad drive. Now, it doesn't
work. Odd. :(
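
To illustrate why two of three members should be enough, here is a minimal sketch of the XOR parity idea behind RAID 5. It is not a model of the PERC4 firmware (real arrays rotate parity across members and work on whole stripes); the block contents and function names are invented for the example.

```python
from functools import reduce


def xor_blocks(*blocks):
    """Byte-wise XOR of equally sized blocks."""
    if len({len(block) for block in blocks}) != 1:
        raise ValueError("blocks must all have the same length")
    return bytes(reduce(lambda a, b: a ^ b, column)
                 for column in zip(*blocks))


# One stripe on a three-disk array: two data blocks plus parity.
data0 = b"FreeBSD "
data1 = b"slice 1d"
parity = xor_blocks(data0, data1)

# Degraded mode: the member holding data1 is gone, yet its contents
# can still be recomputed from the two surviving members.
rebuilt = xor_blocks(data0, parity)
print("rebuilt block matches:", rebuilt == data1)

# All-zero data blocks give all-zero parity.
zeros = bytes(8)
print("zero stripe parity is zero:", xor_blocks(zeros, zeros) == zeros)
```

Any single missing block in a stripe can be recomputed the same way, which is why the array was readable in degraded mode before the drive was replaced.
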

/Palle


Thanks for your help





