Date:      Tue, 28 Nov 2006 08:18:59 -0800
From:      "Gayn Winters" <gayn.winters@bristolsystems.com>
To:        "'Palle Girgensohn'" <girgen@FreeBSD.org>, <hardware@freebsd.org>
Subject:   RE: no file system after replacing bad RAID drive
Message-ID:  <04dc01c71308$ecc9e930$6501a8c0@workdog>
In-Reply-To: <66DACABD698EBC4779CC4429@rambutan.pingpong.net>

> -----Original Message-----
> From: Palle Girgensohn [mailto:girgen@FreeBSD.org] 
> Sent: Tuesday, November 28, 2006 7:38 AM
> To: gayn.winters@bristolsystems.com; hardware@freebsd.org
> Subject: RE: no file system after replacing bad RAID drive
> 
> 
> 
> 
> --On tisdag, november 28, 2006 07.16.47 -0800 Gayn Winters 
> <gayn.winters@bristolsystems.com> wrote:
> 
> >> -----Original Message-----
> >>
> >> Hi!
> >>
> >> We just got Dell to replace a bad drive in a RAID5 cluster. After
> >> reboot, megarc says they're all online, but FreeBSD cannot find a
> >> file system on the logical drive. It worked before replacing (in
> >> degraded mode).
> >>
> >> It is a Dell 2850 with a Perc4/I (really an LSILogic MegaRAID)
> >> controller running FreeBSD 6.0
> >>
> >> Any ideas how to get the file system back? bsdlabel finds no label,
> >> and since the OS does not find the label during startup, there are
> >> no devices. I tried some megarc and camcontrol commands, but
> >> nothing seems to help.
> >>
> >> Any ideas appreciated. Thanks,
> >> Palle
> >
> > I have not been able to get my PERC4 to rebuild while online.  Have
> > you tried stopping the boot (control M as I recall) to get into the
> > PERC4 firmware and rebuilding the RAID from there?  My rebuild
> > (3x150GB RAID5) takes about 14 hours - totally offline.
> 
> Well, before replacing the disk, the file system worked OK (only
> degraded, no redundancy). The Dell guy replaced the disk while the
> system was shut off. Dell thinks that may have something to do with
> it, but it sounds strange to me. When the disk was inserted, it seemed
> like it was rebuilding, since all disks in the cluster were flashing
> vividly.
> 
> I tried a rebuild with the system online, by first setting the new
> disk offline and then
> 
> # megarc -physoff -a0 pd'['0:3']'
> # megarc -doRbld -a0 -RbldArray'[0:3]' -ShowProg
> 
> it took a couple of hours, maybe four, but not 14. Perhaps it is not
> OK, I dunno.
> 
> 
> My problem is that it cannot find a bsd label on the disk, and hence
> no devices are created:
> 
> $ ls -l /dev/amrd1*
> crw-r-----  1 root  operator    0,  66 Nov 28 10:09 /dev/amrd1
> $
> 
> fdisk looks OK:
> 
> # fdisk amrd1
> ******* Working on device /dev/amrd1 *******
> parameters extracted from in-core disklabel are:
> cylinders=35669 heads=255 sectors/track=63 (16065 blks/cyl)
> 
> Figures below won't work with BIOS for partitions not in cyl 1
> parameters to be used for BIOS calculations are:
> cylinders=35669 heads=255 sectors/track=63 (16065 blks/cyl)
> 
> fdisk: invalid fdisk partition table found
> Media sector size is 512
> Warning: BIOS sector numbering starts with sector 1
> Information from DOS bootblock is:
> The data for partition 1 is:
> sysid 165 (0xa5),(FreeBSD/NetBSD/386BSD)
>     start 63, size 573022422 (279796 Meg), flag 80 (active)
>         beg: cyl 0/ head 1/ sector 1;
>         end: cyl 852/ head 254/ sector 63
> The data for partition 2 is:
> <UNUSED>
> The data for partition 3 is:
> <UNUSED>
> The data for partition 4 is:
> <UNUSED>
> 
> but I cannot look for a bsdlabel, since there is no slice? There used
> to be a /dev/amrd1s1d that was one single slice for the entire disk.
> It should be possible to reproduce it, but how, and what happens if I
> do this?
> 
> If I enter sysinstall, it is all empty, no slice and naturally no
> label, so I need to add a partition slice in the fdisk submenu, and
> then I can create a bsd label. Problem is, I want the data back... :-/
> 
> /Palle
> 

Hi Palle,

You might be missing my key point.  What you are trying to do is to use
some OS tool.  In my experience with Dell's PERC4 cards (which are not
exactly LSI Logic cards - in fact LSI Logic will not support these cards
and advises to only use Dell firmware with them) you have to get into
the PERC4 firmware setup to do a rebuild.  Watch carefully what is
displayed on the screen when you reboot your system.  You need to get
into the PERC4 firmware BEFORE the opportunity to get into the BIOS
setup (because the last thing the BIOS does is call INT 19h to run the
OS.)  Watch the screen.  I think the key combination to get into the
PERC4 firmware is Control M, but it may be Alt M.  I can't remember for
sure and I don't feel like rebooting the server of mine that uses the
PERC4.  It says on the screen as the system boots.  As usual you have to
be pretty quick pressing the keys to get into this setup.

Once you are into the PERC4 firmware, you will get a nice display of
your physical drives.  Most likely, it will still say "degraded", and it
will indicate which physical drive is causing the problem. (This better
be the drive you replaced, or you replaced the wrong drive!)  You can
rebuild the new drive from here.  This rebuild is before the BIOS does
its setup thing.  Don't let the system try to boot any OS - not even
from a recovery disk, because then you have missed the PERC4 firmware
setup!

In my experience, the time it takes to do a rebuild increases as your
(or my) drives fill up.  This makes sense because the redundancy
calculation is trivial if all the blocks on the other n-1 drives are
zeroes.  My drives are getting rather full.  During the rebuild, you
will get a percent complete display.  
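
For anyone wondering what the "redundancy calculation" amounts to, here
is a minimal sketch in Python, assuming the textbook RAID5 scheme where
the parity block is the XOR of the data blocks in a stripe.  The block
contents and the helper name xor_blocks are made up for illustration;
this only shows the arithmetic and is not what the PERC4 firmware
actually runs.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte position by byte position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Hypothetical stripe on a 3-drive RAID5: two data blocks plus one parity block.
d0 = b"FreeBSD!"
d1 = b"amrd1s1d"
parity = xor_blocks([d0, d1])

# The drive holding d1 is replaced: recover its block from the two survivors.
rebuilt = xor_blocks([d0, parity])
assert rebuilt == d1

# If the surviving blocks are all zeroes, the rebuilt block is all zeroes too.
zeros = bytes(8)
assert xor_blocks([zeros, zeros]) == zeros
```

The same XOR that produced the parity also recovers a missing block,
which is why a single replaced drive can be rebuilt from the others.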

Good luck!

-gayn

Bristol Systems Inc.
714/532-6776
www.bristolsystems.com 




