Date: Sun, 25 Nov 2007 14:58:05 -0800
From: David Newman <dnewman@networktest.com>
To: Ted Mittelstaedt <tedm@toybox.placo.com>
Cc: freebsd-questions@freebsd.org
Subject: Re: dealing with a failing drive
Message-ID: <4749FDFD.8010002@networktest.com>
In-Reply-To: <BMEDLGAENEKCJFGODFOCOEBKCFAA.tedm@toybox.placo.com>
References: <BMEDLGAENEKCJFGODFOCOEBKCFAA.tedm@toybox.placo.com>
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 11/24/07 12:39 PM, Ted Mittelstaedt wrote:

> The output of idacontrol show will show if one of the
> hard disks in the SmartArray has failed. Your choice with
> a hardware array is to either run it with redundancy or not.
> (ie: raid5 or mirroring or striping) You have to choose
> which is more important for you.
>
> IMHO it is very foolish to stripe an array that you have
> critical data on and assume that you can predict a failure
> of a disk using smart or other monitoring, and replace it
> in advance of a failure. If your concern is redundancy, then
> add more disks to the array and create a raid 5 or a mirror.
> Then ignore all the predictive junk and let the array card
> concern itself with detecting if a drive has failed. Run
> idacontrol periodically out of a script that checks for a
> failure of a disk and e-mails you if there is one.

Thanks, this is good advice, but it doesn't answer the specific
questions I had:

1. How to diagnose the health of a *physical* disk that's part of a
RAID array (RAID1, in this case) in an old Compaq Proliant server?

2. Is it normal for idacontrol to generate soft write errors?

Backstory here is that Proliant server #1 generated beaucoup hard and
soft read and write errors and eventually locked up. I thought it was
one of the disks but replacing one at a time didn't help.

So I took both disks and put them in identical Proliant server #2.
Ergo, I would conclude server #1's RAID controller flaked out.

idacontrol is useful for telling the health of the logical disk. What
it doesn't tell me (or maybe I just don't see it) is whether the
physical disks are ok, and those "soft write errors" concern me.

I had a failure situation, and need to figure out whether just the
controller was bad or whether I need to replace at least one disk too.

Thanks again!
dn
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.3 (Darwin)

iD8DBQFHSf39yPxGVjntI4IRAp1yAJ4vMV9FkeaBsHRr/Z5WpCL27wJ3tACfS+pT
3UVlscnQUZhe8ulHksKDWsY=
=Om7/
-----END PGP SIGNATURE-----
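
For reference, the periodic check Ted describes (run idacontrol from a script and e-mail on failure) might look roughly like the sketch below. It is only an illustration under assumptions: the message never shows what `idacontrol show` prints, so the `FAILURE_MARKERS` strings are placeholders to adjust, and the function names, the SMTP host, and the addresses passed in are all hypothetical.

```python
import smtplib
import subprocess
from email.message import EmailMessage

# ASSUMPTION: placeholder markers. The thread does not show the output of
# `idacontrol show`, so replace these with strings your controller reports.
# Note that "error" may also match an error-counter line that reads zero.
FAILURE_MARKERS = ("fail", "error")


def find_suspect_lines(output, markers=FAILURE_MARKERS):
    """Return the lines of `output` that contain any marker (case-insensitive)."""
    lowered = [m.lower() for m in markers]
    return [line for line in output.splitlines()
            if any(m in line.lower() for m in lowered)]


def build_report(suspect_lines, sender, recipient):
    """Build an e-mail listing the suspect lines, or None if there are none."""
    if not suspect_lines:
        return None
    msg = EmailMessage()
    msg["Subject"] = "idacontrol: possible disk problem"
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content("\n".join(suspect_lines))
    return msg


def check_and_mail(sender, recipient, smtp_host="localhost"):
    """Run `idacontrol show` once and mail any suspect lines.

    Returns True if a report was sent. The SMTP host is an assumption;
    a local mail transfer agent listening on localhost is presumed.
    """
    result = subprocess.run(["idacontrol", "show"],
                            capture_output=True, text=True, check=False)
    lines = find_suspect_lines(result.stdout + result.stderr)
    msg = build_report(lines, sender, recipient)
    if msg is None:
        return False
    with smtplib.SMTP(smtp_host) as smtp:
        smtp.send_message(msg)
    return True
```

Calling `check_and_mail()` from a small wrapper run by cron would give the periodic behaviour. As with idacontrol itself, this only reports on what the controller exposes, so it does not answer the question of physical-disk health raised above.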