Date:      Wed, 14 Nov 2007 20:16:12 -0500
From:      "jdow" <jdow@earthlink.net>
To:        <freebsd-questions@freebsd.org>
Subject:   Re: dealing with a failing drive
Message-ID:  <018c01c82725$1dfcd040$29a5a8c0@Thing>
References:  <4736593E.1090905@networktest.com><64c038660711102109x2ea186afjdd219292d8eed700@mail.gmail.com><47372644.4060201@networktest.com><20071112161416.GB98697@gizmo.acns.msu.edu><47388CCE.6080201@networktest.com> <20071112175351.GA99195@gizmo.acns.msu.edu>

From: "Jerry McAllister" <jerrymc@msu.edu>
Sent: Monday, November 12, 2007 12:53


> On Mon, Nov 12, 2007 at 09:26:38AM -0800, David Newman wrote:
> 
>> On 11/12/07 8:14 AM, Jerry McAllister wrote:
>> 
>> > An update: After doing what you suggest (leaving in the "good" disk,
>> > adding a new disk, RAID rebuilding) I still got soft write errors --
>> > with *either one* of the disks I tried.
>> > 
>> > Then I tried putting both disks in an identical server and they came up
>> > fine, no read or write errors.
>> > 
>> > Ergo, the RAID controller is bad and the disks may be OK.
>> > 
>> >> Probably not.
>> >> Generally, if the RAID controller is bad, you will see errors
>> >> all over and not in just one place, tho I suppose it is possible.
>> >> Check and see what it reports as error locations and see if they
>> >> move around any.
>> 
>> Jerry, thanks for your response.
>> 
>> After 36 hours of running the same disks in a different, identical
>> machine there hasn't been a single read or write error. I'm hardly a
>> storage expert but from the evidence I have I'm inclined to believe the
>> root cause was a bad RAID controller and not failed disks.
> 
> That is not much proof. 
> The different machine would probably be accessing the disks in
> a different way, either slightly different positioning or using
> different space.   Also, 36 hours is not really much time.

Dn, I have had a Promise controller that was bad. I kept getting errors
at one specific location on two disks out of three on a RAID 5. The
system continued to operate. When I finally spent the time to nail it
down to the controller I found the Promise people more than anxious to
get the beast for a postmortem. It had been bad for me from day one. It
would take about a week to a month for the problem to appear. After the
6th disk showed the problem at the same block number, the coin dropped
in my sometimes overly slow mind.

{^_-}    Joanne
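The check described above (the same block number failing on several different disks points at the controller rather than the drives) can be sketched roughly as follows. This is only an illustration: the (disk, block) record format, the disk labels, the block numbers, and the function name blocks_shared_across_disks are all assumptions made up for the example, not output from any real controller or FreeBSD log.

```python
from collections import defaultdict


def blocks_shared_across_disks(errors, min_disks=2):
    """Map each failing block to the disks it failed on, keeping only
    blocks that failed on at least `min_disks` distinct disks."""
    disks_by_block = defaultdict(set)
    for disk, block in errors:
        disks_by_block[block].add(disk)
    return {
        block: sorted(disks)
        for block, disks in disks_by_block.items()
        if len(disks) >= min_disks
    }


# Hypothetical error records: (disk label, failing block number).
sample = [
    ("disk1", 81234),
    ("disk2", 81234),
    ("disk3", 500),
    ("disk4", 81234),
]

# A block that fails on several different disks is a hint that the
# fault follows the controller (or cabling), not an individual drive.
print(blocks_shared_across_disks(sample))
```

A block that shows up for only one disk (like 500 here) is filtered out; those are the errors more likely to belong to the drive itself. It is a hint, not proof, and as noted in the thread the pattern may take weeks to show up.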


