Date:      Fri, 12 Sep 2008 11:50:41 -0400
From:      "Zaphod Beeblebrox" <zbeeble@gmail.com>
To:        "Karl Pielorz" <kpielorz_lst@tdx.co.uk>
Cc:        freebsd-hackers@freebsd.org, Jeremy Chadwick <koitsu@freebsd.org>
Subject:   Re: ZFS w/failing drives - any equivalent of Solaris FMA?
Message-ID:  <5f67a8c40809120850u60c23fc4m7c4c1341fb2c4966@mail.gmail.com>
In-Reply-To: <3BE629D093001F6BA2C6791C@Slim64.dmpriest.net.uk>
References:  <C984A6E7B1C6657CD8C4F79E@Slim64.dmpriest.net.uk> <20080912132102.GB56923@icarus.home.lan> <3BE629D093001F6BA2C6791C@Slim64.dmpriest.net.uk>

On Fri, Sep 12, 2008 at 10:34 AM, Karl Pielorz <kpielorz_lst@tdx.co.uk> wrote:


> --On 12 September 2008 06:21 -0700 Jeremy Chadwick <koitsu@FreeBSD.org>
> wrote:
>
>> As far as I know, there is no such "standard" mechanism in FreeBSD.  If
>> the drive falls off the bus entirely (e.g. detached), I would hope ZFS
>> would notice that.  I can imagine it (might) also depend on if the disk
>> subsystem you're using is utilising CAM or not (e.g. disks should be daX
>> not adX); Scott Long might know if something like this is implemented in
>> CAM.  I'm fairly certain nothing like this is implemented in ata(4).
>>
>
> For ATA, at the moment - I don't think it'll notice even if a drive
> detaches. I think like my system the other day, it'll just keep issuing I/O
> commands to the drive, even if it's disappeared (it might get much 'quicker
> failures' if the device has 'gone' to the point of FreeBSD just quickly
> returning 'fail' for every request).


Since I had the opportunity, I tested this recently for both CAM and ATA.
Now the RAID engine was gmirror in both cases (my production hardware
doesn't do ZFS yet), but I expect the reaction to be somewhat the same.

Both systems were Dell 1U's.  One, an R200, had SATA disks attached to a
plain SATA controller.  I believe it may have supported RAID1, but I didn't
use that functionality.  When a drive was removed from it, it stalled for
some time (30 minutes?) and then resumed working.  By the time I could type
on the machine again, gmirror had decided that the drive was gone and marked
the mirror as degraded.

The other system was a 1950-III with a SCSI SAS controller attached to an
SAS hot-swap backplane.  The drives themselves were 750G SATA drives.
Yanking one of them resulted in about 5 seconds of disruption followed by
gmirror realizing the problem and marking the mirror degraded.

Neither system was heavily loaded during the test.
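
For anyone wanting to watch for the "degraded" state described above
without an FMA-style framework, a minimal sketch follows.  It assumes the
usual three-column layout of "gmirror status" (Name, Status, Components)
and only parses text handed to it.  The function name and the sample
text are hypothetical, made up for illustration, and were not captured
from either test machine.

```python
def degraded_mirrors(status_text):
    """Return names of mirrors whose Status column is not COMPLETE.

    Assumes rows of the form "mirror/<name>  <STATUS>  <component>",
    with extra components listed on continuation rows.
    """
    degraded = []
    for line in status_text.splitlines():
        fields = line.split()
        # Mirror rows start with "mirror/<name>"; the header row and
        # component-only continuation rows are skipped.
        if len(fields) >= 2 and fields[0].startswith("mirror/"):
            if fields[1] != "COMPLETE":
                degraded.append(fields[0])
    return degraded


# Illustrative sample only (assumed layout, invented device names).
SAMPLE = """\
      Name    Status  Components
mirror/gm0  DEGRADED  ad4
mirror/gm1  COMPLETE  ad6
                      ad8
"""

if __name__ == "__main__":
    print(degraded_mirrors(SAMPLE))
```

In practice the text would come from running "gmirror status" (from cron,
say) and the result would be mailed or logged.  That only reports a
mirror after gmirror has marked it degraded, so it does nothing about the
stall seen on the SATA system.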


