Date:      Thu, 10 Aug 2017 16:22:59 +0100
From:      Frank Leonhardt <frank2@fjl.co.uk>
To:        freebsd-hardware@freebsd.org
Subject:   Re: Do I need SAS drives?..
Message-ID:  <25450400-4ba2-76d4-605c-fce37c1c905b@fjl.co.uk>
In-Reply-To: <CAOtMX2heJM1ekUowWdrri8x40JYFcYoQDrj0U45qSekO0C-ezQ@mail.gmail.com>
References:  <4DFBCE11-913A-4FC9-937D-463B4D49816C@aldan.algebra.com> <CAOtMX2jeUbSm535Zvd_7aHfQao-dMs5zbU0o3GRWk%2BcmW1Nq=g@mail.gmail.com> <E50CE928-23D0-4415-A82C-FE2EA3D52512@gmail.com> <CAOtMX2heJM1ekUowWdrri8x40JYFcYoQDrj0U45qSekO0C-ezQ@mail.gmail.com>

On 10/08/2017 15:01, Alan Somers wrote:
> Really interesting answer Alan, thank you very much !
>> Slightly off-topic but I take this opportunity,
>> how do you check SAS drives health ?
>> I personally cron a background long test every 2 weeks (using smartmontools).
>> I did not experience SAS drive error yet, so not sure how this behaves.
>> Does the drive reports to FreeBSD when its read or write error rate cross
>> a threshold (so that we can replace it before it fails) ?
>> Or perhaps smartd will do ?
>>
>> As an example below a SAS error counter log returned by smartctl :
>>      Errors Corrected by          Total   Correction    Gigabytes    Total
>>          ECC         rereads/    errors   algorithm     processed    uncorrected
>>      fast | delayed  rewrites  corrected  invocations  [10^9 bytes]  errors
>> read:   0       49        0        49     233662     73743.588           0
>> write:  0        3        0         3      83996      9118.895           0
>> verify: 0        0        0         0      28712         0.000           0
>>
>> Thank you !
>>
>> Ben
> smartmontools is probably the best way to read SAS error logs.
> Interpreting them can be hard, though.  The Backblaze blog is probably
> the best place to get current advice.  But the easiest thing to do is
> certainly to wait until something fails hard.  With ZFS, you can have
> up to 3 drives' worth of redundancy, and hotspares too.

I concur with Alan. Trying to predict drive failure is a mug's game.
Very thorough research (e.g. Google, 2007) has shown it's a waste of time
trying.

With ZFS (or geom mirror) a drive will be "failed" as soon as there's a
problem, and you can get notification using a cron job that sends an
email if the output of zpool status (or gmirror status) contains
"DEGRADED".
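
A cron job along those lines could look something like the sketch below. It is only an illustration, not a tested monitoring tool: the function names, the "root" recipient and the subject line are made up for the example, and it assumes the stock mail(1) command is able to deliver local mail.

```python
import subprocess


def is_degraded(status_text: str) -> bool:
    """Return True if the status output mentions a DEGRADED pool or mirror."""
    return "DEGRADED" in status_text


def check_and_notify(recipient: str = "root") -> bool:
    """Run `zpool status` and mail the report if anything is DEGRADED.

    The recipient default is an assumption; use whatever address you read.
    Returns True if a notification was sent.
    """
    # `zpool status` prints the state of every imported pool.
    # For a geom mirror, substitute ["gmirror", "status"].
    result = subprocess.run(
        ["zpool", "status"], capture_output=True, text=True, check=False
    )
    report = result.stdout + result.stderr
    if not is_degraded(report):
        return False
    # Hand the full report to mail(1) on standard input.
    subprocess.run(
        ["mail", "-s", "Storage DEGRADED", recipient],
        input=report, text=True, check=False,
    )
    return True
```

To use it, save it as a script, call check_and_notify() at the bottom, and run it from cron as often as you like. It only looks for the string "DEGRADED", as described above, so any other pool state would need its own check.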

That said, I've found it useful to use smartctl to pick up when a drive 
is overheating, usually due to fan failure. You might also find the new 
(11.0+?) sesutil handy to monitor components on a SAS expander IF YOU 
HAVE ONE. Things like fans and temperature sensors are readable this way.
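
For the overheating check, a rough sketch follows. It assumes smartctl prints a "Current Drive Temperature:" line, which is what it shows for SAS/SCSI drives; SATA drives report temperature differently and would need different parsing. The function names and the 50 C limit are arbitrary choices for the example, not recommendations.

```python
import re
from typing import Optional


def drive_temperature_c(smartctl_output: str) -> Optional[int]:
    """Extract the current temperature in Celsius from smartctl text.

    Returns None when no "Current Drive Temperature" line is present.
    """
    match = re.search(r"Current Drive Temperature:\s+(\d+)\s*C", smartctl_output)
    return int(match.group(1)) if match else None


def is_overheating(smartctl_output: str, limit_c: int = 50) -> bool:
    """Return True if the reported temperature is at or above limit_c.

    limit_c is an illustrative threshold; pick one that suits your drives.
    """
    temp = drive_temperature_c(smartctl_output)
    return temp is not None and temp >= limit_c
```

Feed it the text from something like "smartctl -A /dev/da0" (the device name is just an example) and send an email when is_overheating() returns True, in the same way as for the DEGRADED check.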

Regards, Frank.