Date: Thu, 10 Aug 2017 16:22:59 +0100
From: Frank Leonhardt <frank2@fjl.co.uk>
To: freebsd-hardware@freebsd.org
Subject: Re: Do I need SAS drives?..
Message-ID: <25450400-4ba2-76d4-605c-fce37c1c905b@fjl.co.uk>
In-Reply-To: <CAOtMX2heJM1ekUowWdrri8x40JYFcYoQDrj0U45qSekO0C-ezQ@mail.gmail.com>
References: <4DFBCE11-913A-4FC9-937D-463B4D49816C@aldan.algebra.com>
 <CAOtMX2jeUbSm535Zvd_7aHfQao-dMs5zbU0o3GRWk%2BcmW1Nq=g@mail.gmail.com>
 <E50CE928-23D0-4415-A82C-FE2EA3D52512@gmail.com>
 <CAOtMX2heJM1ekUowWdrri8x40JYFcYoQDrj0U45qSekO0C-ezQ@mail.gmail.com>

On 10/08/2017 15:01, Alan Somers wrote:
> Really interesting answer Alan, thank you very much !
>> Slightly off-topic but I take this opportunity,
>> how do you check SAS drives health ?
>> I personally cron a background long test every 2 weeks (using smartmontools).
>> I did not experience SAS drive error yet, so not sure how this behaves.
>> Does the drive reports to FreeBSD when its read or write error rate cross
>> a threshold (so that we can replace it before it fails) ?
>> Or perhaps smartd will do ?
>>
>> As an example below a SAS error counter log returned by smartctl :
>>            Errors Corrected by           Total   Correction     Gigabytes    Total
>>                ECC          rereads/    errors   algorithm      processed    uncorrected
>>            fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
>> read:          0       49         0        49     233662      73743.588           0
>> write:         0        3         0         3      83996       9118.895           0
>> verify:        0        0         0         0      28712          0.000           0
>>
>> Thank you !
>>
>> Ben
> smartmontools is probably the best way to read SAS error logs.
> Interpreting them can be hard, though.  The Backblaze blog is probably
> the best place to get current advice.  But the easiest thing to do is
> certainly to wait until something fails hard.  With ZFS, you can have
> up to 3 drives' worth of redundancy, and hotspares too.

I concur with Alan. Trying to predict drive failure is a mug's game.
Very thorough research (e.g. Google, 2007) has shown it's a waste of
time trying. With ZFS (or geom mirror) a drive will be "failed" as soon
as there's a problem, and you can get notification using a cron job that
sends an email if the output of zpool status (or gmirror status)
contains "DEGRADED".

That said, I've found it useful to use smartctl to pick up when a drive
is overheating, usually due to fan failure. You might also find the new
(11.0+?) sesutil handy to monitor components on a SAS expander IF YOU
HAVE ONE. Things like fans and temperature sensors are readable this way.

Regards, Frank.
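
For anyone wanting a starting point, the cron job described above could be sketched roughly as follows in Python. This is only an illustration, not a tested production script: the addresses in MAIL_TO and MAIL_FROM, the use of a local MTA listening on localhost, and all the function names are assumptions of the sketch, not anything from the thread. It just looks for the string "DEGRADED" in the output of zpool status and gmirror status and mails the full output if it finds it.

```python
#!/usr/bin/env python3
"""Mail a warning when zpool status or gmirror status reports DEGRADED."""
import smtplib
import subprocess
from email.message import EmailMessage

# Assumed values for illustration only; adjust for your own system.
MAIL_TO = "root@localhost"
MAIL_FROM = "root@localhost"
STATUS_COMMANDS = (["zpool", "status"], ["gmirror", "status"])


def is_degraded(status_output):
    """Return True if the status text mentions a DEGRADED pool or mirror."""
    return "DEGRADED" in status_output


def run_status(command):
    """Run a status command and return its combined output.

    Returns an empty string if the command cannot be started
    (for example, gmirror is not installed on this host).
    """
    try:
        result = subprocess.run(command, capture_output=True, text=True)
    except OSError:
        return ""
    return result.stdout + result.stderr


def build_alert(command, status_output):
    """Build the warning email, with the full status output as the body."""
    msg = EmailMessage()
    msg["Subject"] = "DEGRADED reported by " + " ".join(command)
    msg["From"] = MAIL_FROM
    msg["To"] = MAIL_TO
    msg.set_content(status_output)
    return msg


def main():
    for command in STATUS_COMMANDS:
        output = run_status(command)
        if is_degraded(output):
            # Assumes a local MTA accepting mail on localhost port 25.
            with smtplib.SMTP("localhost") as smtp:
                smtp.send_message(build_alert(command, output))


if __name__ == "__main__":
    main()
```

Saved somewhere such as /usr/local/sbin/check_degraded.py (a hypothetical path), it could be run from root's crontab at whatever interval suits you. Keeping the string check in its own function makes it easy to try against saved status output before trusting it.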