Date:      Wed, 3 Aug 2016 16:59:13 +0100
From:      Arthur Chance <freebsd@qeng-ho.org>
To:        Jon Radel <jon@radel.com>, "Brandon J. Wandersee" <brandon.wandersee@gmail.com>
Cc:        "William A. Mahaffey III" <wam@hiwaay.net>, FreeBSD Questions !!!! <freebsd-questions@freebsd.org>
Subject:   Re: Ominous smartd messages ....
Message-ID:  <c20fef42-b57c-6842-0de8-0e9418ee7d50@qeng-ho.org>
In-Reply-To: <7f1afc31-7eda-ba4c-41ea-046a091d6055@radel.com>
References:  <e5a65f8a-27a0-65e7-42db-28bef824e0c0@hiwaay.net> <117bb75c-aa6a-d562-c971-d0bab742f5ad@radel.com> <8637mmdkah.fsf@WorkBox.Home> <7f1afc31-7eda-ba4c-41ea-046a091d6055@radel.com>

On 03/08/2016 15:09, Jon Radel wrote:
> On 8/3/16 10:00 AM, Brandon J. Wandersee wrote:
>>
>> Jon Radel writes:
>>
>>> I've read reasonable sounding commentary from people running very, very
>>> large collections of hard drives that there is a high enough correlation
>>> between this error and the drive going to heck sooner rather than later
>>> that they take this as a sign to replace.  [Can't find reference right now.]
>>
>> While there's no way to know from the error message alone just what will
>> happen to the disk in the coming days, the general reasoning is this:
>> sectors are not physically segregated. They all sit on the same
>> platter. Several bad sectors occurring in a short period might be a sign
>> of physical fault in the platter, and if that fault is real then stress
>> from the platter spinning will likely cause that fault to spread. So
>> some people conclude that the appearance of several bad sectors in a
>> short period should just be a signal to replace the disk immediately.
>>
> 
> If I remember the discussion well enough (sad that I can't find it) my
> use of "correlation" was precise.  They actually manage enough drives
> (thousands) and kept enough records to allow for statistical analysis,
> which indicates that this smartd error correlates very well with failure
> within [I wish I could remember] timeframe.
> 
> Do please excuse the utter lack of footnotes.  :-(
> 

I think everyone is probably thinking of Backblaze. This is their latest
summary of drive statistics:

https://www.backblaze.com/blog/hard-drive-failure-rates-q2-2016/

And this is their take on which SMART metrics matter:

https://www.backblaze.com/blog/hard-drive-smart-stats/
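
For anyone who wants to check their own drive against that list, here is
a rough Python sketch. It assumes the usual column layout of
"smartctl -A" output from smartmontools, and it assumes the attributes
the article singles out are SMART 5, 187, 188, 197 and 198 (that is from
memory, so check the article). The function name and the device path
/dev/ada0 are made up for illustration.

```python
import subprocess

# Attribute IDs assumed to be the ones the Backblaze article watches;
# verify against the article before relying on this set.
WATCHED = {5, 187, 188, 197, 198}


def nonzero_watched(smartctl_output):
    """Return {id: (name, raw_value)} for watched attributes with a non-zero raw value."""
    hits = {}
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        attr_id = int(fields[0])
        if attr_id not in WATCHED:
            continue
        try:
            raw = int(fields[9])  # RAW_VALUE is the tenth column
        except ValueError:
            continue  # skip raw values that are not plain integers
        if raw != 0:
            hits[attr_id] = (fields[1], raw)
    return hits


if __name__ == "__main__":
    # /dev/ada0 is a placeholder; smartctl normally needs root.
    out = subprocess.run(
        ["smartctl", "-A", "/dev/ada0"],
        capture_output=True, text=True,
    ).stdout
    for attr_id, (name, raw) in sorted(nonzero_watched(out).items()):
        print(f"{attr_id:>3} {name}: raw value {raw}")
```

A non-zero count here is only a prompt to look closer and check backups,
not a verdict on the drive.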

-- 
Moore's Law of Mad Science: Every eighteen months, the minimum IQ
necessary to destroy the world drops by one point.


