Date:      Mon, 22 Jun 2015 01:30:47 +0200
From:      Willem Jan Withagen <wjw@digiware.nl>
To:        Quartz <quartz@sneakertech.com>
Cc:        fs@freebsd.org
Subject:   Re: This diskfailure should not panic a system, but just disconnect disk from ZFS
Message-ID:  <55874927.80807@digiware.nl>
In-Reply-To: <5587236A.6020404@sneakertech.com>
References:  <5585767B.4000206@digiware.nl> <5587236A.6020404@sneakertech.com>

On 21/06/2015 22:49, Quartz wrote:
> Also:
> 
>> And thus I would have expected that ZFS would disconnect /dev/da0 and
>> then switch to DEGRADED state and continue, letting the operator fix the
>> broken disk.
> 
>> Next question to answer is why this WD RED on:
> 
>> got hung, and nothing for this shows in SMART....
> 
> You have a raidz2, which means THREE disks need to go down before the
> pool is unwritable. The problem is most likely your controller or power
> supply, not your disks.

But shouldn't the pool still become DEGRADED if one of the disks goes
into an error state? It is really nice that the pool would still have
'raidz1'-level redundancy left, but the failure does need to get fixed...
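
To spell out what I would expect, here is a minimal sketch. The function
name and the simplified state strings are my own illustration, modelled on
what `zpool status` prints; this is not how ZFS itself is implemented:

```python
def expected_vdev_state(parity: int, failed_disks: int) -> str:
    """Return the state one would expect for a raidz vdev.

    parity: number of parity disks (1 = raidz1, 2 = raidz2, 3 = raidz3).
    failed_disks: number of member disks that are currently unavailable.
    The returned strings are a simplification of the states ZFS reports.
    """
    if parity not in (1, 2, 3):
        raise ValueError("parity must be 1, 2 or 3")
    if failed_disks < 0:
        raise ValueError("failed_disks cannot be negative")
    if failed_disks == 0:
        return "ONLINE"
    if failed_disks <= parity:
        # Enough redundancy left: keep running and let the operator fix it.
        return "DEGRADED"
    # More failures than parity disks: data can no longer be reconstructed.
    return "FAULTED"
```

So for my raidz2, `expected_vdev_state(2, 1)` gives "DEGRADED", and only a
third failed disk should take the pool down.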

> Also2: don't rely too much on SMART for determining drive health. Google
> released a paper a few years ago revealing that half of all drives die
> without reporting SMART errors.
> 
> http://research.google.com/archive/disk_failures.pdf

That paper is mainly about forecasting disk failure based on SMART
numbers, because the first "failures" in SMART do not require one to
immediately replace the disk. The common idea is that if the numbers
grow, you should expect the device to break.
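
The "numbers grow" idea can be sketched roughly as below. The attribute
names are the ones `smartctl -A` prints; which of them to watch, the
function name, and the sample readings are assumptions made up for
illustration, not values from my disk:

```python
from typing import Dict, Iterable

# Attribute names as printed by `smartctl -A`.
# Which counters are worth watching is an assumption for this sketch.
WATCHED = (
    "Reallocated_Sector_Ct",
    "Current_Pending_Sector",
    "Offline_Uncorrectable",
)


def growing_counters(
    previous: Dict[str, int],
    current: Dict[str, int],
    watched: Iterable[str] = WATCHED,
) -> Dict[str, int]:
    """Return {attribute: increase} for watched counters that grew."""
    growth = {}
    for name in watched:
        # Only compare counters that are present in both readings.
        if name in previous and name in current:
            delta = current[name] - previous[name]
            if delta > 0:
                growth[name] = delta
    return growth


if __name__ == "__main__":
    # Made-up sample readings, one week apart.
    last_week = {"Reallocated_Sector_Ct": 0, "Current_Pending_Sector": 0}
    today = {"Reallocated_Sector_Ct": 8, "Current_Pending_Sector": 0}
    print(growing_counters(last_week, today))
```

An empty result means none of the watched counters grew between the two
readings, which by itself says nothing about a disk that hangs without
logging anything.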

I was just looking at the counters to see whether the disk had logged
any info, warning, or error that could have anything to do with the
problem I have.

Thanx,
--WjW
