Date:      Mon, 22 Jun 2015 17:41:19 +0200
From:      Willem Jan Withagen <wjw@digiware.nl>
To:        Michelle Sullivan <michelle@sorbs.net>,  Quartz <quartz@sneakertech.com>
Cc:        fs@freebsd.org
Subject:   Re: This diskfailure should not panic a system, but just disconnect disk from ZFS
Message-ID:  <55882C9F.8020507@digiware.nl>
In-Reply-To: <558769B5.601@sorbs.net>
References:  <5585767B.4000206@digiware.nl> <5587236A.6020404@sneakertech.com> <558769B5.601@sorbs.net>

On 22/06/2015 03:49, Michelle Sullivan wrote:
> Quartz wrote:
>> Also:
>>
>>> And thus I would have expected that ZFS would disconnect /dev/da0,
>>> switch to DEGRADED state, and continue, letting the operator fix the
>>> broken disk.
>>
>>> Next question to answer is why this WD RED on:
>>
>>> got hung, and nothing for this shows in SMART....
>>
>> You have a raidz2, which means THREE disks need to go down before the
>> pool is unwritable. The problem is most likely your controller or
>> power supply, not your disks.
>>
> Never make such assumptions...
> 
> I have worked in a professional environment where 9 of 12 disks failed
> within 24 hours of each other....  They were all supposed to be from
> different batches but due to an error they came from the same batch and
> the environment was so tightly controlled and the work-load was so
> similar that MTBF was almost identical on all 11 disks in the array...
> the only disk that lasted more than 2 weeks over the failure was the
> hotspare...!
> 
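[The anecdote above is exactly why independence assumptions break down. A toy calculation, purely illustrative (the failure probabilities below are assumed numbers, not measurements): under an independent-failure model, 9 of 12 disks dying in the same 24-hour window is astronomically unlikely, but a single common cause such as a shared bad batch collapses the exponent entirely.]

```python
from math import comb

def prob_at_least_k_fail(n, k, p):
    """Binomial tail: probability that at least k of n independent
    disks fail, each with per-interval failure probability p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Independent model, with a pessimistic assumed daily failure rate:
p_daily = 1e-3
p_independent = prob_at_least_k_fail(12, 9, p_daily)
print(f"independent model: {p_independent:.3e}")  # on the order of 1e-25

# Common-cause model: one shared defect (same batch, same workload,
# same environment) fails the whole set at once, so the probability
# is just that of the common cause itself -- assumed here as 1e-4.
p_batch = 1e-4
print(f"common-cause model: {p_batch:.1e}")
```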

Scary (non-)statistics....
Theories are always nice, but this sort of experience makes your hair go
grey overnight.

--WjW


