Date: Mon, 22 Jun 2015 17:41:19 +0200
From: Willem Jan Withagen <wjw@digiware.nl>
To: Michelle Sullivan <michelle@sorbs.net>, Quartz <quartz@sneakertech.com>
Cc: fs@freebsd.org
Subject: Re: This diskfailure should not panic a system, but just disconnect disk from ZFS
Message-ID: <55882C9F.8020507@digiware.nl>
In-Reply-To: <558769B5.601@sorbs.net>
References: <5585767B.4000206@digiware.nl> <5587236A.6020404@sneakertech.com> <558769B5.601@sorbs.net>
On 22/06/2015 03:49, Michelle Sullivan wrote:
> Quartz wrote:
>> Also:
>>
>>> And thus I would have expected that ZFS would disconnect /dev/da0 and
>>> then switch to DEGRADED state and continue, letting the operator fix the
>>> broken disk.
>>
>>> Next question to answer is why this WD RED on:
>>
>>> got hung, and nothing for this shows in SMART....
>>
>> You have a raidz2, which means THREE disks need to go down before the
>> pool is unwritable. The problem is most likely your controller or
>> power supply, not your disks.
>>
> Never make such assumptions...
>
> I have worked in a professional environment where 9 of 12 disks failed
> within 24 hours of each other.... They were all supposed to be from
> different batches but due to an error they came from the same batch and
> the environment was so tightly controlled and the work-load was so
> similar that MTBF was almost identical on all 11 disks in the array...
> the only disk that lasted more than 2 weeks after the failure was the
> hotspare...!
> Scary (non)-statistics....

Theories are always nice, but this sort of experience makes your hair go grey overnight.

--WjW
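A footnote on the raidz2 point above: the usual "three disks must fail" comfort assumes failures are independent, which is exactly the assumption Michelle's same-batch story breaks. A minimal sketch of the arithmetic (the 6-disk vdev size and the 5% failure probability are illustrative assumptions, not numbers from this thread):

```python
from math import comb

def pool_loss_prob(n_disks: int, parity: int, p: float) -> float:
    """Probability that more than `parity` of `n_disks` fail in some window,
    assuming each disk fails independently with probability p."""
    return sum(comb(n_disks, k) * p**k * (1 - p)**(n_disks - k)
               for k in range(parity + 1, n_disks + 1))

# raidz2 (parity=2) on a hypothetical 6-disk vdev, p = 5% per disk:
independent = pool_loss_prob(6, 2, 0.05)
print(f"independent failures: {independent:.4f}")

# If the disks come from one batch under identical load, failures are
# correlated: in the worst case they fail together, so the pool loss
# probability collapses back toward the single-disk p.
correlated_worst_case = 0.05
print(f"fully correlated (worst case): {correlated_worst_case:.4f}")
```

Under the independence assumption the loss probability is tiny (a fraction of a percent); with batch-correlated failures the parity buys far less, which is why mixing batches (and suspecting shared components like the controller or PSU, as Quartz does) matters.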
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?55882C9F.8020507>