From: Willem Jan Withagen
Date: Mon, 22 Jun 2015 17:41:19 +0200
To: Michelle Sullivan, Quartz
CC: fs@freebsd.org
Subject: Re: This diskfailure should not panic a system, but just disconnect disk from ZFS
Message-ID: <55882C9F.8020507@digiware.nl>
In-Reply-To: <558769B5.601@sorbs.net>

On 22/06/2015 03:49, Michelle Sullivan wrote:
> Quartz wrote:
>> Also:
>>
>>> And thus I would have expected that ZFS would disconnect /dev/da0 and
>>> then switch to DEGRADED state and continue, letting the operator fix the
>>> broken disk.
>>
>>> Next question to answer is why this WD RED on:
>>
>>> got hung, and nothing for this shows in SMART....
>>
>> You have a raidz2, which means THREE disks need to go down before the
>> pool is unwritable. The problem is most likely your controller or
>> power supply, not your disks.
>>
> Never make such assumptions...
>
> I have worked in a professional environment where 9 of 12 disks failed
> within 24 hours of each other.... They were all supposed to be from
> different batches but due to an error they came from the same batch and
> the environment was so tightly controlled and the work-load was so
> similar that MTBF was almost identical on all 11 disks in the array...
> the only disk that lasted more than 2 weeks over the failure was the
> hotspare...!
>

Scary (non)-statistics....
Theories are always nice, but this sort of experience makes your hair go
grey overnight.

--WjW