Date: Sat, 20 Jul 2013 16:07:40 -0400 (EDT)
From: Daniel Feenberg <feenberg@nber.org>
X-X-Sender: feenberg@nber6
To: "Steve O'Hara-Smith" <steve@sohara.org>
Subject: Re: to gmirror or to ZFS
In-Reply-To: <20130720201214.90206565e00675611996176d@sohara.org>
Message-ID: <Pine.GSO.4.64.1307201558490.29785@nber6>
References: <4DFBC539-3CCC-4B9B-AB62-7BB846F18530@gmail.com>
 <alpine.BSF.2.00.1307152211180.74094@wonkity.com>
 <976836C5-F790-4D55-A80C-5944E8BC2575@gmail.com>
 <51E51558.50302@ShaneWare.Biz> <51E52190.7020008@fjl.co.uk>
 <CAOaKuAVULVuZxtExp=mNi-J7kMNbsxbLJVsv8nKTA0-Ru6M3+w@mail.gmail.com>
 <6CE5718E-2646-4D8C-AF98-37384B8851C5@mac.com>
 <CAOaKuAU8nhaoq+6hCVkB+b-ppiBvYPKANdWJRnYcmKaPdecwZA@mail.gmail.com>
 <DCC017BE-A293-4C1B-8B6F-D9AF6F50125B@mac.com> <51EAC56C.4030801@fjl.co.uk>
 <20130720201214.90206565e00675611996176d@sohara.org>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: frank2@fjl.co.uk, freebsd-questions@freebsd.org


On Sat, 20 Jul 2013, Steve O'Hara-Smith wrote:

> On Sat, 20 Jul 2013 18:14:20 +0100
> Frank Leonhardt <frank2@fjl.co.uk> wrote:
>
>> It's worth noting, as a warning for anyone who hasn't been there, that
>> the number of times a second drive in a RAID system fails during a
>> rebuild is higher than would be expected. During a rebuild the remaining
>> drives get thrashed, hot, and if they're on the edge, that's when
>> they're going to go. And at the most inconvenient time. Okay - obvious
>> when you think about it, but this tends to be too late.
>
> 	Having the cabinet stuffed full of nominally identical drives
> bought at the same time from the same supplier tends to add to the
> probability that more than one drive is on the edge when one goes. It's a
> pity there are now only two manufacturers of spinning rust.

This is often presumed to be the reason for double failures close together 
in time. Common-mode failures, such as a bad environment, a defective 
power supply or excess voltage, can also be blamed. I have to think, 
though, that the most common "cause" of a second failure soon after the 
first is that a failed drive often isn't detected until a particular 
sector is read or written. Since the rebuild reads and writes every 
sector on multiple disks, including unused sectors, it can "detect" latent 
problems: sectors that may have been bad since the drive was new but 
haven't been used for data yet, or that have gone bad since the last write 
but haven't been read since.
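
To put rough numbers on that argument, here is a minimal Python sketch. 
It assumes latent bad sectors occur independently with some small 
per-sector probability. That independence, the disk size and the rate 
P_BAD are all made-up illustrative values, not measurements, and the 
function name is hypothetical. It only shows how the chance of tripping 
over a latent error grows with the number of sectors touched.

```python
import math


def p_latent_error_found(sectors_read: int, p_bad: float) -> float:
    """Chance that reading `sectors_read` sectors hits at least one
    latent bad sector, assuming independent errors with per-sector
    probability `p_bad` (0 <= p_bad < 1). A simplifying assumption."""
    # 1 - (1 - p)^n, written with log1p/expm1 so a tiny p does not
    # get lost to floating-point rounding.
    return -math.expm1(sectors_read * math.log1p(-p_bad))


# Hypothetical disk: 2 TB of 512-byte sectors.
TOTAL_SECTORS = 2 * 10**12 // 512
# Illustrative per-sector latent error probability (an assumption).
P_BAD = 1e-10

full = p_latent_error_found(TOTAL_SECTORS, P_BAD)
tenth = p_latent_error_found(TOTAL_SECTORS // 10, P_BAD)
print(f"every sector read (rebuild): {full:.3f}")
print(f"10% of sectors read:         {tenth:.3f}")
```

Under these assumptions a pass over the whole disk is far more likely to 
surface a latent error than normal use that touches only a fraction of 
it, which is the effect described above.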

The ZFS scrub processes only sectors with data, so it provides only 
partial protection against double failures.
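
A toy Python model of that coverage gap, as a sketch only: the disk 
size, the set of used sectors and the bad sector numbers are all 
invented for illustration, and the helper `detectable` is a hypothetical 
name, not anything from ZFS.

```python
def detectable(bad_sectors, sectors_read):
    """Latent bad sectors that a pass over `sectors_read` would find."""
    return set(bad_sectors) & set(sectors_read)


TOTAL = 1000                 # hypothetical tiny disk, sectors 0..999
used = set(range(0, 300))    # sectors that currently hold data
bad = {42, 250, 700, 999}    # hypothetical latent bad sectors

# A scrub that reads only sectors with data.
scrub_finds = detectable(bad, used)
# A rebuild that reads every sector, used or not.
rebuild_finds = detectable(bad, range(TOTAL))
# Problems still waiting to be found after a clean scrub.
missed_by_scrub = rebuild_finds - scrub_finds

print("scrub finds:    ", sorted(scrub_finds))
print("rebuild finds:  ", sorted(rebuild_finds))
print("missed by scrub:", sorted(missed_by_scrub))
```

In this made-up layout the scrub catches the two bad sectors that lie in 
used space and leaves the other two to be found later by a pass that 
touches every sector.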

Daniel Feenberg
NBER

