From: Daniel Feenberg
Date: Sat, 20 Jul 2013 16:07:40 -0400 (EDT)
To: "Steve O'Hara-Smith"
Cc: frank2@fjl.co.uk, freebsd-questions@freebsd.org
Subject: Re: to gmirror or to ZFS

On Sat, 20 Jul 2013, Steve O'Hara-Smith wrote:

> On Sat, 20 Jul 2013 18:14:20 +0100
> Frank Leonhardt wrote:
>
>> It's worth noting, as a warning for anyone who hasn't been there, that
>> the number of times a second drive in a RAID system fails during a
>> rebuild is higher than would be expected.
>> During a rebuild the remaining
>> drives get thrashed, hot, and if they're on the edge, that's when
>> they're going to go. And at the most inconvenient time. Okay - obvious
>> when you think about it, but this tends to be too late.
>
> Having the cabinet stuffed full of nominally identical drives
> bought at the same time from the same supplier tends to add to the
> probability that more than one drive is on the edge when one goes. It's a
> pity there are now only two manufacturers of spinning rust.

Often this is presumed to be the reason for double failures close in
time. Common-mode failures such as the environment, a defective power
supply, or excess voltage can also be blamed. I have to think that the
most common "cause" of a second failure soon after the first is that a
failed drive often isn't detected until a particular sector is read or
written. Since the resilvering reads and writes every sector on multiple
disks, including unused sectors, it can "detect" latent problems in
sectors that have been bad since the drive was new but haven't been used
for data yet, or that have gone bad since the last write but haven't
been read since. The ZFS scrub processes only sectors with data, so it
provides only partial protection against double failures.

Daniel Feenberg
NBER
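
The argument about latent sectors can be made concrete with a toy
model. The Python sketch below is only an illustration, not a model of
how gmirror or ZFS actually work: the disk size, the fraction of the
disk holding data, the number of latent bad sectors, the assumption
that data occupies the low-numbered sectors, and the function name
simulate are all hypothetical choices. It compares which latent bad
sectors are read by a pass over data-bearing sectors only (a scrub)
and by a pass over every sector (a whole-disk rebuild).

```python
import random


def simulate(total_sectors=10_000, used_fraction=0.4, latent_bad=5, seed=0):
    """Toy model: which latent bad sectors does each kind of pass read?

    All parameters are hypothetical; nothing here is measured data.
    """
    rng = random.Random(seed)

    # Assumption: data occupies the first `used` sectors of the disk.
    used = int(total_sectors * used_fraction)

    # Assumption: latent defects are scattered uniformly over the disk.
    bad = set(rng.sample(range(total_sectors), latent_bad))

    # A scrub, in this model, reads only sectors that hold data.
    found_by_scrub = {s for s in bad if s < used}

    # A whole-disk rebuild reads every sector, used or not.
    found_by_rebuild = set(bad)

    return found_by_scrub, found_by_rebuild


if __name__ == "__main__":
    scrub, rebuild = simulate()
    print("latent bad sectors a scrub would read:  ", len(scrub))
    print("latent bad sectors a rebuild would read:", len(rebuild))
    print("left undetected by the scrub:           ", len(rebuild - scrub))
```

Under these assumptions the scrub is expected to miss a share of the
latent defects roughly equal to the unused fraction of the disk, which
is the sense in which it gives only partial protection.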