From owner-freebsd-fs@FreeBSD.ORG  Fri Jul  6 07:08:52 2012
In-Reply-To: <1341537402.58301.YahooMailClassic@web122504.mail.ne1.yahoo.com>
References: <1341537402.58301.YahooMailClassic@web122504.mail.ne1.yahoo.com>
Date: Fri, 6 Jul 2012 03:08:46 -0400
Message-ID: <CACpH0MemwZDCXsh4USzeFHUO8fbW09TSOYyVPa2dWmKc8N+=_Q@mail.gmail.com>
From: Zaphod Beeblebrox <zbeeble@gmail.com>
To: Jason Usher <jusher71@yahoo.com>
Content-Type: text/plain; charset=ISO-8859-1
Cc: freebsd-fs@freebsd.org
Subject: Re: vdev/pool math with combined raidzX vdevs...

Is there some penalty for not googling some basic stats course?  OK.
This is from memory (hint: you probably should google).

p(f) ... the probability of failure of one drive over some unit time (say
one year).  A two drive RAID-0 array has probability p(2dr0) = p(f) +
p(f) = 2 * p(f).  That is (for the logic guys): the array fails if either
drive fails.  A two drive RAID-1 array has probability p(2dr1) = p(f)
* p(f) ... that is: the array fails only if both drives fail.  These
are simple probabilities.  It doesn't count that the RAID-1 case can
be said to be more complex ... ie: given a certain failure
distribution and a certain replacement distribution, what is the
chance of total failure given a single drive failure (ie: 2nd drive
failing before you replace the first drive to fail).

... but you get the gist.  If no replacements are allowed and 10% of
drives fail in a year, the R0 array has a 20% chance of failing in 1
year and the R1 array has a 1% chance (10% * 10%).
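
As a quick check of the add-versus-multiply arithmetic, here is a minimal Python sketch.  The function names are made up for illustration, and it assumes independent failures with no replacement.  It also prints the exact RAID-0 figure, 1 - (1 - p)^2, next to the simple sum, since the sum is only an approximation that works while p(f) is small.

```python
# Simple failure probabilities for two-drive arrays, assuming independent
# failures and no drive replacement (the same assumptions as the text).

def raid0_fail_approx(p: float, drives: int = 2) -> float:
    """Add the per-drive probabilities: any single failure kills the array."""
    return drives * p


def raid0_fail_exact(p: float, drives: int = 2) -> float:
    """Exact form: 1 minus the chance that every drive survives."""
    return 1.0 - (1.0 - p) ** drives


def raid1_fail(p: float, drives: int = 2) -> float:
    """Multiply the per-drive probabilities: every mirror must fail."""
    return p ** drives


p_f = 0.10  # 10% of drives fail in a year, as in the example above
print(f"RAID-0 (sum):   {raid0_fail_approx(p_f):.2%}")
print(f"RAID-0 (exact): {raid0_fail_exact(p_f):.2%}")
print(f"RAID-1:         {raid1_fail(p_f):.2%}")
```

With p(f) = 10% this prints 20.00%, 19.00% and 1.00%, so the simple sum overstates the RAID-0 risk by a point at this failure rate.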

... this also says nothing about the fact that the drives are the same
and have done mostly the same reads and writes and that their failure
may not be independent.

Geez... it's getting complex.

Now... we start getting into the hard stuff.  For a RAID-Z(1) array,
you want to think about the possibility of 2 failures out of 12 drives
(or ... if you're feeling up to it, the probability of the first
failure and then the probability of the second failure given the first
before you can replace it).  p(12drz1) = 12 * p(f) * 11 * p(f)  --- if
no replacements are allowed and drive failures are independent.  To
kick it up a notch, the "11 * p(f)" can be replaced with eleven times
the probability of failure before replacement --- which you can
calculate with your MTBF tables and your service level for replacing
drives in the array.
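
To make the "failure before replacement" idea concrete, here is a rough Python sketch.  It assumes (this is my assumption, not something from MTBF tables) that failures arrive at a constant rate, so the chance of a drive dying inside a replacement window is just p(f) scaled by the length of the window.  The function names and the 4 hour window are illustrative only.

```python
HOURS_PER_YEAR = 24 * 365  # 8760


def p_fail_in_window(p_year: float, window_hours: float) -> float:
    """Scale a yearly failure probability down to a shorter window.

    Assumption (not from the text): failures arrive at a constant rate,
    so the probability is proportional to the length of the window.
    """
    return p_year * window_hours / HOURS_PER_YEAR


def raidz1_fail_with_replacement(n_drives: int, p_year: float,
                                 window_hours: float) -> float:
    """First failure any time in the year, second one inside the window."""
    first = n_drives * p_year
    second = (n_drives - 1) * p_fail_in_window(p_year, window_hours)
    return first * second


p_f = 0.01  # illustrative 1% yearly failure probability
no_replacement = 12 * p_f * 11 * p_f
with_4h = raidz1_fail_with_replacement(12, p_f, 4)
print(f"12-drive RAID-Z1, no replacement:   {no_replacement:.4%}")
print(f"12-drive RAID-Z1, 4 hour replacement: {with_4h:.6%}")
```

Under these assumptions the 4 hour window takes the figure from about 1.3% down to roughly 0.0006%.  A real calculation would use the actual failure distribution and include rebuild (resilver) time in the window.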

Similarly,

p(12drz2) = 12 * p(f) * 11 * p(f) * 10 * p(f)
p(12drz3) = 12 * p(f) * 11 * p(f) * 10 * p(f) * 9 * p(f)

... again with those assumptions, or with more complex probabilities
given your replacement strategy.
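
The three formulas follow one pattern, so they can be folded into a single function.  A minimal sketch follows: `raidz_fail` is a hypothetical name, p(f) = 1% is just an illustrative value, and the same no-replacement, independent-failure assumptions apply.

```python
def raidz_fail(n_drives: int, parity: int, p: float) -> float:
    """Product form from the text: (n * p) * ((n - 1) * p) * ...

    A RAID-Z array with `parity` redundant drives is lost once
    parity + 1 drives have failed, so there are parity + 1 factors.
    """
    result = 1.0
    for i in range(parity + 1):
        result *= (n_drives - i) * p
    return result


p_f = 0.01  # illustrative 1% yearly failure probability
for parity in (1, 2, 3):
    print(f"p(12drz{parity}) = {raidz_fail(12, parity, p_f):.4%}")
```

At p(f) = 1% this gives roughly 1.32%, 0.132% and 0.0119%.  Note that the product counts failures in order (drive A then B is counted separately from B then A), so it comes out larger than a binomial count of unordered failure sets, C(n, k) * p(f)^k, by a factor of k!.  The sketch keeps the form used here.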

... so, again with simplistic assumptions,

p(36drz3 --- 12 drives, 3 groups) = p(12drz3) * 3

A "vanilla" RAID-Z2 (if I make an assumption as to what you're saying) is:

p(36drz2) = 36 * p(f) * 35 * p(f) * 34 * p(f)

... but I can't directly answer your question without knowing a) the
structure of the RAID-Z2 array and b) p(f).  If we use a 1% figure for
p(f), then p(36drz3,12,3) = 0.035% and p(36drz2) = 4.3%

... that is, the RAID-Z2 case (one group of 36 drives, two redundant
--- which is crazy) is 4.3% likely to fail where the 3-group RAID-Z3
is only 0.035% likely to fail.  As a more sane comparison,
p(36drz2,12,3) = 3 * p(12drz2) = 0.4%
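
The three pool layouts can be compared with a few lines of Python.  This is only a sketch under the same simplistic assumptions (no replacement, independent failures, p(f) = 1%).  `raidz_fail` and `pool_fail` are hypothetical helper names.

```python
def raidz_fail(n_drives: int, parity: int, p: float) -> float:
    """Product form: (n * p) * ((n - 1) * p) * ..., parity + 1 factors."""
    result = 1.0
    for i in range(parity + 1):
        result *= (n_drives - i) * p
    return result


def pool_fail(groups: int, drives_per_group: int, parity: int, p: float) -> float:
    """The pool is lost if any one group is lost, so add the group figures."""
    return groups * raidz_fail(drives_per_group, parity, p)


p_f = 0.01
layouts = {
    "3 x 12-drive RAID-Z3": pool_fail(3, 12, 3, p_f),
    "1 x 36-drive RAID-Z2": pool_fail(1, 36, 2, p_f),
    "3 x 12-drive RAID-Z2": pool_fail(3, 12, 2, p_f),
}
for name, prob in layouts.items():
    print(f"{name}: {prob:.3%}")
```

This prints about 0.036%, 4.284% and 0.396% for the three layouts, so splitting 36 drives into three RAID-Z2 groups is roughly ten times safer than one wide RAID-Z2 under this model, and three RAID-Z3 groups roughly ten times safer again.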

Now it's worth saying that all these calculations assume that you
never come to replace drives in the array and that drive failures are
independent... neither of these is likely true.  If you had (say) a 4
hour 24/7 contract with someone, the chances of more drives failing
before a failed drive is replaced are much smaller.  As for the
independence of drive failures... that's a discussion over beer.

Put simply, you add the probabilities of things where any can cause
the failure (either drive of R0 failing, any one of the 3 plexes of a
complex array failing) and you multiply things where all must fail to
produce failure.