From: Pieter de Goeje <pieter@degoeje.nl>
To: freebsd-questions@freebsd.org
Cc: prad
Date: Sat, 28 Jun 2008 13:02:20 +0200
Subject: Re: first pre-emptive raid
Message-Id: <200806281302.20814.pieter@degoeje.nl>
In-Reply-To: <20080628005702.2137bb8c@gom.home>

On Saturday 28 June 2008, prad wrote:
> 3. it seems that geom just does striping and mirroring, but vinum
> offers more configurability and is really the preferred choice?

Geom also does raid 3 and disk concatenation (JBOD); see the geom(8)
manpage. I think geom is preferred because it is better tested (in later
versions of FreeBSD) and easier to set up.
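For reference, setting up those two geom classes looks roughly like this
(the device names and volume names below are just examples for your
hardware, not anything specific to your setup):

```shell
# raid 3 wants 2^n + 1 providers (3, 5, 9, ...): here 2 data + 1 parity.
kldload geom_raid3                  # load the class if not in the kernel
graid3 label -v gr3vol da0 da1 da2  # create the array
newfs /dev/raid3/gr3vol             # new device appears under /dev/raid3/

# Plain concatenation (JBOD) of two disks into one big provider:
kldload geom_concat
gconcat label -v jbod da3 da4
newfs /dev/concat/jbod              # device appears under /dev/concat/
```

Both arrays are recreated automatically at boot from the metadata the
label command writes to the last sector of each disk.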
> 4.1 with 4 18G drives one thought is to do a raid1, but we really
> don't want 3 identical copies. is the only way to have 2 36G mirrors,
> by using raid0+1 or raid1+0?

If you want one logical "disk" you could also mirror both pairs and use
gconcat to add their sizes together.

> 4.2 another possibility is to do raid0, but is that ever wise unless
> you desperately need the space since in our situation you run a 1/4
> chance of going down completely?

Indeed, the chances have quadrupled: raid0 has no redundancy, so the
array is lost as soon as any one of the four drives fails, which is
roughly four times as likely as losing a single drive.

> 4.3 is striping or mirroring faster as far as i/o goes (or does the
> difference really matter)? i would have thought the former, but the
> handbook says "Striping requires somewhat more effort to locate the
> data, and it can cause additional I/O load where a transfer is spread
> over multiple disks" #20.3

Both are faster when reading data; raid 0 is also faster when writing.
When data blocks are spread over N disks, it is possible to achieve
sequential read speeds N times faster than a simple JBOD configuration
would give. However, the system also needs N times more bandwidth to the
disks to achieve this. If the disks are on a limited-speed shared bus,
one could imagine that the overhead of the extra I/O commands needed to
do raid0 actually impairs performance.

> 4.4 vinum introduces raid5 with striping and data integrity, but
> exactly what are the parity blocks? furthermore, since the data is
> striped, how can the parity blocks rebuild anything from a hard drive
> that has crashed? surely, the data from each drive can't be duplicated
> somehow over all the drives though #20.5.2 Redundant Data Storage has
> me scratching my head! if there is complete mirroring, wouldn't the
> disk space be cut in half as with raid1?

Parity is calculated using the following formula:

    parity = data0 XOR data1 XOR data2

where data0..2 are data blocks striped over the disks; thus we need four
disks to hold our data (3 for data, 1 for parity). Now suppose the disk
holding data0 dies.
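A quick numeric sketch of this parity scheme, using single byte values
in place of real disk blocks (real raid5 XORs whole sectors the same
way, byte by byte):

```shell
#!/bin/sh
# Three toy "data blocks" (arbitrary example values):
data0=37; data1=142; data2=250

# parity = data0 XOR data1 XOR data2
parity=$(( data0 ^ data1 ^ data2 ))

# The disk holding data0 dies. XORing the parity with the surviving
# blocks cancels them out and leaves the lost block:
recovered=$(( parity ^ data1 ^ data2 ))
echo "$recovered"
```

This works because XOR is its own inverse: XORing the same value in
twice cancels it, so only the missing block remains.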
To get the data back we simply solve the previous formula for data0:

    data0 = parity XOR data1 XOR data2

and likewise for data1 and data2 (in case one of the other disks dies):

    data1 = parity XOR data0 XOR data2
    data2 = parity XOR data0 XOR data1

This scales easily to larger numbers of disks. Another use of the parity
data is to check data integrity: if for some reason the calculated
parity of a "stripe" no longer matches the on-disk parity data, then
there must be an error.

Note that it is easy to see the similarity between raid0 and raid5:
raid5 is basically raid0 plus extra parity data for redundancy, which
makes it possible to recover from a single disk failure.

-- 
Pieter de Goeje