From: Pieter de Goeje <pieter@degoeje.nl>
To: freebsd-questions@freebsd.org
Cc: prad
Date: Sat, 28 Jun 2008 13:02:20 +0200
Subject: Re: first pre-emptive raid
Message-Id: <200806281302.20814.pieter@degoeje.nl>
In-Reply-To: <20080628005702.2137bb8c@gom.home>

On Saturday 28 June 2008, prad wrote:
> 3. it seems that geom just does striping and mirroring, but vinum
> offers more configurability and is really the preferred choice?

Geom also does raid 3 and disk concatenation (JBOD); see the geom(8)
manpage. I think geom is preferred because it is better tested (in later
versions of FreeBSD) and easier to set up.
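For reference, setting up those two geom classes looks roughly like this
(the device names and volume names below are just examples for your
hardware, not anything specific to your setup):

```shell
# raid 3 wants 2^n + 1 providers (3, 5, 9, ...): here 2 data + 1 parity.
kldload geom_raid3                  # load the class if not in the kernel
graid3 label -v gr3vol da0 da1 da2  # create the array
newfs /dev/raid3/gr3vol             # new device appears under /dev/raid3/

# Plain concatenation (JBOD) of two disks into one big provider:
kldload geom_concat
gconcat label -v jbod da3 da4
newfs /dev/concat/jbod              # device appears under /dev/concat/
```

Both arrays are recreated automatically at boot from the metadata the
label command writes to the last sector of each disk.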
> 4.1 with 4 18G drives one thought is to do a raid1, but we really
> don't want 3 identical copies. is the only way to have 2 36G mirrors,
> by using raid0+1 or raid1+0?

If you want one logical "disk" you could also mirror both pairs and use
gconcat to add their sizes together.

> 4.2 another possibility is to do raid0, but is that ever wise unless
> you desperately need the space since in our situation you run a 1/4
> chance of going down completely?

Indeed, the chances have quadrupled: raid0 has no redundancy, so the
array is lost as soon as any one of the four drives fails, which is
roughly four times as likely as losing a single drive.

> 4.3 is striping or mirroring faster as far as i/o goes (or does the
> difference really matter)? i would have thought the former, but the
> handbook says "Striping requires somewhat more effort to locate the
> data, and it can cause additional I/O load where a transfer is spread
> over multiple disks" #20.3

Both are faster when reading data; raid 0 is also faster when writing.
When data blocks are spread over N disks, it is possible to achieve
sequential read speeds N times faster than a simple JBOD configuration
would give. However, the system also needs N times more bandwidth to the
disks to achieve this. If the disks are on a limited-speed shared bus,
one could imagine that the overhead of the extra I/O commands needed to
do raid0 actually impairs performance.

> 4.4 vinum introduces raid5 with striping and data integrity, but
> exactly what are the parity blocks? furthermore, since the data is
> striped, how can the parity blocks rebuild anything from a hard drive
> that has crashed? surely, the data from each drive can't be duplicated
> somehow over all the drives though #20.5.2 Redundant Data Storage has
> me scratching my head! if there is complete mirroring, wouldn't the
> disk space be cut in half as with raid1?

Parity is calculated using the following formula:

    parity = data0 XOR data1 XOR data2

where data0..2 are data blocks striped over the disks; thus we need four
disks to hold our data (3 for data, 1 for parity). Now suppose the disk
holding data0 dies.
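A quick numeric sketch of this parity scheme, using single byte values
in place of real disk blocks (real raid5 XORs whole sectors the same
way, byte by byte):

```shell
#!/bin/sh
# Three toy "data blocks" (arbitrary example values):
data0=37; data1=142; data2=250

# parity = data0 XOR data1 XOR data2
parity=$(( data0 ^ data1 ^ data2 ))

# The disk holding data0 dies. XORing the parity with the surviving
# blocks cancels them out and leaves the lost block:
recovered=$(( parity ^ data1 ^ data2 ))
echo "$recovered"
```

This works because XOR is its own inverse: XORing the same value in
twice cancels it, so only the missing block remains.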
To get the data back we simply solve the previous formula for data0:

    data0 = parity XOR data1 XOR data2

and likewise for data1 and data2 (in case one of the other disks dies):

    data1 = parity XOR data0 XOR data2
    data2 = parity XOR data0 XOR data1

This scales easily to larger numbers of disks. Another use of the parity
data is to check data integrity: if for some reason the calculated
parity of a "stripe" no longer matches the on-disk parity data, then
there must be an error.

Note that it is easy to see the similarity between raid0 and raid5:
raid5 is basically raid0 plus extra parity data for redundancy, which
makes it possible to recover from a single disk failure.

-- 
Pieter de Goeje