From: Scott Bennett <bennett@sdf.org>
Message-Id: <201407290827.s6T8RCrl014461@sdf.org>
Date: Tue, 29 Jul 2014 03:27:12 -0500
To: freebsd-questions@freebsd.org
Subject: gvinum raid5 vs. ZFS raidz

     I want to set up a couple of software-based RAID devices across
identically sized partitions on several disks.  At first I thought that
gvinum's raid5 would be the way to go, but now that I have finally
found and read some information about raidz, I am unsure which to
choose.  My current, and possibly wrong, understanding of the two
methods' most important features (to me, at least) can be summarized as
follows (example commands illustrating the operations mentioned appear
after the comparison).

   raid5:
      Has parity checking, but any parity errors identified are assumed
      to be errors in the parity blocks themselves, so if errors occur
      in data blocks, the errors can be detected, but not unless and
      until a "checkparity" operation is done.  Errors in parity blocks
      can be fixed by a "rebuildparity" operation, but a
      "rebuildparity" will merely cement errors in data blocks by
      creating parity blocks to match/confirm the erroneous data.
      (This also appears to be the case for graid3 devices.)
   raidz:
      Has parity checking *and* frequently spaced checksums that are
      checked when data blocks are read, so errors are detected
      relatively quickly.  Checksums enable identification of where the
      error exists and automatic repair of erroneous bytes in either
      data blocks or parity blocks.

   raid5:
      Can be expanded by the addition of more spindles via a "gvinum
      grow" operation.
   raidz:
      Can only be expanded by replacing all components with larger
      components.  The number of component devices cannot be changed,
      so the percentage of space tied up in parity cannot be changed.

   raid5:
      Does not support migration to any other RAID levels or their
      equivalents.  (N.B. The exception to this limitation seems to be
      to create a mirror of a raid5 device, effectively migrating to a
      RAID5+1 configuration.)
   raidz:
      Does not support migration between raidz levels, even by adding
      drives to support the increased space required for parity blocks.

   raid5:
      Does not support additional parity dimensions a la RAID6.
   raidz:
      Supports one (raidz2) or two (raidz3) additional parity
      dimensions if one or two extra components are designated for that
      purpose when the raidz device is created.

   raid5:
      Fast performance because each block is on a separate spindle from
      the previous and next blocks.
   raidz:
      Slower performance because each block is spread across all
      spindles a la RAID3, so many simultaneous I/O operations are
      required for each block.
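     (For reference, the operations compared above correspond roughly
to the following commands.  This is only a sketch; "vol0" is a
placeholder gvinum volume name, with its raid5 plex "vol0.p0", and
"tank" is a placeholder ZFS pool name.)

	# gvinum raid5: scan the plex for parity mismatches, then
	# rewrite the parity blocks; rebuildparity trusts the data
	# blocks, so erroneous data is "confirmed" rather than repaired
	gvinum checkparity vol0.p0
	gvinum rebuildparity vol0.p0

	# ZFS raidz: a scrub reads every allocated block, verifies its
	# checksum, and repairs damaged blocks from parity automatically
	zpool scrub tank
	zpool status -v tank	# per-device read/write/checksum counters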
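     (Likewise, the creation and expansion paths being compared would
look something like the following sketch.  The disk names ada1 through
ada4 are hypothetical, and the gvinum syntax is from gvinum(8) as I
understand it.)

	# gvinum: create a three-disk raid5 volume, then grow its plex
	# later by adding a subdisk on a fourth disk
	gvinum raid5 -n vol0 /dev/ada1 /dev/ada2 /dev/ada3
	gvinum grow vol0.p0 /dev/ada4

	# ZFS: create a single-parity raidz pool; raidz2 or raidz3
	# would reserve two or three disks' worth of parity instead
	zpool create tank raidz ada1 ada2 ada3

	# a raidz vdev cannot take additional disks; it grows only by
	# replacing every member with a larger disk and autoexpanding
	zpool set autoexpand=on tank
	zpool replace tank ada1 ada4	# repeat for each member in turn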
-----------------------

     I hoped to start with a minimal number of components and
eventually add more components to increase the space available in the
raid5 or raidz devices.  Increasing their sizes that way would also
increase the total percentage of space in the devices devoted to data
rather than parity, as well as improving the performance benefit of the
striping.  For various reasons, having to replace all component
spindles with larger-capacity components is not a viable method of
increasing the size of the raid5 or raidz devices in my case.  That
would appear to rule out raidz.
     OTOH, the very large-capacity drives available in the last two or
three years appear not to be very reliable(*) compared to older drives
of 1 TB or smaller capacities.  gvinum's raid5 appears not to offer
good protection against, nor any repair of, damaged data blocks.
     Ideally, one ought to be able to create a minimal device with the
space equivalent of one component devoted to parity, whose space and/or
parity dimensions could be increased later by the addition of more
spindles.  Given that there appears to be no software available under
FreeBSD to support that ideal, I am currently stumped as to which way
to go.  I would appreciate anyone with actual experience with gvinum's
raid5 or ZFS raidz (preferably with both) correcting any errors in my
understanding as described above and also offering suggestions as to
how best to resolve my quandary.
     Thanks to three failed external drives and apparently not fully
reliable replacements, compounded by a bad ports update two or three
months ago, I have no functioning X11 and no space set up any longer in
which to build ports to fix the X11 problem, so I really want to get
the disk situation settled ASAP.  Trying to keep track of everything
using only syscons and window(1) is wearing my patience awfully thin.

(*) [Last year I got two defective 3 TB drives in a row from Seagate.
I ended up settling for a 2 TB Seagate that is still running fine,
AFAIK.  While that process was going on, I bought three 2 TB Seagate
drives in external cases with USB 3.0 interfaces, two of which failed
outright after about 12 months and have been replaced with two
refurbished drives under warranty.  While waiting for those
replacements to arrive, I bought a 2 TB Samsung drive in an external
case with a USB 3.0 interface.  I discovered by chance that copying
very large files to these drives is an error-prone process.
     A roughly 1.1 TB file on the one surviving external Seagate drive
from last year's purchase of three, when copied to the Samsung drive,
showed no I/O errors during the copy operation.  However, a comparison
check using "cmp -l -z originalfile copyoforiginal" showed quite a few
places where the contents did not match.  The same procedure applied to
one of the refurbished Seagates gave similar results, although the
locations and numbers of differing bytes were different from those on
the Samsung drive.  The same procedure applied to the other refurbished
drive resulted in a good copy the first time, but a later repetition
ended up with a copied file that differed from the original by a single
bit in each of two widely separated places in the files.
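     (The comparison can also be done with whole-file checksums, which
catch the same corruption; the file names here are placeholders.)

	# byte-by-byte comparison, listing each differing byte by offset
	cmp -l originalfile copyoforiginal

	# or hash both copies; a single flipped bit changes the digest
	sha256 originalfile copyoforiginal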
These problems have raised the priority of a self-healing RAID device
in my mind.
     I have to say that these are new experiences to me.  The disk
drives, controllers, etc. that I grew up with all had parity checking
in the hardware, including the data encoded on the disks, so single-bit
errors anywhere in the process showed up as hardware I/O errors
instantly.  If the errors were not eliminated during a limited number
of retries, they ended up as permanent I/O errors that a human would
have to resolve at some point.
     FWIW, I also discovered that I cannot run two such multi-hour-long
copy operations in parallel using two separate pairs of drives.
Running them together seems to go okay for a while, but eventually
always results in a panic.  This is on 9.2-STABLE (r264339).  I know
that that is not up to date, but I can't do anything about that until
my disk hardware situation is settled.]
     Thanks in advance for any help, information, advice, etc.  I'm
running out of hair to tear out at this point.  :-(

                                  Scott Bennett, Comm. ASMELG, CFIAG
**********************************************************************
* Internet:   bennett at sdf.org   *xor*   bennett at freeshell.org   *
*--------------------------------------------------------------------*
* "A well regulated and disciplined militia, is at all times a good  *
* objection to the introduction of that bane of all free governments *
* -- a standing army."                                               *
*    -- Gov. John Hancock, New York Journal, 28 January 1790         *
**********************************************************************