Date: Thu, 26 Jul 2007 08:47:13 +0100 (BST)
From: "Mark Powell"
To: Doug Rabson
Cc: freebsd-fs@freebsd.org, Pawel Jakub Dawidek
Subject: Re: ZfS & GEOM with many odd drive sizes
Message-ID: <20070726083224.O68220@rust.salford.ac.uk>
In-Reply-To: <1185434973.3698.18.camel@herring.rabson.org>
References: <20070725174715.9F47E5B3B@mail.bitblocks.com>
 <1185389856.3698.11.camel@herring.rabson.org>
 <20070726075607.W68220@rust.salford.ac.uk>
 <1185434973.3698.18.camel@herring.rabson.org>

On Thu, 26 Jul 2007, Doug Rabson wrote:

> When it's reading, RAID-Z only has to read the blocks which contain data
> - the parity block is only read if either the vdev is in degraded mode
> after a drive failure or one (two for RAID-Z2) of the data block reads
> fails.

Yes, but that article does not mention reading parity. What it's saying
is that every block is striped across multiple drives. The checksum for
that block thus applies to data which is on multiple drives. Therefore,
to verify the checksum of a block you have to read all the parts of the
block from every drive except one in the RAID-Z array:

"This makes read performance of a RAID-Z pool be the same as that of a
single disk, even if you only needed a small read from block D."

> For pools which contain a single RAID-Z or RAID-Z2 group, this is
> probably a performance issue. Larger pools containing multiple RAID-Z
> groups can spread the load to improve this.

This isn't something that's immediately obvious, coming from fixed
stripe size RAID-5. Now it seems that the variable stripe size has a
rather serious performance penalty.
It seems that if you have 8 drives, it'd be much more prudent to make
two RAID-Z groups of 3+1 rather than one of 6+2.

Cheers.

-- 
Mark Powell - UNIX System Administrator - The University of Salford
Information Services Division, Clifford Whitworth Building,
Salford University, Manchester, M5 4WT, UK.
Tel: +44 161 295 4837  Fax: +44 161 295 5888  www.pgp.com for PGP key