From: Pawel Jakub Dawidek
To: Dag-Erling Smørgrav
Cc: Ivan Voras, freebsd-geom@freebsd.org
Date: Sun, 6 May 2007 15:00:40 +0200
Subject: Re: graid5 after-reboot problem
Message-ID: <20070506130040.GB2138@garage.freebsd.pl>
List-Id: GEOM-specific discussions and implementations

On Sun, May 06, 2007 at 11:54:45AM +0200, Dag-Erling Smørgrav wrote:
> Pawel Jakub Dawidek writes:
> > RAID3 is also write-hole safe, btw.
>
> How?  Any write to a RAID3 requires writing the data to one of the
> data disks *and* updating the parity disk.

The "write hole" problem is so important in RAID5 because RAID5 uses the
existing parity block when it updates a data block. There are a few stages
in writing a block in RAID5:

    1. Read the old content of the block you want to write.
    2. Read the corresponding parity block.
    3. XOR the parity with the old content.
    4. XOR the parity with the new content.
    5. Write the new content.
    6. Write the parity.

(The new parity could instead be computed without reading the old parity,
by reading all the other data blocks in the stripe, but that is far too
inefficient, so this shortcut is the most popular.)

When you lose power between 5 and 6, your parity will be corrupted and
will stay corrupted forever, because none of the further writes will
update it correctly (the only exception is a full-stripe write: then
you don't read the old parity, you just calculate it, because you have all
the data blocks needed).

This is very different in RAID3. In RAID3 you always do full-stripe
writes, so it looks like this:

    1. Write data to all data disks and the parity disk at once.

Of course 1 is not atomic, but after a power failure graid3 will
synchronize the parity component. Even if you decide not to do that, the
next write to this block will fix the inconsistency, which is not the case
for RAID5.

RAIDZ also does full-stripe writes, just like RAID3, but it is its COW
model, not the full-stripe writes, that gives always-consistent data.

Also note that using gjournal on top of graid3 will fix the non-atomicity,
but gjournal on top of RAID5 won't fix RAID5 non-atomicity.

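To make the difference between the two write paths concrete, here is a
minimal sketch in Python. It is only a toy model and not how graid3 or any
RAID5 implementation is actually written: blocks are small integers, a
power failure is modelled as "data written, parity not written", and the
names Stripe, rmw_write, full_stripe_write, raid5_demo and raid3_demo are
made up for this illustration.

```python
from functools import reduce
from typing import List


def xor_all(blocks: List[int]) -> int:
    """XOR of all blocks; a block is modelled as a small integer."""
    return reduce(lambda a, b: a ^ b, blocks, 0)


class Stripe:
    """Toy stripe: several data blocks plus one XOR parity block."""

    def __init__(self, data: List[int]) -> None:
        self.data = list(data)
        self.parity = xor_all(self.data)

    def consistent(self) -> bool:
        return self.parity == xor_all(self.data)

    def rmw_write(self, index: int, new: int, crash: bool = False) -> None:
        """RAID5-style read-modify-write of one data block (stages 1-6)."""
        old = self.data[index]      # 1. read old content
        parity = self.parity        # 2. read parity
        parity ^= old               # 3. XOR parity with old content
        parity ^= new               # 4. XOR parity with new content
        self.data[index] = new      # 5. write new content
        if crash:
            return                  # power lost between 5 and 6
        self.parity = parity        # 6. write parity

    def full_stripe_write(self, data: List[int], crash: bool = False) -> None:
        """RAID3-style write: all data blocks and parity are rewritten."""
        self.data = list(data)
        if crash:
            return                  # assumed failure mode: parity not written
        self.parity = xor_all(self.data)


def raid5_demo() -> List[bool]:
    s = Stripe([1, 2, 3])
    results = []
    s.rmw_write(0, 7, crash=True)   # interrupted write
    results.append(s.consistent())
    s.rmw_write(0, 9)               # later write to the same block
    results.append(s.consistent())
    s.rmw_write(1, 5)               # later write to another block
    results.append(s.consistent())
    s.full_stripe_write([4, 5, 6])  # only a full-stripe write repairs it
    results.append(s.consistent())
    return results


def raid3_demo() -> List[bool]:
    s = Stripe([1, 2, 3])
    results = []
    s.full_stripe_write([7, 8, 9], crash=True)  # interrupted write
    results.append(s.consistent())
    s.full_stripe_write([4, 5, 6])              # next write fixes parity
    results.append(s.consistent())
    return results


if __name__ == "__main__":
    print(raid5_demo())  # [False, False, False, True]
    print(raid3_demo())  # [False, True]
```

In this model the stale parity keeps a constant XOR error across every
later read-modify-write, which is why the RAID5 stripe only becomes
consistent again after a full-stripe write (or an explicit
resynchronization), while the RAID3 stripe is repaired by the very next
write.
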
All in all, the write hole is not that dangerous if you remember to
synchronize parity after an unclean shutdown, and this is needed for RAID5,
RAID3, RAID1, RAID4, RAID6, etc. For RAID5 it is just the most visible, and
you can't avoid resynchronization even when you use things like gjournal.

-- 
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd@FreeBSD.org                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!