Date: Sun, 6 May 2007 15:00:40 +0200
From: Pawel Jakub Dawidek <pjd@FreeBSD.org>
To: Dag-Erling Smørgrav <des@des.no>
Cc: Ivan Voras <ivoras@fer.hr>, freebsd-geom@freebsd.org
Subject: Re: graid5 after-reboot problem
Message-ID: <20070506130040.GB2138@garage.freebsd.pl>
In-Reply-To: <867irmqntm.fsf@dwp.des.no>
References: <171980743.20070504223126@uzvik.kiev.ua> <125507.38194.qm@web30304.mail.mud.yahoo.com> <f1i3s4$j4n$1@sea.gmane.org> <86fy6bqocr.fsf@dwp.des.no> <20070505233053.GE16398@garage.freebsd.pl> <867irmqntm.fsf@dwp.des.no>
On Sun, May 06, 2007 at 11:54:45AM +0200, Dag-Erling Smørgrav wrote:
> Pawel Jakub Dawidek <pjd@FreeBSD.org> writes:
> > RAID3 is also write-hole safe, btw.
>
> How?  Any write to a RAID3 requires writing the data to one of the
> data disks *and* updating the parity disk.

The "write hole" problem matters so much in RAID5 because RAID5 has to read the old parity block in order to update a data block. Writing a block in RAID5 takes several steps:

	1. Read old content of the block you want to write.
	2. Read corresponding parity block.
	3. XOR parity with old content.
	4. XOR parity with new content.
	5. Write new content.
	6. Write parity.

(This could be done without reading the old parity, by reading all the other data blocks in the stripe instead, but that is far too inefficient, so this shortcut is the most popular approach.)

When you lose power between 5 and 6, your parity will be corrupted and will stay corrupted forever, because none of the further writes will update it correctly (the only exception is a full stripe write: then you don't read the old parity, you just calculate it, because you have all the data blocks needed).

RAID3 is very different. In RAID3 you always do full stripe writes, so it looks like this:

	1. Write data to all data disks and the parity disk at once.

Of course 1 is not atomic, but after a power failure graid3 will synchronize the parity component, and even if you decide not to do that, the next write to this block will fix the inconsistency, which is not the case for RAID5.

RAIDZ also does full stripe writes, just like RAID3, but it is its COW model, not the full stripe writes, that gives always-consistent data.

Also note that using gjournal on top of graid3 will fix the non-atomicity, but gjournal on top of RAID5 won't fix RAID5's non-atomicity.
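To make the difference between the two write paths concrete, here is a toy model in Python. It is only a sketch under loud assumptions: each block is a single integer, a stripe lives in memory, and the names (Stripe, raid5_write, full_stripe_write, resync, crash_before_parity, crash_after) are made up for illustration. It is not how graid3 or any RAID5 implementation is actually written.

```python
from functools import reduce


def xor_all(blocks):
    """XOR together a list of ints, each standing in for one disk block."""
    return reduce(lambda a, b: a ^ b, blocks, 0)


class Stripe:
    """One stripe: N data blocks plus one parity block (hypothetical toy model)."""

    def __init__(self, data):
        self.data = list(data)
        self.parity = xor_all(self.data)

    def consistent(self):
        """True when the stored parity matches the XOR of the data blocks."""
        return self.parity == xor_all(self.data)

    def raid5_write(self, idx, new, crash_before_parity=False):
        """RAID5 read-modify-write of one block, following steps 1-6."""
        old = self.data[idx]      # 1. read old content of the block
        parity = self.parity      # 2. read corresponding parity block
        parity ^= old             # 3. XOR parity with old content
        parity ^= new             # 4. XOR parity with new content
        self.data[idx] = new      # 5. write new content
        if crash_before_parity:
            return                # power lost between 5 and 6
        self.parity = parity      # 6. write parity

    def full_stripe_write(self, data, crash_after=None):
        """RAID3-style write of every data block plus freshly computed parity.

        crash_after=k stops after k component writes (data disks first,
        parity last) to model the write not being atomic.
        """
        if len(data) != len(self.data):
            raise ValueError("a full stripe write needs every data block")
        writes = list(data) + [xor_all(data)]
        for i, value in enumerate(writes):
            if crash_after is not None and i >= crash_after:
                return
            if i < len(self.data):
                self.data[i] = value
            else:
                self.parity = value

    def resync(self):
        """Recompute parity from the data, as after an unclean shutdown."""
        self.parity = xor_all(self.data)


if __name__ == "__main__":
    s = Stripe([1, 2, 3])
    s.raid5_write(0, 7, crash_before_parity=True)
    print("RAID5 after crash between 5 and 6:", s.consistent())
    s.raid5_write(1, 9)
    print("RAID5 after a later normal write: ", s.consistent())

    t = Stripe([1, 2, 3])
    t.full_stripe_write([4, 5, 6], crash_after=2)
    print("RAID3 after torn stripe write:    ", t.consistent())
    t.full_stripe_write([7, 8, 9])
    print("RAID3 after the next write:       ", t.consistent())
```

In this model the read-modify-write path carries the stale parity forward into every later partial write, while a full stripe write never reads the old parity and so replaces whatever was wrong. Calling resync() repairs either case, which corresponds to synchronizing parity after an unclean shutdown.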
All in all, the write hole is not that dangerous if you remember to synchronize parity after an unclean shutdown, and this is needed for RAID5, RAID3, RAID1, RAID4, RAID6, etc. For RAID5 it is just most visible, and you can't avoid resynchronization even when you use things like gjournal.

-- 
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd@FreeBSD.org                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!