Date: Sun, 13 Jul 2003 14:14:53 -0500
From: Dan Nelson <dnelson@allantgroup.com>
To: Andrea Venturoli <ml.ventu@flashnet.it>
Cc: freebsd-questions@freebsd.org
Subject: Re: vinum and hot-swapping
Message-ID: <20030713191453.GF23909@dan.emsphone.com>
In-Reply-To: <200307130245.h6D2j8HB000556@soth.ventu>
References: <200307130245.h6D2j8HB000556@soth.ventu>

In the last episode (Jul 13), Andrea Venturoli said:
> ** Reply to note from "Greg 'groggy' Lehey" <grog@freebsd.org> Sat, 12 Jul 2003 17:13:29 +0930
> > The real performance penalty for RAID-5 is simply that writes require
> > so much I/O.  Expect 25% of the write performance of RAID-0.
>
> Ok, I must ask this: Shouldn't SCSI system allow parallel writes on
> different disks? If so, why so much penalty?

Parallel I/Os are already being used.  A short write on a RAID-5 array
requires you to

 1) Read the original block and the parity block (done in parallel)
 2) XOR the parity block with the original block and the new block
 3) Write the new block and the parity block (done in parallel)

Which means that you're doing 4 times the I/O that a plain RAID-0 write
would do (two reads plus two writes instead of a single write).  There's
no getting around this problem for small random writes.  Repeated writes
to the same locations only cost two writes, since the original and
parity blocks are probably still in cache.

There is a threshold point where this stops being an issue, however.
When your write size becomes larger than the RAID-5 stripe width (stripe
size * number of data disks), you can simply calculate the parity block
directly from the new data and not have to read anything.  At this
point, RAID-5 magically becomes as efficient as RAID-0 :)

I don't believe vinum can optimize full-stripe writes, though, since
FreeBSD can only do I/O in 64k max chunks, and since vinum is software
instead of battery-backed hardware RAID, it cannot hold off on multiple
writes until the stripe fills up.  Most hardware RAIDs do parity-block
caching and long write optimizations.

-- 
	Dan Nelson
	dnelson@allantgroup.com
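
For anyone who wants to see the arithmetic, here is a rough sketch of the two write paths described above: the short read-modify-write update and the full-stripe write. It is not vinum code; the function names, the block sizes, and the I/O counts it returns are all made up for illustration, and it ignores caching entirely.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def small_write(old_data: bytes, old_parity: bytes, new_data: bytes):
    """Short write to one block: returns (new_parity, io_count).

    Assumes nothing is cached, so the old data and old parity must be
    read first (2 reads), then new data and new parity written (2 writes).
    """
    new_parity = xor_blocks(old_parity, old_data, new_data)
    return new_parity, 4


def full_stripe_write(new_blocks):
    """Write covering a whole stripe: returns (parity, io_count).

    Parity is computed from the new data alone, so no reads are needed:
    one write per data disk plus one write for the parity block.
    """
    parity = xor_blocks(*new_blocks)
    return parity, len(new_blocks) + 1


# Hypothetical stripe: 3 data disks, 4-byte blocks.
stripe = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40", b"\xaa\xbb\xcc\xdd"]
parity, _ = full_stripe_write(stripe)

# Overwrite the middle block with a short write and check that the
# incrementally updated parity matches a parity recomputed from scratch.
new_block = b"\xff\x00\xff\x00"
updated_parity, ios = small_write(stripe[1], parity, new_block)
stripe[1] = new_block
assert updated_parity == xor_blocks(*stripe)
```

The short write moves one block of user data for 4 I/Os, while the full-stripe write moves 3 blocks of user data for 4 I/Os with no reads at all, which is why the penalty fades once writes cover a whole stripe.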