Date: Sun, 28 Jun 2009 14:02:03 +0300
From: Dan Naumov <dan.naumov@gmail.com>
To: Andrew Snow <andrew@modulus.org>
Cc: freebsd-fs@freebsd.org, freebsd-geom@freebsd.org
Subject: Re: read/write benchmarking: UFS2 vs ZFS vs EXT3 vs ZFS RAIDZ vs Linux MDRAID
Message-ID: <cf9b1ee00906280402g40dcd4b2p81dbf18612495d02@mail.gmail.com>
In-Reply-To: <4A4747A0.6040902@modulus.org>
References: <cf9b1ee00906261636m5d09966ag6d7e1b7557ada709@mail.gmail.com> <4A4725FA.80505@modulus.org> <cf9b1ee00906280330s1f500266xdcbfb1462deda7f8@mail.gmail.com> <4A4747A0.6040902@modulus.org>

"Now we come to the crucial decision ZFS has made for raidz and raidz2: in raidz and raidz2, the data block is striped across all of the disks. Instead of a model where a parity stripe is a bunch of data blocks, each with an independent checksum, ZFS stripes a single data block (and its parity), with a single checksum, across all the disks (or as many of them as necessary). This is a rational implementation decision, but when combined with the need to verify checksums, it has an important consequence: in ZFS, reads always involve all disks, because ZFS always must verify the data block's checksum, which requires reading all of the data block, which is spread across all of the drives. This is unlike normal RAID-5 or RAID-6, in which a small enough read will only touch one drive, and means that adding more disks to a ZFS raidz pool does not increase how many random reads you can do per second. (A normal RAID-5 or RAID-6 array has a (theoretical) random read IO capacity equal to the sum of the random IO operations rate of each of the disks in the array, and so adding another disk adds its IOPs per second to your read capacity. A ZFS raidz or raidz2 pool instead has a capacity equal to the slowest disk's IOPs per second, and adding another disk does nothing to help. Effectively a raidz ZFS gives you a single disk's read IOPs per second rate.)"

This is from the blog of a Sun engineer (in a post from a few years ago). Unfortunately I don't have the link; I actually had to go through my posting history on the Ars Technica forum to even find this quote in the first place.

If the situation has changed and the above quote no longer holds true, it would be nice if someone more knowledgeable about the performance implications could elaborate on what kind of performance is to be expected on a raidz system :)

- Sincerely,
Dan Naumov

On Sun, Jun 28, 2009 at 1:36 PM, Andrew Snow <andrew@modulus.org> wrote:
>> What's confusing is that your results are actually out of place with
>> how ZFS numbers are supposed to look, not mine :) When using ZFS
>> RAIDZ, due to the way parity checking works in ZFS, your pool is
>> SUPPOSED to have throughput of the average single disk from that pool
>> and not some numbers growing skyhigh in a linear fashion.
>
> Could you please elaborate on this and explain it?
>
> - Andrew
>
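
As a rough illustration of the arithmetic in the blog passage quoted earlier in this message, here is a minimal Python sketch of the two theoretical models it describes: a conventional RAID-5/6 array whose random read capacity is the sum of its disks, and a raidz pool whose random read capacity is that of its slowest disk. The function names and the per-disk IOPS figures are made up for illustration; this is a toy model of small random reads only, not a benchmark and not a statement about sequential throughput.

```python
# Toy model of the random-read claim in the quoted blog passage.
# All per-disk IOPS figures are hypothetical, not measurements.

def raid5_random_read_iops(disk_iops):
    """Conventional RAID-5/6 per the quote: a small read touches one disk,
    so the theoretical capacity is the sum of every disk's IOPS."""
    return sum(disk_iops)


def raidz_random_read_iops(disk_iops):
    """raidz/raidz2 per the quote: every read touches all disks,
    so the capacity is bounded by the slowest disk."""
    return min(disk_iops)


if __name__ == "__main__":
    three_disks = [80, 80, 75]        # hypothetical per-disk random read IOPS
    four_disks = three_disks + [80]   # same array with one more disk added

    for disks in (three_disks, four_disks):
        print(
            len(disks), "disks:",
            "RAID-5 ~", raid5_random_read_iops(disks), "IOPS,",
            "raidz ~", raidz_random_read_iops(disks), "IOPS",
        )
```

Under these assumptions, adding the fourth disk raises the RAID-5 figure but leaves the raidz figure unchanged, which is the behaviour the quote predicts.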