Date:      Mon, 14 Jul 2003 10:02:58 +0930
From:      Greg 'groggy' Lehey <grog@FreeBSD.org>
To:        Dan Nelson <dnelson@allantgroup.com>
Cc:        freebsd-questions@freebsd.org
Subject:   Re: vinum and hot-swapping
Message-ID:  <20030714003258.GP94666@wantadilla.lemis.com>
In-Reply-To: <20030713191453.GF23909@dan.emsphone.com>
References:  <200307130245.h6D2j8HB000556@soth.ventu> <20030713191453.GF23909@dan.emsphone.com>


On Sunday, 13 July 2003 at 14:14:53 -0500, Dan Nelson wrote:
> In the last episode (Jul 13), Andrea Venturoli said:
>> ** Reply to note from "Greg 'groggy' Lehey" <grog@freebsd.org> Sat, 12 Jul 2003 17:13:29 +0930
>>> The real performance penalty for RAID-5 is simply that writes require
>>> so much I/O.  Expect 25% of the write performance of RAID-0.
>>
>> Ok, I must ask this: shouldn't a SCSI system allow parallel writes
>> on different disks?  If so, why is the penalty so large?
>
> Parallel I/Os are already being used.  A short write on a RAID-5 array
> requires you to
>
> 1) Read the original block and the parity block (done in parallel)
>
> 2) XOR the parity block with the original block and the new block

(which takes no time at all).

> 3) Write the new block and the parity block (done in parallel)
>
> Which means that you're doing 4 times the I/O that a plain RAID-5 read
> would do.

I think the confusion is that people think that, because the I/O
transfers in (1) and (3) each happen in parallel, this takes only
twice the time, not four times.  That's true from a latency point of
view, but not from a throughput point of view: the array still
performs four I/O operations for one logical write.
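The parity arithmetic in step (2) can be sketched as follows.  This is
an illustrative fragment, not vinum's actual code; the function name
and interface are invented for the example.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * RAID-5 small-write parity update (the read-modify-write cycle
 * described above).  Given the old contents of the data block, its
 * replacement, and the old parity block, the new parity is
 *
 *	new_parity = old_parity ^ old_data ^ new_data
 *
 * so only the affected data block and the parity block need to be
 * read and rewritten, regardless of how many disks are in the array.
 */
static void
raid5_update_parity(const uint8_t *old_data, const uint8_t *new_data,
    uint8_t *parity, size_t len)
{
	for (size_t i = 0; i < len; i++)
		parity[i] ^= old_data[i] ^ new_data[i];
}
```

The XOR itself is trivial next to the four disk transfers, which is
why step (2) "takes no time at all".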

> There's no getting around this problem for small random writes.
> Repeated writes to the same locations only cost two writes, since
> the original and parity blocks are probably still in cache.

Vinum currently doesn't cache the blocks.  That's an issue I'm
thinking about.

> There is a threshold point where this stops being an issue, however.
> When your write size becomes larger than the raid-5 stripe width
> (stripe size * number of data disks), you can simply calculate the
> parity block directly and not have to read anything.  At this point,
> raid-5 magically becomes as efficient as raid-0 :)
>
> I don't believe vinum can optimize full-stripe writes, though, since
> FreeBSD can only do I/O in 64k max chunks,

128 kB.
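The full-stripe case can be sketched the same way: when the write
covers every data block in the stripe, the parity is just the XOR of
the new blocks, and nothing needs to be read back.  Again an
illustrative fragment, not vinum code:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Full-stripe write: the parity block is computed directly as the
 * XOR of all the new data blocks, so the read half of the
 * read-modify-write cycle disappears entirely.
 */
static void
raid5_full_stripe_parity(const uint8_t *const data[], int ndisks,
    uint8_t *parity, size_t len)
{
	for (size_t i = 0; i < len; i++) {
		uint8_t p = 0;

		for (int d = 0; d < ndisks; d++)
			p ^= data[d][i];
		parity[i] = p;
	}
}
```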

> and since vinum is software instead of battery-backed hardware RAID,
> it cannot hold off on multiple writes until the stripe fills up.

It could, but it would be dangerous.  I've been thinking of offering
it as an option (how often do systems really go down?).
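Very roughly, such a write-behind option might buffer block writes in
memory until the stripe is complete, and only then compute parity and
flush.  This is purely a sketch of the idea under discussion (the
names, sizes, and structure are all invented); the danger is that the
buffered blocks live only in RAM and are lost if the system goes down
before the stripe is written out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define	NDISKS	3	/* data disks per stripe (illustrative) */
#define	BLKSIZE	16	/* bytes per block (illustrative) */

struct stripe_buf {
	uint8_t	data[NDISKS][BLKSIZE];
	bool	present[NDISKS];
	int	nfilled;
};

/*
 * Buffer one block write.  Returns true once the stripe is complete;
 * only then is parity computed (XOR of all data blocks, no reads),
 * and the caller would flush data and parity to disk.  Until then
 * the data exists only in memory, hence the danger on a crash.
 */
static bool
stripe_buffer_write(struct stripe_buf *sb, int disk,
    const uint8_t blk[BLKSIZE], uint8_t parity_out[BLKSIZE])
{
	if (!sb->present[disk]) {
		sb->present[disk] = true;
		sb->nfilled++;
	}
	memcpy(sb->data[disk], blk, BLKSIZE);
	if (sb->nfilled < NDISKS)
		return (false);		/* keep buffering */
	for (size_t i = 0; i < BLKSIZE; i++) {
		uint8_t p = 0;

		for (int d = 0; d < NDISKS; d++)
			p ^= sb->data[d][i];
		parity_out[i] = p;
	}
	return (true);
}
```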

Greg
--
When replying to this message, please copy the original recipients.
If you don't, I may ignore the reply or reply to the original recipients.
For more information, see http://www.lemis.com/questions.html
See complete headers for address and phone numbers



