Date:      Mon, 1 Nov 2004 11:48:06 +0100
From:      Brad Knowles <brad@stop.mail-abuse.org>
To:        "Alastair D'Silva" <freebsd@newmillennium.net.au>
Cc:        current@freebsd.org
Subject:   Re: Gvinum RAID5 performance
Message-ID:  <p06002006bdabc1160a6a@[10.0.1.3]>
In-Reply-To: <1099286568.4185c82881654@picard.newmillennium.net.au>
References:  <002401c4bf9c$c4fee8e0$0201000a@riker> <p06002002bdab24905ad8@[10.0.1.3]> <1099286568.4185c82881654@picard.newmillennium.net.au>

At 4:22 PM +1100 2004-11-01, Alastair D'Silva wrote:

>                                                                 The offshoot
>  of this is that to ensure data integrity, a background process is run
>  periodically to verify the parity.

	That's not the way that RAID-5 is supposed to work, at least not 
the way I understand it.  I would be very unhappy if I was using a 
disk storage subsystem that was configured for RAID-5 and then found 
out it was working in this manner.  At the very least, I don't 
believe that we could/should do this by default, and adding code to 
perform in this manner seems to me to be unnecessary complexity.

	Keep in mind that if you've got a five disk RAID-5 array, then 
for any given block, four of those disks are data and would have to 
be accessed on every read operation anyway, and only one disk would 
be parity.  The more disks you have in your RAID array, the lower the 
parity to data ratio, and the less benefit you would get from 
checking parity in background.
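
	As a rough illustration of that ratio (a sketch only, assuming 
single-parity RAID-5 with one parity block per stripe; the function 
name is made up for this example and is not part of gvinum):

```python
def parity_to_data_ratio(total_disks: int) -> float:
    """Parity blocks per data block in one stripe of single-parity RAID-5."""
    if total_disks < 3:
        raise ValueError("RAID-5 needs at least three disks")
    data_disks = total_disks - 1  # one block per stripe holds parity
    return 1 / data_disks


# The ratio shrinks as the array gets wider.
for n in (3, 5, 8, 12):
    print(f"{n} disks: {n - 1} data + 1 parity, ratio {parity_to_data_ratio(n):.3f}")
```

	For the five disk array this gives 1 parity block to 4 data 
blocks, and the ratio only shrinks from there as disks are added.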

>  Alternatively, simply buffering the (whole) stripe in memory may be
>  enough, as subsequent reads from the same stripe will be fed from
>  memory, rather than resulting in another disk I/O (why didn't the
>  on-disk cache feed this request?

	Most disks do now have track caches, and they do read and write 
entire tracks at once.  However, given the multitudes of 
permutations that go on with data addressing (including bad sector 
mapping, etc...), what the disk thinks of as a "track" may have 
absolutely no relationship whatsoever to what the OS or driver sees 
as related or contiguous data.

	Therefore, the track cache may not contribute in any meaningful 
way to what the RAID-5 implementation needs in terms of a stripe 
cache.  Moreover, the RAID-5 implementation already knows that it 
needs to do a full read/write of the entire stripe every time it 
reads or writes data in that stripe, and this could easily 
interfere destructively with the on-disk track cache.


	Even if you could get away from reading from all disks in the 
stripe (and delaying the parity calculations to a background 
process), you're not going to get away from writing to all disks in 
the stripe, because those parity bits have to be written at the same 
time as the data and you cannot afford a lazy evaluation here.
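
	To make the point about stale parity concrete, here is a minimal 
sketch, assuming the usual byte-wise XOR parity.  The helper name and 
the four-byte "blocks" are invented for the example and have nothing 
to do with the actual gvinum code:

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks (hypothetical helper)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Four data blocks and their parity, as in a five disk array.
data = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]
parity = xor_blocks(data)

# Lose the third disk: the survivors plus parity give the block back.
rebuilt = xor_blocks([data[0], data[1], data[3], parity])
assert rebuilt == b"CCCC"

# Overwrite the first block but leave parity for "later".
data[0] = b"ZZZZ"
stale = xor_blocks([data[0], data[1], data[3], parity])
assert stale != b"CCCC"  # the rebuild is now silently wrong
```

	Until the parity is rewritten, a disk failure in that stripe 
reconstructs garbage, which is why the parity update cannot be lazy.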

>  I did notice that the read from a single drive resulted in that
>  drive's access light being locked on solid, while reading from the
>  plex caused all drives to flicker rather than being solid).

	I'd be willing to guess that this is because of the way data is 
distributed across the disks and the parity calculations that are 
going on as the data is being accessed.  Fundamentally, RAID-5 is not 
going to be as fast as directly reading the underlying disk.

>  I think both approaches have the ability to increase overall reliability
>  as well as improve performance since the drives will not be worked as hard.

	A "lazy read parity" RAID-5 implementation might have slightly 
increased performance over normal RAID-5 on the same 
controller/subsystem configuration, but the added complexity seems to 
be counter-productive.  If there is a hardware vendor that does this, 
I'd be willing to bet that it's a lame marketing gimmick and little 
more.

-- 
Brad Knowles, <brad@stop.mail-abuse.org>

"Those who would give up essential Liberty, to purchase a little
temporary Safety, deserve neither Liberty nor Safety."

     -- Benjamin Franklin (1706-1790), reply of the Pennsylvania
     Assembly to the Governor, November 11, 1755

   SAGE member since 1995.  See <http://www.sage.org/> for more info.


