Date:      Wed, 02 Feb 2000 08:31:21 -0700
From:      "Justin T. Gibbs" <gibbs@FreeBSD.org>
To:        Greg Lehey <grog@lemis.com>
Cc:        "Justin T. Gibbs" <gibbs@FreeBSD.org>, Gary Palmer <gjp@in-addr.com>, scsi@FreeBSD.org, up@3.am, Wilko Bulte <wilko@yedi.iaf.nl>
Subject:   Re: Definitions of RAID levels (was: hardware vs software stripping) 
Message-ID:  <200002021531.IAA00607@caspian.plutotech.com>
In-Reply-To: Your message of "Wed, 02 Feb 2000 12:33:17 +1030." <20000202123317.P55303@freebie.lemis.com>

>>> My understanding is that RAID-3, effectively striping at a sub-sector
>>> level, can give much higher data rates without buffering, and that's
>>> its raison d'être.
>>
>> If you stripe at the sub-sector level, you must perform RMW.  This makes
>> absolutely no sense.
>
>I think you're misunderstanding my use of the term "stripe".  I'm not
>talking about "transactions" here, I'm talking about layout.  If I
>have a 9 disk RAID-[345] set with a stripe size of 64 bytes, I can
>read one sector from each of the 8 data disks and have a total of 8
>sectors.  I can do the same thing if each disk contains an individual
>bit of a byte.  Older disk and drum technology used a very similar
>method (multiple heads) to speed up transfer times.  With relatively
>simple hardware support, this would make a lot of sense, and if RAID-3
>is really what you say, it makes me wonder why people haven't thought
>of this alternative.

Take a look at this diagram:

http://sunsite.berkeley.edu/Dienst/UI/2.0/Page/ncstrl.ucb/CSD-87-391/16

They don't use "minimum transaction size", they use "transfer units".
It's the same thing.  In this example, the "transfer unit" is a sector.
In Pluto's system, the effective sector size is 64K (it's too inefficient
to perform sector I/O) and the "transfer unit" is a block of video frames.
If we are recording uncompressed video, a video frame is ~500K.  This means
you must read more than one of the drives in a stripe in order to get the
entire frame, but it may be possible to not read them all to get just a
single frame.  This is what I meant by our system allowing independent
access, and by my assertion that it didn't buy us anything.  Putting all
of a particular frame's data on a single drive would yield too much
latency for random frame fetches, so we don't use that layout.
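
To make the frame arithmetic concrete, here is a minimal Python sketch.
It assumes frames are stored back to back, that each data drive
contributes one 64K strip in round-robin order, and that the array has
12 data drives.  The drive count and the helper name drives_for_frame
are hypothetical; they are not taken from Pluto's system.

```python
STRIP_SIZE = 64 * 1024    # effective sector size per drive (from the text)
FRAME_SIZE = 500 * 1024   # approximate uncompressed frame size (from the text)
DATA_DRIVES = 12          # assumption: the text does not give a drive count


def drives_for_frame(frame_index, frame_size=FRAME_SIZE,
                     strip_size=STRIP_SIZE, data_drives=DATA_DRIVES):
    """Return the sorted data-drive indices that hold any part of a frame."""
    start = frame_index * frame_size
    end = start + frame_size - 1            # last byte of the frame
    first_strip = start // strip_size
    last_strip = end // strip_size
    # Strips are assumed to be dealt out to the drives round-robin.
    return sorted({s % data_drives for s in range(first_strip, last_strip + 1)})


if __name__ == "__main__":
    for i in range(2):
        drives = drives_for_frame(i)
        print(f"frame {i}: {len(drives)} of {DATA_DRIVES} drives -> {drives}")
```

Under these assumptions a frame always spans 8 or 9 of the 12 drives:
more than one, but not all of them.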

The main point I've been trying to make in all of this is that the data
need not be bit or byte striped.  In the example in the Berkeley paper,
the disk strip size is 1/4 of a sector.  The distinction is all based on
what your "record" size is and whether you can store records without
crossing disk boundaries, so that it makes sense to allow independent
access.
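
The record-versus-strip test can be written down as a rough sketch.
The function names are hypothetical, records are assumed to be stored
aligned to their own size, and the 512-byte sector used in the example
is an assumption (the paper's diagram only says "sector").

```python
def disks_per_record(record_size, strip_size):
    """Disks spanned by a record that starts on a strip boundary."""
    return -(-record_size // strip_size)    # ceiling division


def allows_independent_access(record_size, strip_size):
    """True if size-aligned records never cross a strip (disk) boundary."""
    return record_size <= strip_size and strip_size % record_size == 0


if __name__ == "__main__":
    SECTOR = 512                             # assumed sector size
    # Berkeley-style example: strip is 1/4 of a sector, record is a sector.
    print(disks_per_record(SECTOR, SECTOR // 4),
          allows_independent_access(SECTOR, SECTOR // 4))
    # An 8K file system block stored whole on one disk's 8K strip.
    print(disks_per_record(8192, 8192),
          allows_independent_access(8192, 8192))
```

In the first case every record touches four disks, so independent
access gains nothing; in the second each record lives on one disk.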

>> If your transaction is larger, perhaps you satisfy it by modifying 1 or
>> more full stripes and only partially modifying the border stripes.
>> The point is still the same.
>
>Well, I can't see that.  You're saying that RAID-4 stripes should be a
>multiple of the transaction size, and I'm saying the "transaction"
>size is variable.  The "point" seems to be that this is the main
>difference in your definitions of RAID-3 and RAID-4.

If a user requests to read 64K of a file on an 8K file system, and those
64K happen to be contiguous, can you not see that on a system where each
block is on an independent spindle you are not forced to make more than
8 read transactions, even if your stripe covers more disks than that?
If, on the other hand, you striped each 8K block across all drives, you'd
not have that luxury.  That is the difference.
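
A small sketch of that comparison, counting how many data drives a read
has to involve under each layout.  It assumes a block-aligned,
contiguous request, ignores the parity drive, and uses a hypothetical
16-drive array; the function name is made up for illustration.

```python
def drives_touched(request_size, block_size, data_drives, striped_per_block):
    """Data drives that must take part in a contiguous, aligned read."""
    blocks = -(-request_size // block_size)  # ceiling division
    if striped_per_block:
        # Every block is spread over all drives, so all must be read.
        return data_drives
    # Each block sits whole on one drive; consecutive blocks rotate drives.
    return min(blocks, data_drives)


if __name__ == "__main__":
    KB = 1024
    DATA_DRIVES = 16                         # assumption for illustration
    print("block per spindle:",
          drives_touched(64 * KB, 8 * KB, DATA_DRIVES, False))
    print("block striped over all drives:",
          drives_touched(64 * KB, 8 * KB, DATA_DRIVES, True))
```

With the block-per-spindle layout the remaining drives stay free to
serve other requests; with per-block striping none do.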

>> That is the difference.
>
>From what?  This thread was about the differences between RAID-3 and
>RAID-4.  I don't see anything different in these definitions except
>possibly the software.

I give up then.  I don't know how to make it any clearer.

>> The same way it is defined in the dictionary.
>
>OK, let me get a dictionary:
>
>	       4.  The action of passing or making over a thing from
>                   one person, thing, or state to another.

This is the correct definition.  Add "minimum sized" in front of
transaction and you should get the right idea.  If you don't like
that term, use "unit transfer".

>OK, and you could do this without changing the physical layout?  In
>that case, I'd suggest this is RAID-4, not RAID-3.  Note that the text
>you quote states:

It is only RAID-4 if you can access a single disk and get all of the
required data to do something useful.  This is not the case in our
system.

>   Unlike RAID Level 3, however, a RAID Level 4 array's member disks
>   are independently accessible.
>
>This still suggests to me that there is something about RAID-3 layout,
>not the software implementation, which makes it impossible to access
>drives individually.

I've already covered why this is the case.  There was also the assumption
in the past that the spindles would be synchronized.  This is no longer
the typical case.

Anyway, that's all I have to say about RAID levels.  I'm sorry I ever
brought it up.

--
Justin



