Date:      Mon, 29 Oct 2001 19:16:07 +1030
From:      Greg Lehey <grog@FreeBSD.org>
To:        Bruce Evans <bde@zeta.org.au>
Cc:        Matthew Jacob <mjacob@feral.com>, Doug Rabson <dfr@nlsystems.com>, Luigi Rizzo <rizzo@aciri.org>, John Baldwin <jhb@FreeBSD.org>, Jonathan Lemon <jlemon@FreeBSD.org>, cvs-all@FreeBSD.org, cvs-committers@FreeBSD.org
Subject:   Re: RAID-5 parity calculations (was: cvs commit: src/sys/dev/fxp if_fx)
Message-ID:  <20011029191607.C19178@monorchid.lemis.com>
In-Reply-To: <20011029190516.G9442-100000@delplex.bde.org>; from bde@zeta.org.au on Mon, Oct 29, 2001 at 07:20:38PM +1100
References:  <20011029100728.D88146@monorchid.lemis.com> <20011029190516.G9442-100000@delplex.bde.org>

On Monday, 29 October 2001 at 19:20:38 +1100, Bruce Evans wrote:
> On Mon, 29 Oct 2001, Greg Lehey wrote:
>
>> On Sunday, 28 October 2001 at 22:57:33 +1100, Bruce Evans wrote:
>>> On Sat, 27 Oct 2001, Greg Lehey wrote:
>>>
>>>> On Thursday, 25 October 2001 at 15:24:06 -0700, Matt Jacob wrote:
>>>>>
>>>>> And the fastest software RAID-V I've known was at NASA/Ames on the
>>>>> Convex 3280s- they used the otherwise unused vector units for parity
>>>>> calculations- this gave write performance for a 22 wide stripe on a
>>>>> terabyte filesystem to be at about 88% of theoretical maximum, which
>                                         ^^^^^^^^^^^^^^^^^^^^^^^^^^
>>>>> sure ain't bad.
>>>>
>>>> The parity calculations for RAID-5 are several orders of magnitude
>>>> faster than the disk accesses.  Even on a 486, they took hardly any
>>>> time.
>>>
>>> Actually, a 486 can't possibly have been more than about one order of
>>> magnitude faster than the disk accesses, since main memory was only
>>> that much faster (usually less).  My 486DX2/66 has 15MB/sec main memory
>>> and a 2MB/sec disk.  It would be possible to upgrade the disk (but not
>>> the memory).  Then the disk would want to transfer at about half an
>>> order of magnitude faster than the memory.
>>
>> My claims are based on measurements, not theory.
>
> So are mine.  Since we are talking about the theoretical maximum, only
> certain measurements are relevant.

My point was that any practical results which contradict the
theoretical would cast doubt on the theory:

> Actually, a 486 can't possibly have been more than about one order
> of magnitude faster than the disk accesses,

It's a valid point, however, that if you're really writing
sequentially to a RAID-5 stripe, things change.  In particular, you
would use a different algorithm which would write whole stripes only.
Vinum doesn't do this, because this kind of access remains almost
completely theoretical, and it would harm the normal case of random
accesses.
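
To make the distinction concrete, here is a minimal sketch in Python of
the two ways of arriving at RAID-5 parity.  It is only an illustration,
not Vinum's code, and the function names are invented for the example.
A whole-stripe write can compute parity from the new data alone, while
rewriting a single block needs the old data and the old parity to be
read first.

```python
from typing import Sequence


def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Return the bytewise XOR of two equal-length blocks."""
    if len(a) != len(b):
        raise ValueError("blocks must be the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def full_stripe_parity(data_blocks: Sequence[bytes]) -> bytes:
    """Parity for a whole-stripe write: XOR of all data blocks, no reads needed."""
    parity = bytes(len(data_blocks[0]))  # all-zero block of the right size
    for block in data_blocks:
        parity = xor_blocks(parity, block)
    return parity


def updated_parity(old_parity: bytes, old_data: bytes, new_data: bytes) -> bytes:
    """Parity after rewriting one block: old data and old parity must be read first."""
    return xor_blocks(xor_blocks(old_parity, old_data), new_data)
```

Both routes give the same parity block.  The difference is in the I/O:
the second needs two reads and two writes per block updated, which is
the usual cost with random accesses.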

>> You're forgetting that most of the transfer time is in positioning.
>> That's why (in the original message) I mentioned the transfer size.  A
>> 2 MB/s disk is fast for those days; I've seen more like 800 kB/s. Even
>> accepting your values, the average seek time is 10 ms (check with
>> rawio if you have a different expectation).  Such a disk, doing
>> transfers of 6 kB, will perform about 75 random transfers per second,
>> or about 450 kB/s.  (By comparison, a disk with 800 kB/s transfer rate
>> would perform about 57 transfers).
>
> I didn't forget this.  It's not interesting that the disk can be slowed
> down by a huge factor by writing tinygrams.

It's immensely interesting.  That's the way most things work nowadays.
Look at the output of iostat or a similar tool.  Sure, it's a lot
sexier to talk about sequential transfers, but they're extremely
uncommon.
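
The arithmetic behind the figures quoted above can be checked with a
short Python sketch.  It assumes a deliberately simple model: each
random transfer costs one average positioning delay (the 10 ms seek
time, with rotational latency ignored) plus the time to move the data
at the sequential rate.  The function name and parameters are invented
for this example.

```python
def random_transfer_rate(seek_ms: float, transfer_kb: float, media_kb_per_s: float):
    """Return (transfers per second, throughput in kB/s) for random transfers.

    Each transfer is modelled as one average positioning delay plus the
    time to move the data at the sequential media rate.
    """
    time_per_transfer_s = seek_ms / 1000.0 + transfer_kb / media_kb_per_s
    transfers_per_s = 1.0 / time_per_transfer_s
    return transfers_per_s, transfers_per_s * transfer_kb
```

With 10 ms, 6 kB and 2000 kB/s this gives roughly 77 transfers per
second and about 460 kB/s, in line with the "about 75" and "about 450
kB/s" above.  At 800 kB/s it gives about 57 transfers per second.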

> I also intentionally didn't mention that main memory speed might not
> be a factor because the i/o is already pessimized by using PIO.
> (The extra main memory accesses for parity computations may reduce
> the main memory accesses for PIO.)

Yes, that's a valid point, and I was also ignoring PIO.  My tests were
done with SCSI (AHA 1542B in the case of the 486).

Greg
--
See complete headers for address and phone numbers

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe cvs-all" in the body of the message



