Date:      Sun, 21 Mar 1999 08:44:36 +1030
From:      Greg Lehey <grog@lemis.com>
To:        Tom <tom@sdf.com>
Cc:        Nick Hilliard <nick@iol.ie>, freebsd-scsi@FreeBSD.ORG
Subject:   Re: dpt raid-5 performance
Message-ID:  <19990321084436.Z429@lemis.com>
In-Reply-To: <Pine.BSF.4.05.9903201256020.12986-100000@misery.sdf.com>; from Tom on Sat, Mar 20, 1999 at 12:58:40PM -0800
References:  <19990321074613.V429@lemis.com> <Pine.BSF.4.05.9903201256020.12986-100000@misery.sdf.com>

On Saturday, 20 March 1999 at 12:58:40 -0800, Tom wrote:
>
> On Sun, 21 Mar 1999, Greg Lehey wrote:
>
>> On Saturday, 20 March 1999 at  8:55:27 -0800, Tom wrote:
>>>
>>> On Fri, 19 Mar 1999, Nick Hilliard wrote:
>>>
>>>> If I set up a raid-5 system on these disks with a slice-size of 512K, the
>>>> write performance is terrible.  Bonnie reports:
>>>
>>>   512k will not result in very good single process write performance.
>>> In fact, 512k is suboptimal for many operations on RAID-5 arrays.
>>
>> I disagree.  It's pretty much optimal for Vinum.  I'd be interested in
>> your reasoning.
>
>   Not so optimal for a DPT card running RAID-5.

Why not?

> FreeBSD will likely never send IOs big enough for the strip size of
> 512kb to ever be useful.

It doesn't need to: the largest I/O transfer is 64 kB.  The point of a
large stripe isn't to be filled by a single request, it's to keep each
request on a single disk.  You don't want to fragment requests,
because that increases the I/O load on the array.

>>> If you are optimizing for a single process, use something smaller
>>> (like 8k).
>>
>> That's far, far too small.
>
>   Why?

From an earlier message:

> On Thursday, 17 December 1998 at  1:05:29 -0500, Matthew Patton wrote:
>> Here are my suggestions, all are predominantly HW based. In all cases, 1
>> hot spare per channel. Plenty of ambient cooling and power regulation is
>> needed. In addition to the simple 'art' of RAID selection there is also
>> 'slice/interleave' size to factor in. With drives doing full track read
>> aheads with their own fancy algorithms and varying amounts of onboard
>> cache, precise numbers are very difficult to come up with. It depends on
>> what kinds of IO you do. On a database, I would probably use 16 or
>> 32kb.
>
> On FreeBSD, this is too small.
>
>> A media server (large files) 64kb or more. On boxes with lots of
>> small sizes, accessed randomly and rapidly, 8 or 16kb.
>
> Far too small
>
>> But you don't want to fill up the controller's command queue with
>> too many commands.
>
> That's not the big problem.  The fact is that the block I/O system
> issues requests of between .5kB and 60 kB; a typical mix is somewhere
> round 8 kB.  You can't stop any striping system from breaking a
> request into two physical requests, and if you do it wrong it can be
> broken into several.  This will result in a significant drop
> in performance: the decrease in transfer time per disk is offset by
> the order of magnitude greater increase in latency.
>
> With modern disk sizes and the FreeBSD block I/O system, you can
> expect to have a reasonably small number of fragmented requests with a
> stripe size between 256 kB and 512 kB; I can't see any reason not to
> increase the size to 2 or 4 MB on a large disk.
>
> The easiest way to consider the impact of any transfer is the total
> time it takes: since just about everything is cached, the time
> relationship between the request and its completion is not important.
> Consider, then, a typical news article of 24 kB, which will probably
> be read in a single I/O.  Take disks with a transfer rate of 6 MB/s
> and an average positioning time of 8 ms, and a file system with 4 kB
> blocks.  Since it's 24 kB, we don't have to worry about fragments, so
> the file will start on a 4 kB boundary.  The number of transfers
> required depends on where the block starts: it's (S + F - 1) / S,
> where S is the stripe size in file system blocks, and F is the file
> size in file system blocks.
>
> 1: Stripe size of 4 kB.  You'll have 6 transfers.  Total subsystem
>   load: 48 ms latency, 4 ms transfer, 52 ms total.
>
> 2: Stripe size of 8 kB.  On average, you'll have 3.5 transfers.  Total
>   subsystem load: 28 ms latency, 4 ms transfer, 32 ms total.
>
> 3: Stripe size of 16 kB.  On average, you'll have 2.25 transfers.
>   Total subsystem load: 18 ms latency, 4 ms transfer, 22 ms total.
>
> 4: Stripe size of 256 kB.  On average, you'll have 1.08 transfers.
>   Total subsystem load: 8.6 ms latency, 4 ms transfer, 12.6 ms total.
>
> These calculations are borne out in practice.
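For anyone who wants to play with the numbers, the transfer counts above
can be reproduced in a few lines (hypothetical function names; figures
from the example: 4 kB file system blocks, 8 ms average positioning time,
6 MB/s transfer rate):

```python
FS_BLOCK_KB = 4        # file system block size
POSITION_MS = 8.0      # average positioning time per physical transfer
RATE_KB_PER_MS = 6.0   # 6 MB/s is roughly 6 kB/ms

def expected_transfers(stripe_kb, file_kb):
    """Average number of physical transfers: (S + F - 1) / S,
    with S and F in file system blocks, as derived above."""
    s = stripe_kb // FS_BLOCK_KB
    f = file_kb // FS_BLOCK_KB
    return (s + f - 1) / s

def total_time_ms(stripe_kb, file_kb):
    """Total subsystem load: positioning per transfer plus raw transfer time."""
    latency = expected_transfers(stripe_kb, file_kb) * POSITION_MS
    return latency + file_kb / RATE_KB_PER_MS

for stripe in (4, 8, 16, 256):
    print(f"{stripe:3d} kB stripe: "
          f"{expected_transfers(stripe, 24):.2f} transfers, "
          f"{total_time_ms(stripe, 24):.1f} ms")
```

The 24 kB news article spends 4 ms on raw transfer no matter what; the
whole difference between the cases is positioning time.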

I haven't yet replied to Nick's message because I wanted to check
something here first, and I've been too busy so far.  But I'll come
back with some comparisons.

Greg
--
See complete headers for address, home page and phone numbers
finger grog@lemis.com for PGP public key

