Date:      Tue, 17 Jun 1997 19:52:12 -0700 (PDT)
From:      asami@cs.berkeley.edu (Satoshi Asami)
To:        Shimon@i-connect.net
Cc:        FreeBSD-SCSI@FreeBSD.org
Subject:   Re: RAID Configuration Notes
Message-ID:  <199706180252.TAA08303@silvia.HIP.Berkeley.EDU>
In-Reply-To: <XFMail.970617153608.Shimon@i-Connect.Net> (message from Simon Shapiro on Tue, 17 Jun 1997 15:36:08 -0700 (PDT))

 * I am getting several mail messages in regards to the DPT driver and RAID
 * configuration every day.  Makes me happy and proud :-)  If only I had a
 * penny per request :-))

Glad to hear that. :)

 * Anyway, I thought, at the risk of offending some of you with the trivial,
 * to provide some notes on RAID configuration.  Especially when it relates to
 * DPT controllers.

Great summary.  Let me point out some things that seem to me to
contradict our findings.

First, our system.  I have -stable and -current running on a few
P6-200's at work.  They have IBM 9GB U-W disks, although not all of
them (like those on 14-disk strings) are running in 20MHz mode.

I also have a P6-233 at home, with three drives (Quantum Atlas I & II, 
Micropolis 3243WT).  They are all wide.  All machines have Adaptec
2940UW's and 3940UW's.

 * *  A single SCSI-{II,III} disk can perform about 140 disk I/Os per second.
 *    This statement is true for block_size < 8K.  (Almost) Regardless of
 *    narrow/wide, ultra/fast, etc.  The reason being that, according to SCSI
 *    specifications, all negotiations and handshakes happen in narrow, async
 *    5MHz.  Otherwise slow/old devices will surely hang the bus.

I'm not well versed in the SCSI specs (I'll leave that to Justin or
Stefan), but this is certainly not true.  By doing small reads from a
very small area on the raw disk device (like 1K out of 8K), I can get
220 IO/s from the IBM's at work, 100 from the Microp at home (don't
they have a cache?!?), 1,500 from the A-I at home and 2,400 from the
A-II at home.  These are repeated reads with one process.

No, I'm not reading it from the disk cache; the reads are done from
the raw device, and I see the disk activity light stay on during the
test.  (Besides, I get 62,000 if I read from the block device. :)
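
In case anyone wants to reproduce the numbers, the single-process
test was essentially of the following shape.  (This is only a sketch,
not our actual test program; the device name, sizes and read count
are placeholders to adjust for your setup.)

/*
 * Minimal sketch of the single-disk test: repeated 1K reads from
 * random spots within the first 8K of the raw device.
 */
#include <sys/types.h>
#include <sys/time.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define DEV     "/dev/rsd0"             /* raw device -- adjust */
#define IOSIZE  1024                    /* 1K reads... */
#define REGION  8192                    /* ...within the first 8K */
#define NREADS  10000

int
main(void)
{
        char buf[IOSIZE];
        struct timeval t0, t1;
        double secs;
        off_t off;
        int fd, i;

        if ((fd = open(DEV, O_RDONLY)) < 0) {
                perror(DEV);
                exit(1);
        }
        gettimeofday(&t0, NULL);
        for (i = 0; i < NREADS; i++) {
                /* pick a 1K-aligned offset inside the 8K region */
                off = (random() % (REGION / IOSIZE)) * IOSIZE;
                if (lseek(fd, off, SEEK_SET) == -1 ||
                    read(fd, buf, IOSIZE) != IOSIZE) {
                        perror("read");
                        exit(1);
                }
        }
        gettimeofday(&t1, NULL);
        secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_usec - t0.tv_usec) / 1e6;
        printf("%.0f IO/s\n", NREADS / secs);
        return (0);
}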

Of course, if you meant "I/Os from the disk surface" and not the
cache, the limit is probably a hundred and something, but then the
disk type certainly will make a huge difference (not the interface,
but seek time and rotational speed).  Also, you need to define what
kind of I/O's you are talking about.  A random read from the outer
half of the disk surface will take less time than a random read from
all over the disk, for instance.

 * *  A ribbon-cable SCSI bus (parallel, not FCAL) can support up to 440 or so
 *    Tx/Sec.  Yes, this means that for very high activity, much more than 4
 *    drives per bus is a waste.

This is not true.  As I said above, I can get over 2,400 from a single 
disk on a 20MHz string.  By running many in parallel I could go up to
2,665 with 14 disks (running at 10MHz).  Here is how it grows:

disks:  1    2    3    4    5    6    7    8    9   10   11   12   13   14
IO/s:  214  425  635  849 1066 1278 1489 1706 1910 2126 2319 2488 2591 2665

This is with the 1K/8K size given above.  With a 1K read from all over 
the drive surface, I get a little over 1,800 with 14 disks
(130/disk).  These are with one process per disk.
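
The parallel numbers came from running one such reader per disk.  A
sketch of that driver (again, device names are placeholders; time the
whole run with time(1)):

/*
 * Sketch of the parallel run: fork one child per raw device, each
 * doing its own loop of small reads.
 */
#include <sys/types.h>
#include <sys/wait.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int
main(void)
{
        static const char *devs[] = { "/dev/rsd0", "/dev/rsd1", "/dev/rsd2" };
        int i, ndisks = sizeof(devs) / sizeof(devs[0]);

        for (i = 0; i < ndisks; i++) {
                if (fork() == 0) {      /* child: hammer one disk */
                        char buf[1024];
                        int fd, j;

                        if ((fd = open(devs[i], O_RDONLY)) < 0)
                                _exit(1);
                        for (j = 0; j < 10000; j++) {
                                lseek(fd, (off_t)0, SEEK_SET);
                                if (read(fd, buf, sizeof(buf)) <= 0)
                                        _exit(1);
                        }
                        _exit(0);
                }
        }
        while (wait(NULL) > 0)          /* wait for every child to finish */
                ;
        return (0);
}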

 * *  The declared MTBF for most modern 3.5" disk drives is around 800,000
 *    hours.  If you believe that, I own a bridge in London you should invest
 *    in.

 :)

 * *  RAID-1 gives OK READ performance and write performance similar to a
 *    single disk (DPT only; in-kernel RAID-1 writes at 0.5 the speed of a
 *    single disk).  The drawback is a maximum size of disk * 2 and 50% space
 *    utilization.  Performance in degraded mode (a failed drive) is similar to
 *    a single disk (plus lots of noise (DPT)).

That's true only if you are running RAID-1 with two disks.  The write
performance is typically a little less than that of a RAID-0 spanning
half the drives.  Of course, that depends on the number of disks (the
data has to go over the SCSI bus twice, so if you have enough fast
disks to saturate the bus, it will hit the ceiling sooner).  For
instance, here with two 20MHz strings, I get 29MB/s for 4 disks
striped and 20MB/s for 8 disks striped/mirrored.
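
To make the bus arithmetic explicit (assuming each 20MHz wide string
tops out around 40MB/s): 20MB/s of striped/mirrored writes is really
40MB/s of traffic across the two strings, while 29MB/s of plain
striped writes is only 29MB/s on the wire.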

 * *  RAID-5 has capacity utilization of disk * (no_of_disks - 1).  READ is as
 *    fast as a RAID-0 (in optimal state) and WRITE is slowest of the bunch.
 *    WRITE performance is about 10MB/Sec on the newest firmware (vs 15-20 for 
 *    RAID-0).

That's very good compared to software parity.  (Just another
disincentive for implementing parity in ccd.... ;)
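
(To put a number on the capacity formula above: five 9GB drives in
RAID-5 give (5 - 1) * 9GB = 36GB of usable space out of 45GB raw.)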

 * *  Hot spares apply to the entire controller, not just a particular array.
 *    In the above example, if you defined a RAID-0 array (which includes 0-3,
 *    1-6, and 2-10), you (can define but) do NOT need another hot spare.
 *    2-9 will do for either.

(I'm not sure what a hot spare will do for your RAID-0 array, but
 that's ok. :)

 * *  To have RAID arrays span controllers and/or have the ability to expand
 *    an existing array, you will have to either wait or integrate these
 *    changes yourself (messy).

How about having two controllers on two PCs share the same string?
That will guard against PC and adapter failures.  We are planning to
do this with our system.  The Adaptecs are happy as long as you don't
try to boot both machines at the same time with the boot disks on the
shared string (if you have a system disk on an unshared string and
disable the BIOS, it will be ok).  Do the DPTs allow for the SCSI ID's 
to be changed?

Satoshi


