Date:      Tue, 17 Jun 1997 15:36:08 -0700 (PDT)
From:      Simon Shapiro <Shimon@i-Connect.Net>
To:        FreeBSd-SCSI@FreeBSD.org
Subject:   RAID Configuration Notes
Message-ID:  <XFMail.970617153608.Shimon@i-Connect.Net>

Hi Y'all

I am getting several mail messages in regards to the DPT driver and RAID
configuration every day.  Makes me happy and proud :-)  If only I had a
penny per request :-))

Anyway, I thought, at the risk of offending some of you with the trivial,
to provide some notes on RAID configuration.  Especially when it relates to
DPT controllers.

Bandwidth:

*  WRITEs are always slower than READs.  How much slower depends on the
   configuration.  Details to follow.

*  A single SCSI-{II,III} disk can perform about 140 disk I/Os per second.
   This statement is true for block_size < 8K, (almost) regardless of
   narrow/wide, ultra/fast, etc.  The reason is that, according to the SCSI
   specifications, all negotiations and handshakes happen in narrow, async
   5MHz.  Otherwise slow/old devices would surely hang the bus.

*  A ribbon-cable SCSI bus (parallel, not FCAL) can support up to 440 or so
   Tx/Sec.  Yes, this means that for very high activity, much more than 4
   drives per bus is a waste.
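   The arithmetic behind that claim can be sketched in a few lines (the
   140 and 440 figures are the estimates quoted above, not measured
   constants):

   ```python
   # Rough back-of-the-envelope: how many disks saturate one parallel
   # SCSI bus?  Numbers are the estimates from the notes above.
   IOS_PER_DISK = 140   # small-block I/Os per second per SCSI-2/3 disk
   IOS_PER_BUS = 440    # transactions per second one ribbon-cable bus sustains

   drives_to_saturate = IOS_PER_BUS / IOS_PER_DISK
   print(f"about {drives_to_saturate:.1f} busy drives fill one bus")
   # -> about 3.1 busy drives fill one bus
   ```

   So at full load a fourth drive already pushes the bus past saturation,
   which is why spreading drives across busses (see below) pays off.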

*  From my observations, FreeBSD does ALL block device I/O (including all
   filesystem operations) in 4K increments, regardless of your specified
   ``block size''.  Raw (character) device I/O can be no larger than 64K.

*  RAID-0 gives you the best performance.  The risk is that ANY device
   failure will take the entire array with it (data- and functionality-wise).

*  The MTBF of ANY RAID array is the MTBF of the worst device divided by
   the number of devices.

*  The declared MTBF for most modern 3.5" disk drives is around 800,000
   hours.  If you believe that, I own a bridge in London you should invest
   in.
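   Putting the two bullets above together, a minimal sketch (the 800,000
   hours and the 5-disk array are just the example numbers from this
   note):

   ```python
   def array_mtbf(worst_device_mtbf_hours, n_devices):
       """MTBF of a non-redundant array: worst device's MTBF divided
       by the number of devices (the rule stated above)."""
       return worst_device_mtbf_hours / n_devices

   # 5 disks, each claiming 800,000 hours:
   hours = array_mtbf(800_000, 5)
   print(hours)                    # -> 160000.0
   print(f"{hours / 8766:.1f} years")
   ```

   Even taking the vendor number at face value, a 5-disk stripe is down
   to roughly 18 years of expected life; with a realistic device MTBF the
   figure shrinks accordingly.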

*  RAID-1 gives OK READ performance and WRITE performance similar to a
   single disk (DPT only; in-kernel RAID-1 writes at 0.5 the speed of a
   single disk).  The drawback is that you buy disk * 2 for a maximum
   array size of one disk, i.e. 50% space utilization.  Performance in
   degraded mode (a failed drive) is similar to a single disk (plus lots
   of noise (DPT)).

*  RAID-5 has capacity utilization of disk * (no_of_disks - 1).  READ is as
   fast as RAID-0 (in optimal state) and WRITE is the slowest of the bunch.
   WRITE performance is about 10MB/sec on the newest firmware (vs. 15-20 for
   RAID-0).

Performance:

*  With software interrupts disabled, 256 parallel dd processes reading
   and writing, and all error checking enabled, we see (MB/sec):

   RAID  | READ  | WRITE
   ------+-------+------
   none  |  3-4  |  3-4
   0     | 18-20 | 15-18
   1     | 10-15 |  8-12
   5     | 15-20 |  5-10

   RAID-0 is two drives, RAID-5 is 5 drives, across 2 busses (see below).

Configuration:

*  The DPT defaults for cache configuration are fine for general
   filesystem operation and quite useless for database work.  They are
   easily tunable.  They are useless because they allocate 30% of the cache
   to read-ahead buffers and use 128K stripes.

*  You can tune the cache as well as the cache that disk drives have
   on-board.

*  When you configure RAID arrays across busses on a DPT, stripe the array
   across the busses.

   Example:  Three busses; bus 0 has targets 1, 2, & 3, bus 1 has 4, 5, & 6,
   and bus 2 has 8, 9, & 10.  When configuring a 5-wide RAID-5 array, select
   the devices (in dptmgr) in this order:  0-1, 1-4, 2-8, 0-2, 1-5, and hot
   spare on 2-9.  The next array will use 0-3, 1-6, and 2-10.

   This will force the DPT to ``jump'' busses when moving from stripe to
   stripe and results in a HUGE boost in performance.
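   The selection order in the example above is just a round-robin across
   the busses.  A minimal sketch (the bus/target map is the one from the
   example; the function name is mine, not a dptmgr command):

   ```python
   def interleaved_order(busses):
       """Pick one target from each bus in turn, producing the order in
       which drives should be selected in dptmgr so that consecutive
       stripe members land on different busses."""
       order = []
       deepest = max(len(targets) for targets in busses.values())
       for rank in range(deepest):
           for bus in sorted(busses):
               targets = busses[bus]
               if rank < len(targets):
                   order.append((bus, targets[rank]))
       return order

   # The three-bus example from above:
   busses = {0: [1, 2, 3], 1: [4, 5, 6], 2: [8, 9, 10]}
   print(interleaved_order(busses))
   # -> [(0, 1), (1, 4), (2, 8), (0, 2), (1, 5), (2, 9), (0, 3), (1, 6), (2, 10)]
   ```

   The first five pairs are the RAID-5 members, (2, 9) is the hot spare,
   and the remaining three form the next array, exactly as laid out above.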

*  Hot spares apply to the entire controller, not just a particular array.
   In the above example, if you defined a RAID-0 array (which includes 0-3,
   1-6, and 2-10), you (can define but) do NOT need another hot spare.
   2-9 will do for either.

*  To have RAID arrays span controllers and/or have the ability to expand
   an existing array, you will have to either wait or integrate these
   changes yourself (messy).

I hope this provides some background and helps.

Simon


