Date:      Sat, 5 May 2001 12:02:44 +0930
From:      Greg Lehey <grog@lemis.com>
To:        "Brad L. Chisholm" <blc@bsdwins.com>
Cc:        freebsd-hackers@FreeBSD.org
Subject:   Re: Optimal setup for large raid?
Message-ID:  <20010505120243.F67787@wantadilla.lemis.com>
In-Reply-To: <20010504125858.C18876@bsdone.bsdwins.com>; from blc@bsdwins.com on Fri, May 04, 2001 at 12:58:58PM -0400
References:  <20010504125858.C18876@bsdone.bsdwins.com>

On Friday,  4 May 2001 at 12:58:58 -0400, Brad L. Chisholm wrote:
> I sent this to -questions a few days ago, but never received
> any response, so I thought I'd try here.  My apologies if you've
> seen this more than once.
>
> I'm also interested in what might be appropriate filesystem
> settings (newfs) for a large volume like this which will contain
> relatively few, large files.
>
> -------------
>
> We are planning to create a large software raid volume,
> and I am interested in input about what might make the
> best configuration.
>
> We have 52 identical 9GB drives (Seagate ST19171W) spread
> across 4 SCSI controllers (Adaptec AHA 2944UW), with 13
> drives per controller.  We want fault-tolerance, but cannot
> afford to "waste" 50% of our space for a mirrored (raid1)
> configuration.  Thus, we are considering some sort of
> raid5 setup using vinum (possibly in combination with ccd).

I can't see any advantage in using ccd here.  It doesn't do RAID-5
itself, and Vinum does everything that ccd does.

> We are running FreeBSD 4.3-RELEASE, on a 550MHz P3 with
> 384MB of memory.
>
> Possible configurations:
>
>    Configuration #1: A single raid5 vinum volume consisting of all
>     52 drives.
>
>    Questions:
>
>       A) Is there a performance penalty for this many drives in a
>       raid5 array?

Not in normal operation.  In degraded operation, where one drive is
down, you have to read from *all* the remaining drives to reconstruct
a block on the dead drive.  On the other hand, the bigger the array,
the smaller the proportion of accesses that hit the dead drive (here
roughly one access in 52), so maybe it wouldn't be such a hit after
all.  In addition, you'd have a greater chance of a drive failing, and
also a greater chance of two drives failing at the same time (which is
unrecoverable).
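
If you do go this way, the config file would look something like the
sketch below.  The device names and partition letters are placeholders
(use whatever partitions you have set aside for Vinum), and the 479k
stripe size is just an example; see the stripe size question further
down.

    # one "drive" entry per disk, d01 through d52
    drive d01 device /dev/da0s1e
    drive d02 device /dev/da1s1e
    # ... and so on, through ...
    drive d52 device /dev/da51s1e

    volume backup
      plex org raid5 479k
        # one subdisk per drive; a length of 0 means "use all the
        # available space on the drive"
        sd length 0 drive d01
        sd length 0 drive d02
        # ... and so on, through ...
        sd length 0 drive d52

You'd feed that to 'vinum create'.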

>       B) Should the plex be configured with sequential drives on
>          different controllers?  (i.e. if drives 1-13 are on controller 1,
>          14-26 on controller 2, 27-39 on controller 3, and 40-52 on
>          controller 4, should the drive ordering be:
>
>              1,14,27,40,2,15,28,41,...
>          or  1,2,3,4,5,6,7,8,...

Good question.  I suppose the hopping would give marginally better
performance for single accesses, and since your files are big, this
might be a better way to go.
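
In terms of the sketch above, that just means listing the subdisks in
controller-hopping order instead of straight down each controller,
e.g. (still with the made-up drive names, assuming d01-d13 are on
controller 1, d14-d26 on controller 2, and so on):

        sd length 0 drive d01
        sd length 0 drive d14
        sd length 0 drive d27
        sd length 0 drive d40
        sd length 0 drive d02
        sd length 0 drive d15
        # ... and so on, round robin across the four controllers ...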

>
>     Configuration #2: Multiple raid5 vinum volumes (perhaps 1 per controller),
>                       combined into a single volume by striping the raid5
>                       volumes.  (Basically a "raid50" setup.)
>
>     Questions:
>
>        A) Is this possible with vinum?

No.

>           From the documentation, it didn't appear to be, so we were
>           considering using 'ccd' to stripe the raid5 volumes
>           together.

Ah, that was the reason.  I still don't think this is a good idea.

>        B) Would this perform better, worse, or about the same as #1?

Under normal circumstances there shouldn't be any difference.

> Any other configurations that might prove superior?

If you really need a single volume of that size, you're probably
better off with scenario #1.

> The final volume will be used as an online backup area, and will
> contain relatively few, large tar files.  Write performance will
> likely be more important than read, although I realize using raid5
> will impact write performance.

To put it more clearly: write performance on RAID-5 is terrible, about
25% of read performance.  The reason is that every write smaller than
a full stripe turns into four transfers: read the old data and the old
parity, then write the new data and the new parity.

> Any suggestions on what might be the best stripe size to use?

Something between 256 and 512 kB, but not a power of 2: with a
power-of-2 stripe size, frequently accessed file system structures
such as superblocks and inodes tend to end up on the same drive.
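
In the sketch further up, that's the number after 'raid5' on the plex
line, e.g. (479 kB is just an arbitrary value in that range, not a
magic number):

      plex org raid5 479k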

Greg
--
Finger grog@lemis.com for PGP public key
See complete headers for address and phone numbers




