From owner-freebsd-doc Wed Sep 18 18:58:59 1996
Return-Path: owner-doc
Received: (from root@localhost) by freefall.freebsd.org (8.7.5/8.7.3) id SAA24129 for doc-outgoing; Wed, 18 Sep 1996 18:58:59 -0700 (PDT)
Received: from brasil.moneng.mei.com (brasil.moneng.mei.com [151.186.109.160]) by freefall.freebsd.org (8.7.5/8.7.3) with ESMTP id SAA24098; Wed, 18 Sep 1996 18:58:56 -0700 (PDT)
Received: (from jgreco@localhost) by brasil.moneng.mei.com (8.7.Beta.1/8.7.Beta.1) id UAA10222; Wed, 18 Sep 1996 20:57:52 -0500
From: Joe Greco
Message-Id: <199609190157.UAA10222@brasil.moneng.mei.com>
Subject: Re: News server...
To: froden@bigblue.no
Date: Wed, 18 Sep 1996 20:57:52 -0500 (CDT)
Cc: isp@freebsd.org, doc@freebsd.org
In-Reply-To: <199609182327.BAA26853@login.bigblue.no> from "Frode Nordahl" at Sep 19, 96 00:37:11 am
X-Mailer: ELM [version 2.4 PL24]
Content-Type: text
Sender: owner-doc@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

Maybe the DOC guys could throw the "long" part of this message into the
FreeBSD Handbook someplace... please?  :-)  I took a little extra time to
explain it in detail.

> >More RAM, definitely.  :-)
>
> Thought you would say that :-))

It is a given no matter how much RAM you have :-)  But here you REALLY
need it.

> >> 4x Quantum Empire (2GB)
> >> 1x Quantum Fireball (500MB)
> >>
> >> Disk configuration
> >> 1x Quantum Empire - root device
> >> 3x Quantum Empire striped with ccd (Ileave: 16 = 8kb) for /var/spool/news
> >
> >Too small an interleave.  You want each drive to be able to complete a
> >transaction on its own... and I am not talking about a single read,
> >I mean (minimally) the terminal directory traversal and file read for
> >the article in question.  You do not want two or three drives
> >participating in this operation.  Search the mailing list archives; I am
> >tired of explaining it.
>
> Ok...
>
> >I use an interleave size equal to the number of blocks in a
> >_CYLINDER_GROUP_.  That is a BIG number.
>
> 32MB in this case.  But that will not help performance, or would it?
> The optimum stripe size for RAID in hardware is 8kb...

That is because "optimum" is defined as "fastest throughput for a large
sequential transaction" on many RAIDs.  The heads on all N drives move
"in sync", so that Drive 0 is reading Block "B * N + 0", Drive 1 is
reading Block "B * N + 1", etc., etc.  This means that while one drive
may only be able to feed data at 2.5MB/sec, all N drives together will
feed data to the host at about 2.5MB/sec TIMES N.  Great!  For sequential
accesses.

If all the heads on your news server disks are moving "in sync", you lose
the advantage of multiple spindles and you might as well get a single 9GB
Barracuda drive.  And news articles are small anyway, so you are not
buying anything in terms of speed - the article will most likely come off
a single drive.

People who have not thought it through will argue that this means an
interleave of 8K is great for news.  But it is not.  What you want is for
each spindle to be able to handle a separate transaction on its own.  By
increasing the size of the interleave to a LARGE number, you increase the
likelihood that one drive will complete an entire operation on its own.

Consider the case of reading alt.sex article #12345.  On a machine that
holds alt for 7 days, the directory is

daily-bugle# ls -ld /news/alt/sex
drwxrwxr-x  133 news  news  50688 Sep 18 20:31 /news/alt/sex

50K.  Assume you stripe at 8K, and assume the directory is stored in
sequential blocks (a valid enough assumption - the disk blocks are
_nearby_ even if not strictly sequential).

First we read the directory.  Get the first 8K from Drive 0... scan...
nope.  Get the second 8K from Drive 1... scan... nope.  Get the third 8K
from Drive 2... scan... nope.  Get the fourth 8K from Drive 3... scan...
ah, we found it.  The article is stored "nearby", thanks to FFS
optimizations... hmm... OK, get the 4K of article data from Drive 2.

Does this seem stupid?  We have forced all 4 drives to position their
heads in the same area.

Now let's repeat with a LARGE stripe size.  First we read the directory.
We iterate through 26K of data on Drive 0 and we find the article.  The
article is stored "nearby" and happens to be within this cylinder group,
thanks to FFS optimizations... so we go read that 4K of data with Drive 0,
and voila, magically that ONE drive head is already in the general area
of the needed access.

What are your other three drives doing?  Why, they are concurrently
fetching three OTHER articles needed by other readers - three other
articles that, in the 8K scenario, would have had to be fetched
sequentially.

All of a sudden it looks like it is really nice to have many drives,
because if you configure your system correctly, you can get "N"
simultaneous accesses, where N is the number of drives you have.
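In case a worked example helps whoever folds this into the Handbook, here
is a little Python sketch of the block-to-drive arithmetic above.  It is
only a toy model, not ccd source: the 4 drives, the 512-byte blocks and
the "found it about 26K in" point come from the walkthrough, the starting
block numbers are made up, and the drive formula is just the usual
round-robin striping rule.

NDRIVES = 4
BLOCK   = 512      # bytes; "Ileave: 16" above works out to 16 * 512 = 8K

def drives_touched(ileave, start_block, nbytes):
    # Block b of the stripe lives on drive (b // ileave) % NDRIVES.
    # Return every drive a contiguous read of nbytes, starting at
    # start_block, has to hit.
    nblocks = -(-nbytes // BLOCK)                  # round up
    return sorted({(b // ileave) % NDRIVES
                   for b in range(start_block, start_block + nblocks)})

SCAN = 26 * 1024   # we found the entry about 26K into the 50K directory

# 8K interleave (16 blocks): the directory scan alone drags every spindle
# along for the ride.
print(drives_touched(16, 0, SCAN))        # -> [0, 1, 2, 3]

# Cylinder-group-sized interleave (32MB = 65536 blocks): one drive does
# the whole scan, and the "nearby" 4K article read (block 60 is invented)
# stays on that same drive.
print(drives_touched(65536, 0, SCAN))     # -> [0]
print(drives_touched(65536, 60, 4096))    # -> [0]

Same story as the prose: with the small interleave every spindle gets
dragged into one reader's request; with the big one the other three
spindles are free to serve other readers at the same time.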
> >> 1x Quantum Fireball - /usr/lib/news (News configuration to avoid
> >> excessive I/O on the root dev.)
> >
> >Good idea.
>
> Yeah.. Since this computer is going to do other stuff than news too,
> slowing down the root device with I/O is not a smart idea.. :)

Oooooooooooooooh..... bad idea.  Dedicate the machine to news.

> >> The disks are connected to two Adaptec 7850 controllers, the news
> >> related disks alone on the second Adaptec and the root dev on the
> >> first.
> >
> >So you have one "underutilized" SCSI bus... the first one.  Spread the
> >disks out between the busses.
>
> Ok...
>
> >> Does this look like a reasonable setup?  This news server does not
> >> handle any feeds (except for the incoming feed from our provider).
> >> Only client access.  (For now).
> >
> >How many simultaneous clients do you expect to be able to handle?
>
> The maximum peak we expect (for now) is 16*3 clients (16 clients using
> 3 connections each).

3 connections each?  You really should discourage that.  You eat a LOT of
resources that way.

> >How long do you keep news?
>
> We keep news for 5 days, except for alt.binaries.*, which we keep for
> 3 days maximum.

You do not have enough space for alt.binaries.  I just filled up a pair
of 4GB drives (an 8GB CCD) on a system that holds binaries for 2 days...
sigh.  You will also be tight holding the remainder of Usenet for 5 days
in only 6GB.

> Thanks for the info!
> ---------------------------------
> Frode Nordahl

No problem.

... JG
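P.S. For whoever does fold this into the Handbook: the interleave unit
conversion in a couple of lines of Python, assuming the ccd interleave is
counted in 512-byte blocks (which is what the quoted "Ileave: 16 = 8kb"
works out to).  The 32MB figure is this particular machine's cylinder
group size; use your own filesystem's number, not this one.

BLOCK = 512
print(16 * BLOCK)                   # 8192   - the original 8K stripe
print(32 * 1024 * 1024 // BLOCK)    # 65536  - one 32MB cylinder group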