From owner-freebsd-current  Tue Apr 18 14:44:49 1995
Return-Path: current-owner
Received: (from majordom@localhost) by freefall.cdrom.com (8.6.10/8.6.6) id OAA19989 for current-outgoing; Tue, 18 Apr 1995 14:44:49 -0700
Received: from brasil.moneng.mei.com (brasil.moneng.mei.com [151.186.20.4]) by freefall.cdrom.com (8.6.10/8.6.6) with SMTP id OAA19983 for ; Tue, 18 Apr 1995 14:44:46 -0700
Received: by brasil.moneng.mei.com (4.1/SMI-4.1) id AA07382; Tue, 18 Apr 95 16:43:38 CDT
From: Joe Greco
Message-Id: <9504182143.AA07382@brasil.moneng.mei.com>
Subject: Re: mmap bugs gone yet?
To: pete@silver.sms.fi (Petri Helenius)
Date: Tue, 18 Apr 1995 16:43:37 -0500 (CDT)
Cc: freebsd-current@FreeBSD.org
In-Reply-To: <199504181927.WAA00592@silver.sms.fi> from "Petri Helenius" at Apr 18, 95 10:27:59 pm
X-Mailer: ELM [version 2.4 PL24]
Content-Type: text
Content-Length: 4202
Sender: current-owner@FreeBSD.org
Precedence: bulk

Hi Pete,

> I can agree that INN is _huge_ improvement over C-news but it still does not
> solve the problem that most likely an article comes in, gets stored to the
> disk and then when we're running a hub it's read from the disk about 20
> times if the fanout is 50 feeds. (the rest comes from the buffer) If INN would
> be able to integrate nntplink functionality and be multithreading, this
> could be (in normal circumstances) be reduced to around 5 times.

I would take this to mean that you're in a very memory-starved environment?

Are you running realtime nntplinks, or batchfile nntplinks?  I only deal
with realtime links, so the following discussion mostly applies there:

I do not think that this would be helped by the integration of nntplink
into INN, multithreaded or otherwise.  Basically, if a feed is falling
behind, it will tend to continue to fall even further behind, to a point
where it is no longer reasonable to maintain a cache of articles to be
sent in memory.  At this point, INN must log to disk the list of articles
to be sent.

So there are a few scenarios that I can envision, depending on the
implementation:

INN could remember the entire article in memory, and provide it to the
nntplink threads.  I do not think that this would be efficient.  Consider
somebody posting several hundred largish (700K) articles in a row.
Consider what the memory requirements would be for any sort of "queued
in-RAM" implementation.  Consider the consequences of a feed that was
even just a few articles behind.  Now multiply it times ten feeds.  :-(  !!!!

INN could remember the list of articles in memory, and provide that to
the nntplink threads.  The nntplink processes would mmap the articles in
(or read them in) and send them.  This basically relies on their
existence within the cache, in order to be efficient.  At some point, for
a feed that is falling behind, INN would have to flush the list of
articles to a batchfile (but by that point, it has probably been long
enough that it is no longer likely that the article is in cache anyways).

INN could do what it does now, which is essentially identical to the
scenario immediately above (from a VM/efficiency viewpoint), although
there's a lot more process context switching.

The first scenario COULD be done if you assumed everyone has at least
128MB of RAM and not too many feeds.  The second and third scenarios work
poorly with smaller amounts of memory and work better with much larger
amounts of memory.

In general, if a feed is not keeping up, you are virtually guaranteed to
be forced to reread the article from disk at some point.
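Just to make the second scenario concrete, the per-article work in each
feed sender boils down to something like the sketch below.  This is
illustrative C only, not actual INN or nntplink source; the name
send_article() is made up, and a real NNTP sender would also have to do
CRLF conversion and dot-stuffing and loop on short writes, which this
skips.  The point is just that the mmap()/write() pair costs no disk I/O
as long as the article's pages are still cached.

/*
 * Illustrative sketch only -- not INN or nntplink source, and
 * send_article() is a made-up name.  A feed process mmap()s the
 * spooled article and pushes it down an already-open NNTP socket;
 * if the article's pages are still in the cache, no disk read
 * happens.  A real sender would also do CRLF conversion and
 * dot-stuffing, and would loop on short writes.
 */
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/mman.h>
#include <fcntl.h>
#include <unistd.h>

int
send_article(const char *path, int sock)
{
	struct stat st;
	void *base;
	ssize_t n;
	int fd;

	if ((fd = open(path, O_RDONLY)) < 0)
		return (-1);
	if (fstat(fd, &st) < 0 || st.st_size == 0) {
		close(fd);
		return (-1);
	}
	base = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_SHARED, fd, 0);
	if (base == MAP_FAILED) {
		close(fd);
		return (-1);
	}
	n = write(sock, base, (size_t)st.st_size);	/* cache hit: no disk I/O */
	munmap(base, (size_t)st.st_size);
	close(fd);
	return (n == st.st_size ? 0 : -1);
}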
The only optimization I can think of would be to devise some method that
would try to "synchronize" these rereads in some way.  I don't see an
easy way to do _that_.

More memory allows you more caching.  Less memory screws you.  The ideal
news server would have a gigabyte of RAM and be able to cache about a
day's worth of news in RAM.  :-)

I do not see any real way to shift the paradigm to alter this scenario -
certainly not if we accept the fact that more than 5 feeds will be
severely lagging.

> Inn also goes to non-responsive mode when it processes a newgroup or a rmgroup
> and does not accept new connections during the processing of those. This is a
> quite hard issue to resolve but annoys quite a lot on a busy server.
>
> > (now we just have to devise the tools... sigh!)
>
> > ;-)
>
> > I guess that depends on the system, and the number of feeds, etc...  It is
> > my (current) general policy to try to scale news servers such that there are
> > plenty of free cycles - and since many feeds are UUCP, they're already
> > being compressed by the CPU.  :-)
>
> That's not true around here. Very few of the feeds are UUCP and most likely
> the UUCP feeds are quite to very small. The article load is so huge that you
> don't want to spend the money on transmitting them over dialups. (because
> around here you pay for local calls)

Ah, well.

... Joe

-------------------------------------------------------------------------------
Joe Greco - Systems Administrator                            jgreco@ns.sol.net
Solaria Public Access UNIX - Milwaukee, WI                      414/342-4847