From owner-freebsd-hackers  Sat Dec  2 11: 5:33 2000
Delivered-To: freebsd-hackers@freebsd.org
Received: from earth.backplane.com (placeholder-dcat-1076843399.broadbandoffice.net [64.47.83.135])
	by hub.freebsd.org (Postfix) with ESMTP id 64F8D37B400
	for <hackers@freebsd.org>; Sat,  2 Dec 2000 11:05:29 -0800 (PST)
Received: (from dillon@localhost)
	by earth.backplane.com (8.11.1/8.9.3) id eB2J4An63970;
	Sat, 2 Dec 2000 11:04:10 -0800 (PST)
	(envelope-from dillon)
Date: Sat, 2 Dec 2000 11:04:10 -0800 (PST)
From: Matt Dillon <dillon@earth.backplane.com>
Message-Id: <200012021904.eB2J4An63970@earth.backplane.com>
To: News History File User <newsuser@free-pr0n.netscum.dk>
Cc: hackers@freebsd.org, usenet@tdk.net
Subject: Re: vm_pageout_scan badness
References: <200012011918.eB1JIol53670@earth.backplane.com> <200012020525.eB25PPQ92768@newsmangler.inet.tele.dk>
Sender: owner-freebsd-hackers@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

:closely the pattern of what happens to the available memory following
:a fresh boot...  At the moment, this (reader) machine has been up for
:half a day, with performance barely able to keep up with a full feed
:(but starting to slip as the overnight burst of binaries is starting),
:but at last look, history lookups and writes are accounting for more
:than half (!) of the INN news process time, with available idle time
:being essentially zero.  So...

    No idle time?  That doesn't sound like blocked I/O to me; it sounds
    like the machine has run out of CPU.

:Following the boot, things start out with plenty of memory Free, and
:something like 4MB Active, which seems reasonable to me.  Then I start
:things.
:
:As is to be expected, INN increases in size as it does history lookups
:and updates, and the amount of memory shown as Active tracks this,
:more or less.  But what's happening to the Free value!  It's going
:down at as much as 4MB per `top' interval.  Or should I say, what is
:happening to the Inactive value -- it's constantly increasing, and I
:observe a rapid migration of all the Free memory to Inactive, until
:the value of Inactive peaks out at the time that Free drops to about
:996k, beyond which it changes little.  None of the swap space has
:been touched yet.
:
:As soon as the value for Free hits bottom and that of Inactive has
:reached a max, now the migration happens from Inactive to Active --
:until this point, the value of Active has been roughly what I would
:expect to see, given the size of the history hash/index files, and
:the BerkeleyDB file I'm now using MAP_NOSYNC as well for a definite
:improvement in overview access times.

    Hmm.  An increasing 'inactive' most often occurs when a program
    is reading a file sequentially.  It sounds like most of the inactive
    pages are probably due to reader requests from the spool.

:>     Is it possible that history file rewriting is creating an issue?  Doesn't
:>     INN rewrite the history file every once in a while to clear out old
:>     garbage?  I'm not up on the latest INN.
:
:In normal operation, no -- the text file is append-only (the text file
:isn't used for lookups with the MD5-based hashing), and expire, which
:I'm running manually, rewrites the hash files -- leading to a mysterious
:lack of space today when I attempted to run both expire and makedbz (a
:variant of makehistory), and apparently some reader processes or some
:daemons still had the old inodes open, until suddenly in one swell foop,
:some 750MB was freed up -- far more than I expected to see, so I should
:probably look into this space usage sometime...
:
:This shouldn't be a problem the way I'm running things now.  I haven't
:run an expire process since the last reboot to observe things closely.

    Whoa.  750MB?  There are only two things that can cause that:

    * A process with hundreds of megabytes of private store exited

    * A large (500+ MB) file is deleted after having previously been
      mmap()'d (or the process holding the last open descriptor to
      the file, after deletion, now exits).

    If I remember INN right, there is a situation that can occur here... the
    reader processes open up the history file in order to implement certain
    NNTP commands.  I'm trying to remember which one... I think it's one of
    the search commands.  Fubar... anyone remember which NNTP command opens
    up the history file?  In any case, I remember at BEST I had to completely
    disable that command when running INN because it caused long-running
    reader processes to keep a descriptor open on now-deleted history files.
    When you do an expire run which replaces the history file, the original
    (now deleted) history file may still be open by those reader processes.
    This could easily account for your problems.
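
    A minimal sketch of the deleted-but-still-open case described above,
    not INN's actual code.  The scratch path and the helper names
    (open_then_unlink, held_bytes, link_count) are hypothetical, made up
    for illustration.  It only shows that unlink() removes the name while
    the blocks stay allocated until the last descriptor is closed.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Create a scratch file, write `len` zero bytes to it, then unlink it
 * while keeping the descriptor open.  Returns the still-open descriptor,
 * or -1 on error.  The path template is a hypothetical stand-in, not a
 * real INN path.
 */
int open_then_unlink(size_t len)
{
    char path[] = "/tmp/history-demo.XXXXXX";
    char buf[4096] = {0};
    int fd = mkstemp(path);

    if (fd < 0)
        return -1;

    while (len > 0) {
        size_t chunk = len < sizeof(buf) ? len : sizeof(buf);
        ssize_t n = write(fd, buf, chunk);
        if (n <= 0) {
            close(fd);
            unlink(path);
            return -1;
        }
        len -= (size_t)n;
    }

    /* The name goes away here; the inode and its blocks do not. */
    if (unlink(path) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Size in bytes of the file behind an open descriptor, or -1 on error. */
long long held_bytes(int fd)
{
    struct stat st;

    if (fstat(fd, &st) != 0)
        return -1;
    return (long long)st.st_size;
}

/* Link count of the open file; 0 means no directory entry is left. */
long link_count(int fd)
{
    struct stat st;

    if (fstat(fd, &st) != 0)
        return -1;
    return (long)st.st_nlink;
}
```

    After open_then_unlink(), link_count() should report 0 while
    held_bytes() still reports the full size.  The filesystem can only
    reclaim the space once close() is called on that descriptor (or the
    process holding it exits).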

    This sort of situation occurs most often when there is no timeout
    or too long a timeout in the reader processes, and/or if tcp keepalives
    are not turned on, plus when certain NNTP commands (used mostly by
    abusers, by the way, who try to download feeds via their reader
    access) are enabled.  I would immediately research this... look for
    reader processes that have hung around too long and try killing them,
    then see if that clears out some memory.
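
    For the keepalive part, a rough sketch of turning keepalives on for a
    reader's socket with the standard SO_KEEPALIVE socket option.  The
    helper names are hypothetical, and where (or whether) INN sets this
    option itself is not something this sketch claims to know.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>

/*
 * Turn on TCP keepalive probes for a socket so that a dead peer is
 * eventually detected and the connection is torn down.
 * Returns 0 on success, -1 on error.
 */
int enable_keepalive(int sock)
{
    int on = 1;

    return setsockopt(sock, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on));
}

/* Report whether keepalive is enabled: 1 yes, 0 no, -1 on error. */
int keepalive_enabled(int sock)
{
    int val = 0;
    socklen_t len = sizeof(val);

    if (getsockopt(sock, SOL_SOCKET, SO_KEEPALIVE, &val, &len) != 0)
        return -1;
    return val != 0;
}
```

    Keepalives only catch peers that have gone away.  A client that is
    alive but idle still needs an application-level timeout in the
    reader process.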

    There will also be a serious file fragmentation issue using MAP_NOSYNC
    in the expire process.  You can probably use MAP_NOSYNC safely in the
    INND core, but don't use it to rebuild the history file in the expire
    process.
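
    A hedged sketch of the distinction: one mapping helper with a switch
    for MAP_NOSYNC, so the long-lived lookup mapping can use it and the
    rebuild path can leave it off.  The function names and the path used
    by the caller are hypothetical.  Pre-writing zeros before mapping is
    an assumption here, taken from the general advice for avoiding
    fragmentation with MAP_NOSYNC, not from INN.  MAP_NOSYNC is a FreeBSD
    extension, so the sketch defines it to 0 where it is missing.

```c
#include <sys/types.h>
#include <sys/mman.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* MAP_NOSYNC is FreeBSD-specific; compile to a no-op flag elsewhere. */
#ifndef MAP_NOSYNC
#define MAP_NOSYNC 0
#endif

/*
 * Create (or truncate) a file and fill it with `len` zero bytes using
 * write(), so the backing store is allocated before it is mapped.
 * Returns an open read/write descriptor, or -1 on error.
 */
int create_zero_filled(const char *path, size_t len)
{
    char zeros[4096] = {0};
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);

    if (fd < 0)
        return -1;

    while (len > 0) {
        size_t chunk = len < sizeof(zeros) ? len : sizeof(zeros);
        ssize_t n = write(fd, zeros, chunk);
        if (n <= 0) {
            close(fd);
            return -1;
        }
        len -= (size_t)n;
    }
    return fd;
}

/*
 * Map `len` bytes of an already-sized file read/write and shared.
 * nosync != 0: request MAP_NOSYNC (the long-lived lookup case).
 * nosync == 0: plain shared mapping (the rebuild/expire case).
 * Returns the mapping, or NULL on failure.
 */
void *map_history(int fd, size_t len, int nosync)
{
    int flags = MAP_SHARED | (nosync ? MAP_NOSYNC : 0);
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, flags, fd, 0);

    return p == MAP_FAILED ? NULL : p;
}
```

    The server side would call map_history(fd, len, 1) and an expire-style
    rebuild would call map_history(fd, len, 0), or skip mmap() and write
    the new file sequentially.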

						-Matt

