From owner-freebsd-current  Thu Apr 27  9:34:53 2000
Delivered-To: freebsd-current@freebsd.org
Received: from apollo.backplane.com (apollo.backplane.com [216.240.41.2])
	by hub.freebsd.org (Postfix) with ESMTP id E159437B8F5
	for <freebsd-current@FreeBSD.ORG>; Thu, 27 Apr 2000 09:34:49 -0700 (PDT)
	(envelope-from dillon@apollo.backplane.com)
Received: (from dillon@localhost)
	by apollo.backplane.com (8.9.3/8.9.1) id JAA05279;
	Thu, 27 Apr 2000 09:34:41 -0700 (PDT)
	(envelope-from dillon)
Date: Thu, 27 Apr 2000 09:34:41 -0700 (PDT)
From: Matthew Dillon <dillon@apollo.backplane.com>
Message-Id: <200004271634.JAA05279@apollo.backplane.com>
To: Brad Knowles <blk@skynet.be>
Cc: "John W. DeBoskey" <jwd@unx.sas.com>, freebsd-current@FreeBSD.ORG
Subject: Re: Support for large mfs
References: <200004270554.BAA34693@bb01f39.unx.sas.com>
 <200004270605.XAA00807@apollo.backplane.com> <v04220800b52da49fcdee@[195.238.23.59]>
Sender: owner-freebsd-current@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

:	I use a mfs for storing the Diablo history file on our news 
:peering server.  Yes, I know the front part of the file is mmap()'ed 
:and effectively kept completely in memory anyway, but I've seen 
:periods of time when we received over 160,000 articles in a single 
:hour (an average of about 45 per second), and if you compare this to 
:our normal ratio of offered versus accepted articles (something in 
:the range of 32,238,303 vs. 612,429; for a 52.64:1 ratio), this would 
:imply we probably did something like 2,368.8 history lookups per 
:second during that period of time -- and this is just for inbound 
:articles.
:
:	In my experience, it is a non-trivial exercise to build a drive 
:array system that can keep up with the number of disk accesses 
:necessary to handle this many history lookups per second.  I think 
:I've recently done it (and reported my testing results on 
:news.software.nntp, along with summarizing the previous test results 
:from Joe Greco and Terry Kennedy), but it's still non-trivial and the 
:solutions I've seen so far are all still rather expensive.  Thus the 
:reason why I currently continue to use an mfs for the history 
:database.
:
:	However, now I'm wondering if it might not be better to use a 
:file-backed or maybe a swap-backed VN device instead of an mfs.  Do 
:you have any thoughts?
:
:--
:   These are my opinions -- not to be taken as official Skynet policy
:======================================================================
:Brad Knowles, <blk@skynet.be>                || Belgacom Skynet SA/NV
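
    The arithmetic in the quoted estimate can be reproduced with a short
    sketch.  The variable names are illustrative only, and it assumes (as
    the quoted text appears to) that the 160,000 articles per hour were
    accepted articles and that each offered article costs one history
    lookup.

```python
# Figures taken from the quoted message; names are hypothetical.
peak_accepted_per_hour = 160_000
offered_total = 32_238_303
accepted_total = 612_429

# Accepted articles per second at peak (about 44.4; the text rounds to 45).
accepted_per_second = peak_accepted_per_hour / 3600

# Offered articles per accepted article (about 52.64).
offered_per_accepted = offered_total / accepted_total

# Assumption: one history lookup per offered article.
lookups_exact = accepted_per_second * offered_per_accepted

# Same calculation using the rounded figures from the text.
lookups_rounded = 45 * round(offered_per_accepted, 2)

print(f"accepted/sec:      {accepted_per_second:.1f}")
print(f"offered:accepted:  {offered_per_accepted:.2f}:1")
print(f"lookups/sec:       {lookups_exact:.1f} (unrounded inputs)")
print(f"lookups/sec:       {lookups_rounded:.1f} (rounded inputs)")
```

    The rounded inputs give the 2,368.8 figure quoted above; without
    rounding the rate to 45 per second the estimate comes out slightly
    lower.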

    I can't imagine why MFS would perform better.  It shouldn't: every
    block is stored in system memory *TWICE* (once in the VM cache, and
    once in the mfs process's address space).  If you have enough system
    memory to create a large MFS filesystem and it performs well, then
    the system should perform even better if you remove the MFS filesystem
    and just use a normal filesystem.
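
    As a rough illustration of the double-caching cost, the sketch below
    compares worst-case memory use for the same data set on MFS and on a
    normal filesystem.  The sizes are hypothetical, and the model assumes
    every block is resident in both places at once, which is a
    simplification rather than a description of actual kernel behavior.

```python
GIB = 1024 ** 3

# Copies of each cached block held in RAM, per the description above.
MFS_COPIES = 2        # VM cache + mfs process address space
NORMAL_FS_COPIES = 1  # VM cache only


def resident_bytes(data_bytes: int, copies: int) -> int:
    """Worst-case RAM needed to keep the whole data set in memory."""
    return data_bytes * copies


def fits_in_ram(data_bytes: int, copies: int, ram_bytes: int) -> bool:
    """True if the fully cached data set fits in the given amount of RAM."""
    return resident_bytes(data_bytes, copies) <= ram_bytes


# Hypothetical example: a 1 GiB history file on a machine with 1.5 GiB
# of RAM available for caching.
history = 1 * GIB
ram = int(1.5 * GIB)

print("MFS fits:      ", fits_in_ram(history, MFS_COPIES, ram))
print("normal FS fits:", fits_in_ram(history, NORMAL_FS_COPIES, ram))
```

    Under these assumptions the normal filesystem can keep the whole file
    cached while the MFS cannot, because the MFS needs twice the memory
    for the same data.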

    A swap-backed VN device operates just like a normal disk device except
    that you get automatic striping if your swap happens to be striped.  
    It takes advantage of the fact that the system's VM page cache does
    a good job of caching the disk blocks so it doesn't have to.

    This is how a normal filesystem works as well.  If you have enough memory,
    the system should be able to cache a normal filesystem's data blocks as
    easily as it caches the VN device's, and more easily (because they aren't
    double-cached) than it caches the MFS device's.

    I would consider trying a normal filesystem with an async mount, or a
    normal filesystem with softupdates enabled.  It may also help to turn
    off write-behind (sysctl -w vfs.write_behind=0), though if you are
    running the latest 4.x stable the write heuristic is now in and should
    do a good job on its own.

						-Matt

