From owner-freebsd-current Thu Apr 27 9:34:53 2000
Delivered-To: freebsd-current@freebsd.org
Received: from apollo.backplane.com (apollo.backplane.com [216.240.41.2]) by hub.freebsd.org (Postfix) with ESMTP id E159437B8F5 for ; Thu, 27 Apr 2000 09:34:49 -0700 (PDT) (envelope-from dillon@apollo.backplane.com)
Received: (from dillon@localhost) by apollo.backplane.com (8.9.3/8.9.1) id JAA05279; Thu, 27 Apr 2000 09:34:41 -0700 (PDT) (envelope-from dillon)
Date: Thu, 27 Apr 2000 09:34:41 -0700 (PDT)
From: Matthew Dillon
Message-Id: <200004271634.JAA05279@apollo.backplane.com>
To: Brad Knowles
Cc: "John W. DeBoskey" , freebsd-current@FreeBSD.ORG
Subject: Re: Support for large mfs
References: <200004270554.BAA34693@bb01f39.unx.sas.com> <200004270605.XAA00807@apollo.backplane.com>
Sender: owner-freebsd-current@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

: I use a mfs for storing the Diablo history file on our news
:peering server. Yes, I know the front part of the file is mmap()'ed
:and effectively kept completely in memory anyway, but I've seen
:periods of time when we received over 160,000 articles in a single
:hour (an average of about 45 per second), and if you compare this to
:our normal ratio of offered versus accepted articles (something in
:the range of 32,238,303 vs. 612,429; for a 52.64:1 ratio), this would
:imply we probably did something like 2,368.8 history lookups per
:second during that period of time -- and this is just for inbound
:articles.
:
: In my experience, it is a non-trivial exercise to build a drive
:array system that can keep up with the number of disk accesses
:necessary to handle this many history lookups per second. I think
:I've recently done it (and reported my testing results on
:news.software.nntp, along with summarizing the previous test results
:from Joe Greco and Terry Kennedy), but it's still non-trivial and the
:solutions I've seen so far are all still rather expensive.
:Thus the reason why I currently continue to use an mfs for the history
:database.
:
: However, now I'm wondering if it might not be better to use a
:file-backed or maybe a swap-backed VN device instead of an mfs. Do
:you have any thoughts?
:
:--
: These are my opinions -- not to be taken as official Skynet policy
:======================================================================
:Brad Knowles, || Belgacom Skynet SA/NV

I can't imagine why MFS would perform better. It shouldn't: every
block is stored in system memory *TWICE* (once in the VM cache, and
once in the mfs process's address space). If you have enough system
memory to create a large MFS filesystem and it performs well, then
the system should perform even better if you remove the MFS
filesystem and just use a normal filesystem.

A swap-backed VN device operates just like a normal disk device,
except that you get automatic striping if your swap happens to be
striped. It takes advantage of the fact that the system's VM page
cache does a good job of caching the disk blocks, so it doesn't have
to. This is how a normal filesystem works as well. If you have
enough memory, the system should be able to cache a normal
filesystem's data blocks as easily as it caches the VN device's, and
more easily (because they aren't double-cached) than the MFS
device's.

I would consider trying a normal filesystem with an async mount, or
a normal filesystem with softupdates enabled. It may also help to
turn off write-behind (sysctl -w vfs.write_behind=0), though if you
are running the latest 4.x stable the write heuristic is now in and
should do a good job on its own.

-Matt
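
The lookup-rate estimate in the quoted text can be reproduced with a few lines of arithmetic. The sketch below is only a check on the quoted figures, under the assumption (implied by the quoted text but not stated outright) that every offered article costs one history lookup. The function and constant names are made up for illustration.

```python
# Figures as quoted in the message; nothing here is measured independently.
ARTICLES_PER_HOUR = 160_000   # peak accepted articles in one hour
OFFERED = 32_238_303          # articles offered by peers
ACCEPTED = 612_429            # articles actually accepted


def offered_to_accepted_ratio(offered: int, accepted: int) -> float:
    """Return how many articles are offered for each one accepted."""
    return offered / accepted


def history_lookups_per_second(accepted_per_second: float, ratio: float) -> float:
    """Estimate history lookups per second.

    Assumption: one history lookup per offered article, so the lookup
    rate is the accepted rate scaled by the offered:accepted ratio.
    """
    return accepted_per_second * ratio


ratio = offered_to_accepted_ratio(OFFERED, ACCEPTED)

# The message rounds 160,000 / 3600 up to "about 45 per second".
rounded_rate = 45
exact_rate = ARTICLES_PER_HOUR / 3600

if __name__ == "__main__":
    print(f"offered:accepted ratio  = {ratio:.2f}:1")
    print(f"lookups/s at 45 art/s   = {history_lookups_per_second(rounded_rate, ratio):.1f}")
    print(f"lookups/s at exact rate = {history_lookups_per_second(exact_rate, ratio):.1f}")
```

Using the rounded 45 articles per second reproduces the quoted figure. The unrounded hourly rate gives a slightly lower estimate, which does not change the argument.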