Date: Sun, 20 Jun 2004 17:12:19 -0700 (PDT)
From: Matthew Dillon
Message-Id: <200406210012.i5L0CJd6031533@apollo.backplane.com>
To: Mikhail Teterin
cc: questions@freebsd.org
cc: current@freebsd.org
Subject: Re: read vs. mmap (or io vs. page faults)

Hmm. Well, you can try calling madvise(... MADV_WILLNEED); that's what it is for.

It is usually a bad idea to try to populate the page table with all resident pages associated with a memory mapping, because mmap() is often used to map huge files... hundreds of megabytes or even dozens of gigabytes (on 64-bit architectures). The last thing you want to do is populate the page table for the entire file. It might work for your particular program, but it is a bad idea for the OS to assume that for every mmap().

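For illustration only, here is a rough sketch of what the madvise(... MADV_WILLNEED) suggestion might look like in C. The helper name map_willneed() and its signature are made up for this example and are not from the thread; the hint is advisory, so the sketch ignores its return value.

```c
#define _DEFAULT_SOURCE   /* glibc: expose madvise(); harmless elsewhere */
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the thread): map a whole file read-only
 * and hint to the kernel that the mapping will be needed soon.
 * Returns the mapping and stores its length in *len_out, or returns
 * NULL on failure (including an empty file, which cannot be mapped).
 */
void *map_willneed(const char *path, size_t *len_out)
{
    assert(path != NULL && len_out != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size <= 0) {
        close(fd);
        return NULL;
    }

    size_t len = (size_t)st.st_size;
    void *base = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);   /* the mapping stays valid after the descriptor is closed */
    if (base == MAP_FAILED)
        return NULL;

    /* Advisory only: the kernel may ignore the hint, so failure is not fatal. */
    (void)madvise(base, len, MADV_WILLNEED);

    *len_out = len;
    return base;
}
```

The caller is expected to munmap() the result when done. Whether the hint helps at all depends on the kernel and the workload, so it is worth measuring.
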
What it comes down to, really, is whether you feel you actually need the additional performance, because it kinda sounds to me that whatever processing you are doing to the data is either going to be I/O bound, or it isn't going to run long enough for the additional overhead to matter versus the processing overhead of the program itself.

If you are really worried, you could pre-fault the mmap before you do any processing at all and measure the time it takes to pre-fault the pages vs. the time it takes to process the memory image. (You pre-fault simply by accessing one byte of data in each page across the mmap(), before you begin any processing.)

					-Matt
					Matthew Dillon

:= It's hard to say. mmap() could certainly be made more efficient, e.g.
:= by faulting in more pages at a time to reduce the actual fault rate.
:= But it's fairly difficult to beat a read copy into a small buffer.
:
:Well, that's the thing -- by mmap-ing the whole file at once (and by
:madvise-ing with MADV_SEQUENTIAL), I thought I told the kernel
:everything it needed to know to make the best decision. Why can't
:page-faulting code do a better job using all this knowledge, than the
:poor read, which only knows about the partial read in question?
:
:I find it so disappointing, that it can, probably, be considered a bug.
:I'll try this code on Linux and Solaris. If mmap is better there (as it
:really ought to be), we have a problem, IMHO. Thanks!
:
:	-mi
:
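
The pre-fault measurement suggested in the reply above (touch one byte in each page, then time that against the real processing) could be sketched roughly as follows. The names prefault() and now_seconds() are invented for this example, and the sketch assumes the base address is page-aligned, as a pointer returned by mmap() is.

```c
#define _POSIX_C_SOURCE 200809L   /* for clock_gettime() under strict -std modes */
#include <assert.h>
#include <stddef.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the thread): read one byte from every
 * page in [base, base + len) so each page is faulted in up front.
 * Assumes base is page-aligned. Returns the number of pages touched.
 */
size_t prefault(const void *base, size_t len)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);

    /* volatile keeps the compiler from discarding the otherwise unused reads */
    const volatile unsigned char *p = base;
    size_t touched = 0;

    for (size_t off = 0; off < len; off += (size_t)page) {
        (void)p[off];
        touched++;
    }
    return touched;
}

/* Monotonic clock reading in seconds, for timing the two phases. */
double now_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}
```

A caller would read now_seconds() before and after prefault(), then again around the actual processing loop, and compare the two intervals to see whether the fault overhead is significant for that workload.
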