From owner-freebsd-questions@FreeBSD.ORG  Mon Jun 21 00:13:02 2004
Date: Sun, 20 Jun 2004 17:12:19 -0700 (PDT)
From: Matthew Dillon <dillon@apollo.backplane.com>
Message-Id: <200406210012.i5L0CJd6031533@apollo.backplane.com>
To: Mikhail Teterin <mi+kde@aldan.algebra.com>
References: <200406200343.03920@aldan>
	<200406201835.i5KIZBeJ026532@apollo.backplane.com> <200406201852.08030@aldan>
cc: questions@freebsd.org
cc: current@freebsd.org
Subject: Re: read vs. mmap (or io vs. page faults)

    Hmm.  Well, you can try calling madvise(... MADV_WILLNEED), that's what
    it is for.  

    It is usually a bad idea to try to populate the page table with all
    resident pages associated with a memory mapping, because mmap()
    is often used to map huge files... hundreds of megabytes or even
    dozens of gigabytes (on 64-bit architectures).  The last thing you want
    to do is to populate the page table for the entire file.  It might
    work for your particular program, but it is a bad idea for the OS to
    assume that for every mmap().

    What it comes down to, really, is whether you feel you actually need the
    additional performance, because it kinda sounds to me that whatever
    processing you are doing to the data is either going to be I/O bound,
    or it isn't going to run long enough for the additional page-fault
    overhead to matter versus the processing overhead of the program itself.

    If you are really worried you could pre-fault the mmap before you do
    any processing at all and measure the time it takes to pre-fault the
    pages vs the time it takes to process the memory image.  (You pre-fault
    simply by accessing one byte of data in each page across the mmap(),
    before you begin any processing).

					-Matt
					Matthew Dillon 
					<dillon@backplane.com>

:=     It's hard to say.  mmap() could certainly be made more efficient, e.g.
:=     by faulting in more pages at a time to reduce the actual fault rate.
:=     But it's fairly difficult to beat a read copy into a small buffer.
:
:Well, that's the thing -- by mmap-ing the whole file at once (and by
:madvise-ing with MADV_SEQUENTIAL), I thought I told the kernel
:everything it needed to know to make the best decision. Why can't
:page-faulting code do a better job using all this knowledge than the
:poor read, which only knows about the partial read in question?
:
:I find it so disappointing that it can probably be considered a bug.
:I'll try this code on Linux and Solaris. If mmap is better there (as it
:really ought to be), we have a problem, IMHO. Thanks!
:
:	-mi
: