From owner-freebsd-current@FreeBSD.ORG  Sun Jun 20 07:43:28 2004
Return-Path: <owner-freebsd-current@FreeBSD.ORG>
Delivered-To: freebsd-current@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP
	id E5A4216A4CE; Sun, 20 Jun 2004 07:43:28 +0000 (GMT)
Received: from aldan.algebra.com (aldan.algebra.com [216.254.65.224])
	by mx1.FreeBSD.org (Postfix) with ESMTP
	id 8BBB943D45; Sun, 20 Jun 2004 07:43:28 +0000 (GMT)
	(envelope-from mi@aldan.algebra.com)
Received: from aldan.algebra.com (mi@localhost [127.0.0.1])
	by aldan.algebra.com (8.12.11/8.12.11) with ESMTP id i5K7h5JW026162
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
	Sun, 20 Jun 2004 03:43:05 -0400 (EDT)
	(envelope-from mi@aldan.algebra.com)
Received: by aldan.algebra.com (8.12.11/8.12.11/Submit) id i5K7h51l026161;
	Sun, 20 Jun 2004 03:43:05 -0400 (EDT)
	(envelope-from mi)
From: Mikhail Teterin <mi+kde@aldan.algebra.com>
To: questions@FreeBSD.org
Date: Sun, 20 Jun 2004 03:43:03 -0400
User-Agent: KMail/1.6.2
X-Face: %UW#n0|w>ydeGt/b@1-.UFP=K^~-:0f#O:D7w<gv/&E-lL7twZCT8B~/PA4|\t$ti+22K">hJ5G_<5143Bb3kOIs9XpX+"V+~$adGP:J|SLieM31VIhqXeLBli"<kcG^EOVihy+z3/UR{6SCQ
MIME-Version: 1.0
Content-Disposition: inline
Content-Type: text/plain;
  charset="us-ascii"
Content-Transfer-Encoding: 7bit
Message-Id: <200406200343.03920@aldan>
X-Mailman-Approved-At: Sun, 20 Jun 2004 11:47:13 +0000
cc: current@FreeBSD.org
Subject: read vs. mmap (or io vs. page faults)
X-BeenThere: freebsd-current@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussions about the use of FreeBSD-current
	<freebsd-current.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-current>,
	<mailto:freebsd-current-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-current>
List-Post: <mailto:freebsd-current@freebsd.org>
List-Help: <mailto:freebsd-current-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-current>,
	<mailto:freebsd-current-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 20 Jun 2004 07:43:29 -0000

Hello!

I'm writing a message-digest utility, which operates on a file and
can use either stdio:

	while (not eof) {
		char buffer[BUFSIZE];
		size = read(.... buffer ...);
		process(buffer, size);
	}

or mmap:

	buffer = mmap(... file_size, PROT_READ ...);
	process(buffer, file_size);
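
A minimal, compilable sketch of the two variants above. The value of
BUFSIZE, the names digest_read() and digest_mmap(), and the toy
process() checksum are assumptions made for illustration; they are not
the actual digest code.

```c
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

#define BUFSIZE 65536	/* assumed; the original gives no value */

/* Hypothetical stand-in for process(): folds bytes into a running sum. */
static uint32_t
process(uint32_t sum, const unsigned char *p, size_t n)
{
	for (size_t i = 0; i < n; i++)
		sum = sum * 31 + p[i];
	return (sum);
}

/* read() variant: the kernel copies each chunk into a user buffer. */
int
digest_read(const char *path, uint32_t *out)
{
	static unsigned char buffer[BUFSIZE];
	ssize_t n;
	uint32_t sum = 0;
	int fd = open(path, O_RDONLY);

	if (fd == -1)
		return (-1);
	while ((n = read(fd, buffer, sizeof(buffer))) > 0)
		sum = process(sum, buffer, (size_t)n);
	close(fd);
	if (n == -1)
		return (-1);
	*out = sum;
	return (0);
}

/* mmap() variant: pages are faulted in as process() touches them. */
int
digest_mmap(const char *path, uint32_t *out)
{
	struct stat st;
	void *p;
	int fd = open(path, O_RDONLY);

	if (fd == -1)
		return (-1);
	if (fstat(fd, &st) == -1) {
		close(fd);
		return (-1);
	}
	if (st.st_size == 0) {		/* mmap() rejects a zero length */
		close(fd);
		*out = 0;
		return (0);
	}
	p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_SHARED, fd, 0);
	close(fd);			/* the mapping outlives the descriptor */
	if (p == MAP_FAILED)
		return (-1);
	*out = process(0, p, (size_t)st.st_size);
	munmap(p, (size_t)st.st_size);
	return (0);
}
```

Both functions should produce the same sum for the same file, so any
timing difference between them comes from how the data reaches
process(), not from the computation itself.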

I expected the second way to be faster, as it is supposed to avoid
one memory copy (no user-space buffer). But in reality, on a
CPU-bound (rather than IO-bound) machine, using mmap() is considerably
slower. Here are tcsh's time results:

	Single Pentium2-400MHz running 4.8-stable:
	------------------------------------------
stdio:	56.837u 34.115s 2:06.61 71.8%   66+193k 11253+0io 3pf+0w
mmap:	72.463u 7.534s 2:34.62 51.7%    5+186k 105+0io 22328pf+0w

	Dual Pentium2 Xeon 450MHz running recent -current:
	--------------------------------------------------
stdio:	36.557u 29.395s 3:09.88 34.7%   10+165k 32646+0io 0pf+0w
mmap:	42.052u 7.545s 2:02.25 40.5%    10+169k 16+0io 15232pf+0w

On the IO-bound machine, using mmap is only marginally faster:

	Single Pentium4M (Centrino 1GHz) running recent -current:
	---------------------------------------------------------
stdio:	27.195u 8.280s 1:33.02 38.1%    10+169k 11221+0io 1pf+0w
mmap:	26.619u 3.004s 1:23.59 35.4%    10+169k 47+0io 19463pf+0w
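
The io and pf columns above can also be collected from inside the
program with getrusage(2). As far as I can tell from tcsh(1), the
default time format reports block input and output operations as
"io" and major page faults as "pf"; that mapping is an assumption
worth checking on the system in question. The struct usage_delta and
usage_diff() names below are hypothetical.

```c
#include <sys/types.h>
#include <sys/time.h>
#include <sys/resource.h>

/* Counters of interest, expressed as "after minus before". */
struct usage_delta {
	long inblock;	/* block input operations (left side of "io") */
	long majflt;	/* page faults that required I/O ("pf") */
	long minflt;	/* page faults serviced without I/O */
};

/* Subtracts two getrusage() snapshots field by field. */
struct usage_delta
usage_diff(const struct rusage *before, const struct rusage *after)
{
	struct usage_delta d;

	d.inblock = after->ru_inblock - before->ru_inblock;
	d.majflt = after->ru_majflt - before->ru_majflt;
	d.minflt = after->ru_minflt - before->ru_minflt;
	return (d);
}
```

Typical use would be to call getrusage(RUSAGE_SELF, &before) ahead of
the processing loop, getrusage(RUSAGE_SELF, &after) once it finishes,
and then print the fields of usage_diff(&before, &after).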

Notice the last two columns in time's output -- why is page-faulting a
page in -- on-demand -- so much slower than read()-ing it? I even tried
inserting ``madvise(buffer, file_size, MADV_SEQUENTIAL)'' between the
mmap() and the process() -- it made no difference at all (or made the
mmap() take slightly longer)...

Is this how things are supposed to be, or will mmap() become more
efficient eventually? Thanks!

	-mi