From: Chuck Swiger <cswiger@mac.com>
Date: Mon, 21 Jun 2004 22:21:30 -0400
To: Matthew Dillon
Cc: questions@freebsd.org, current@freebsd.org
List-Id: Discussions about the use of FreeBSD-current <freebsd-current@freebsd.org>
Subject: Re: read vs. mmap (or io vs. page faults)

Matthew Dillon wrote:
> Mikhail Teterin wrote:
>>= Both read and mmap have a read-ahead heuristic. The heuristic
>>= works.
>>= In fact, the mmap heuristic is so smart it can read-behind
>>= as well as read-ahead if it detects a backwards scan.
>>
>> Evidently, read's heuristics are better. At least, for this task. I'm,
>> actually, surprised they are _different_ at all.

It might be interesting to retry your tests under a Mach kernel: BSD has
multiple codepaths for IPC functionality that are unified under Mach.

>> The mmap interface is supposed to be more efficient -- theoretically --
>> because it requires one less buffer copy, and because it (together
>> with the possible madvise()) provides the kernel with more information,
>> thus enabling it to make better (at least, no worse) decisions.

I've heard people repeat the same notion -- that mmap()ing a file is
supposed to be faster than read()ing it [1] -- but the two operations are
not quite the same thing, and there is more work involved in mmap()ing a
file (and thus gaining random access to any byte of it by dereferencing
memory) than in reading and processing small blocks of data at a time.

Matt's right that processing a small block which fits into the L1/L2
cache (and probably is already resident) is very fast. The extra copy
doesn't matter as much as it once did on slower machines, and he's
provided some good analysis of L1/L2 caching issues and buffer copying
speeds. However, I tend to think the issue of buffer copying speed is
likely to be moot when you are reading from disk and are thus I/O
bound [2], rather than the manner in which the file's contents are
presented to the program being that significant.
---------
[1]: Actually, while it is intuitive to think that mmap() and madvise()
tell the system, "hey, I want all of that file read into RAM now, as
quickly as you can," what happens on systems which use demand-paging VM
(like FreeBSD, Linux, and most others) is far lazier: in reality, your
process gets nothing but a promise from mmap() that if you access the
right chunk of memory, your program will unblock once that data has been
read and faulted into the local address space. That level of urgency
doesn't seem to correspond to what you asked for :-), although it still
works pretty well in practice.

[2]: We're talking about maybe 20 to 60 or so MB/s for disk, versus 10x
to 100x that for RAM-to-RAM copying, much less the L2 copying speeds
Matt mentions below:

> Well, I think you forgot my earlier explanation regarding buffer copying.
> Buffer copying is a very cheap operation if it occurs within the L1 or
> L2 cache, and that is precisely what is happening when you read() into
> a fixed buffer in a loop in a C program... your buffer is fixed in
> memory and is almost guaranteed to be in the L1/L2 cache, which means
> that the extra copy operation is very fast on a modern processor. It's
> something like 12-16 GBytes/sec to the L1 cache on an Athlon 64, for
> example, and 3 GBytes/sec uncached to main memory.

This has been an interesting discussion, BTW, thanks.

-- 
-Chuck