From owner-freebsd-current@FreeBSD.ORG Sun Jun 20 15:41:52 2004
Delivered-To: freebsd-current@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id AB56016A4CE;
	Sun, 20 Jun 2004 15:41:52 +0000 (GMT)
Received: from dan.emsphone.com (dan.emsphone.com [199.67.51.101])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 57D9743D46;
	Sun, 20 Jun 2004 15:41:52 +0000 (GMT)
	(envelope-from dan@dan.emsphone.com)
Received: (from dan@localhost)
	by dan.emsphone.com (8.12.10/8.12.10) id i5KFfKa1055360;
	Sun, 20 Jun 2004 10:41:20 -0500 (CDT)
	(envelope-from dan)
Date: Sun, 20 Jun 2004 10:41:20 -0500
From: Dan Nelson
To: Mikhail Teterin
Message-ID: <20040620154120.GC5040@dan.emsphone.com>
References: <200406200343.03920@aldan>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <200406200343.03920@aldan>
X-OS: FreeBSD 5.2-CURRENT
X-message-flag: Outlook Error
User-Agent: Mutt/1.5.6i
cc: questions@freebsd.org
cc: current@freebsd.org
Subject: Re: read vs. mmap (or io vs. page faults)
X-BeenThere: freebsd-current@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussions about the use of FreeBSD-current
X-List-Received-Date: Sun, 20 Jun 2004 15:41:52 -0000

In the last episode (Jun 20), Mikhail Teterin said:
> I expected the second way to be faster, as it is supposed to avoid
> one memory copying (no user-space buffer). But in reality, on a
> CPU-bound (rather than IO-bound) machine, using mmap() is
> considerably slower. Here are the tcsh's time results:
>
> Single Pentium2-400MHz running 4.8-stable:
> ------------------------------------------
> stdio: 56.837u 34.115s 2:06.61 71.8% 66+193k 11253+0io 3pf+0w
> mmap:  72.463u  7.534s 2:34.62 51.7%  5+186k   105+0io 22328pf+0w
>
> Dual Pentium2 Xeon 450MHz running recent -current:
> --------------------------------------------------
> stdio: 36.557u 29.395s 3:09.88 34.7% 10+165k 32646+0io 0pf+0w
> mmap:  42.052u  7.545s 2:02.25 40.5% 10+169k    16+0io 15232pf+0w
>
> On the IO-bound machine, using mmap is only marginally faster:
>
> Single Pentium4M (Centrino 1GHz) running recent -current:
> ---------------------------------------------------------
> stdio: 27.195u 8.280s 1:33.02 38.1% 10+169k 11221+0io 1pf+0w
> mmap:  26.619u 3.004s 1:23.59 35.4% 10+169k    47+0io 19463pf+0w
>
> Notice the last two columns in time's output -- why is page-faulting a
> page in -- on-demand -- so much slower than read()-ing it? I even tried
> inserting ``madvise(buffer, file_size, MADV_SEQUENTIAL)'' between the
> mmap() and the process() -- made no difference at all (or made the
> mmap() take slightly longer)...

MADV_SEQUENTIAL just lets the system expire already-read blocks from
its cache faster, so it won't help much here. read() should cause some
prefetching to occur, but it obviously doesn't work all the time, or
else inblock (the first number in the "io" column) wouldn't have been
as high as 11000. For sequential access I would have expected read()
to be able to prefetch almost every block before the userland process
needed it.

-- 
	Dan Nelson
	dnelson@allantgroup.com
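
For anyone who wants to reproduce the comparison above, here is a minimal sketch (not the original poster's program, which is not shown in this thread). It scans a file once with read() and once through mmap(), and reports the getrusage() deltas that, as far as I know, correspond to the "io" and "pf" columns of tcsh's time output (ru_inblock and ru_majflt). The function names checksum_read, checksum_mmap and measure are made up for illustration, and the byte sum is only a stand-in for the real process() step. Note that slicing a Python mmap object copies the data, so this shows the counters, not the saved memory copy.

```python
import mmap
import os
import resource
import sys
import tempfile

CHUNK = 64 * 1024  # arbitrary chunk size, chosen for illustration only


def checksum_read(path, bufsize=CHUNK):
    """Sum every byte of the file using unbuffered read() calls."""
    total = 0
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(bufsize)
            if not chunk:
                break
            total += sum(chunk)
    return total


def checksum_mmap(path, advise_sequential=False):
    """Sum every byte of the file through a read-only mapping."""
    size = os.path.getsize(path)
    if size == 0:
        return 0  # an empty file cannot be mapped
    total = 0
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            # madvise() is only available on some platforms and on
            # Python 3.8 or later, so guard both lookups.
            if (advise_sequential and hasattr(m, "madvise")
                    and hasattr(mmap, "MADV_SEQUENTIAL")):
                m.madvise(mmap.MADV_SEQUENTIAL)
            for off in range(0, size, CHUNK):
                total += sum(m[off:off + CHUNK])  # the slice copies
    return total


def measure(func, *args, **kwargs):
    """Run func and return (result, resource-usage deltas)."""
    before = resource.getrusage(resource.RUSAGE_SELF)
    result = func(*args, **kwargs)
    after = resource.getrusage(resource.RUSAGE_SELF)
    stats = {
        "user_s": after.ru_utime - before.ru_utime,
        "sys_s": after.ru_stime - before.ru_stime,
        "inblock": after.ru_inblock - before.ru_inblock,  # "io" column
        "majflt": after.ru_majflt - before.ru_majflt,     # "pf" column
        "minflt": after.ru_minflt - before.ru_minflt,
    }
    return result, stats


if __name__ == "__main__":
    if len(sys.argv) > 1:
        path, cleanup = sys.argv[1], False
    else:
        # Fallback demo file; it will already be cached, so expect
        # inblock and majflt to stay at or near zero.
        fd, path = tempfile.mkstemp()
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(os.urandom(1 << 20))
        cleanup = True
    for name, func in (("read", checksum_read), ("mmap", checksum_mmap)):
        _, stats = measure(func, path)
        print(name, stats)
    if cleanup:
        os.remove(path)
```

To see nonzero inblock and majflt values, pass a file that is larger than RAM or not yet in the buffer cache as the first argument; a file that was just written or just read will be served from cache in both cases.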