Date: Sun, 20 Jun 2004 10:41:20 -0500
From: Dan Nelson <dnelson@allantgroup.com>
To: Mikhail Teterin <mi+kde@aldan.algebra.com>
Cc: current@freebsd.org
Subject: Re: read vs. mmap (or io vs. page faults)
Message-ID: <20040620154120.GC5040@dan.emsphone.com>
In-Reply-To: <200406200343.03920@aldan>
References: <200406200343.03920@aldan>

In the last episode (Jun 20), Mikhail Teterin said:
> I expected the second way to be faster, as it is supposed to avoid
> one memory copying (no user-space buffer). But in reality, on a
> CPU-bound (rather than IO-bound) machine, using mmap() is
> considerably slower. Here are the tcsh's time results:
>
> Single Pentium2-400MHz running 4.8-stable:
> ------------------------------------------
> stdio: 56.837u 34.115s 2:06.61 71.8% 66+193k 11253+0io 3pf+0w
> mmap:  72.463u  7.534s 2:34.62 51.7%  5+186k   105+0io 22328pf+0w
>
> Dual Pentium2 Xeon 450MHz running recent -current:
> --------------------------------------------------
> stdio: 36.557u 29.395s 3:09.88 34.7% 10+165k 32646+0io 0pf+0w
> mmap:  42.052u  7.545s 2:02.25 40.5% 10+169k    16+0io 15232pf+0w
>
> On the IO-bound machine, using mmap is only marginally faster:
>
> Single Pentium4M (Centrino 1GHz) running recent -current:
> ---------------------------------------------------------
> stdio: 27.195u 8.280s 1:33.02 38.1% 10+169k 11221+0io 1pf+0w
> mmap:  26.619u 3.004s 1:23.59 35.4% 10+169k    47+0io 19463pf+0w
>
> Notice the last two columns in time's output -- why is page-faulting a
> page in -- on-demand -- so much slower than read()-ing it? I even tried
> inserting ``madvise(buffer, file_size, MADV_SEQUENTIAL)'' between the
> mmap() and the process() -- made no difference at all (or made the
> mmap() take slightly longer)...

MADV_SEQUENTIAL just lets the system expire already-read blocks from
its cache faster, so it won't help much here. read() should cause some
prefetching to occur, but it obviously doesn't work all the time, or
else inblock wouldn't have been as high as 11000. For sequential
access I would have expected read() to have been able to prefetch
almost every block before the userland process needed it.

-- 
	Dan Nelson
	dnelson@allantgroup.com
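
For anyone who wants to reproduce the comparison, here is a minimal sketch of the two access patterns under discussion. It is not the original test program (which was not posted in this message): the names `io_stats`, `sum_with_read` and `sum_with_mmap` are made up for illustration, a plain byte sum stands in for process(), and the 64 KB read buffer is an arbitrary choice. Each function records the change in `ru_inblock` and `ru_majflt` from getrusage(), which should correspond to the "io" and "pf" columns that tcsh's time prints.

```c
#define _DEFAULT_SOURCE 1   /* glibc: expose madvise() under strict -std modes */
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical result record: a checksum plus the two counters of interest. */
struct io_stats {
    uint64_t sum;       /* byte sum of the file; stands in for process() */
    long     inblock;   /* block input operations during the pass */
    long     majflt;    /* major page faults during the pass */
};

/* Store the counter deltas accumulated since `before` was taken. */
static void finish(struct io_stats *st, const struct rusage *before)
{
    struct rusage after;

    getrusage(RUSAGE_SELF, &after);
    st->inblock = after.ru_inblock - before->ru_inblock;
    st->majflt  = after.ru_majflt  - before->ru_majflt;
}

/* Sequential pass using read() into a user-space buffer. Returns 0 or -1. */
int sum_with_read(const char *path, struct io_stats *st)
{
    static unsigned char buf[64 * 1024];    /* the copy mmap() avoids */
    struct rusage before;
    ssize_t n;
    int fd;

    assert(path != NULL && st != NULL);
    if ((fd = open(path, O_RDONLY)) < 0)
        return -1;
    getrusage(RUSAGE_SELF, &before);
    st->sum = 0;
    while ((n = read(fd, buf, sizeof buf)) > 0)
        for (ssize_t i = 0; i < n; i++)
            st->sum += buf[i];
    close(fd);
    if (n < 0)
        return -1;
    finish(st, &before);
    return 0;
}

/* Sequential pass over an mmap()ed file with MADV_SEQUENTIAL. Returns 0 or -1. */
int sum_with_mmap(const char *path, struct io_stats *st)
{
    struct rusage before;
    struct stat sb;
    int fd;

    assert(path != NULL && st != NULL);
    if ((fd = open(path, O_RDONLY)) < 0)
        return -1;
    if (fstat(fd, &sb) < 0) {
        close(fd);
        return -1;
    }
    getrusage(RUSAGE_SELF, &before);
    st->sum = 0;
    if (sb.st_size > 0) {                   /* mmap() rejects a zero length */
        size_t len = (size_t)sb.st_size;
        unsigned char *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);

        if (p == MAP_FAILED) {
            close(fd);
            return -1;
        }
        madvise(p, len, MADV_SEQUENTIAL);   /* advisory only; result ignored */
        for (size_t i = 0; i < len; i++)    /* each new page faults on first touch */
            st->sum += p[i];
        munmap(p, len);
    }
    close(fd);
    finish(st, &before);
    return 0;
}
```

Call each function on the same large file and compare the `inblock` and `majflt` fields alongside wall-clock time. If the file is already in the buffer cache, both counters will likely stay near zero, so use a file larger than RAM or a freshly mounted filesystem to see the effect described above.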