Date: Sun, 20 Jun 2004 03:43:03 -0400
From: Mikhail Teterin <mi+kde@aldan.algebra.com>
To: questions@FreeBSD.org
Cc: current@FreeBSD.org
Subject: read vs. mmap (or io vs. page faults)
Message-ID: <200406200343.03920@aldan>
Hello!

I'm writing a message-digest utility, which operates on a file and can
use either stdio:

    while (not eof) {
        char buffer[BUFSIZE];
        size = read(.... buffer ...);
        process(buffer, size);
    }

or mmap:

    buffer = mmap(... file_size, PROT_READ ...);
    process(buffer, file_size);

I expected the second way to be faster, as it is supposed to avoid one
memory copy (no user-space buffer). But in reality, on a CPU-bound
(rather than IO-bound) machine, using mmap() is considerably slower.
Here are tcsh's time results:

Single Pentium2-400MHz running 4.8-stable:
------------------------------------------
stdio: 56.837u 34.115s 2:06.61 71.8% 66+193k 11253+0io 3pf+0w
mmap:  72.463u  7.534s 2:34.62 51.7%  5+186k   105+0io 22328pf+0w

Dual Pentium2 Xeon 450MHz running recent -current:
--------------------------------------------------
stdio: 36.557u 29.395s 3:09.88 34.7% 10+165k 32646+0io 0pf+0w
mmap:  42.052u  7.545s 2:02.25 40.5% 10+169k    16+0io 15232pf+0w

On the IO-bound machine, using mmap is only marginally faster:

Single Pentium4M (Centrino 1GHz) running recent -current:
---------------------------------------------------------
stdio: 27.195u 8.280s 1:33.02 38.1% 10+169k 11221+0io 1pf+0w
mmap:  26.619u 3.004s 1:23.59 35.4% 10+169k    47+0io 19463pf+0w

Notice the last two columns in time's output (block I/O operations and
page faults). Why is faulting a page in on demand so much slower than
read()-ing it?

I even tried inserting ``madvise(buffer, file_size, MADV_SEQUENTIAL)''
between the mmap() and the process() calls. It made no difference at all
(or made the mmap() version take slightly longer)...

Is this how things are supposed to be, or will mmap() become more
efficient eventually?

Thanks!

	-mi