Date: Sat, 25 Mar 2006 10:29:17 -0800 (PST)
From: Matthew Dillon <dillon@apollo.backplane.com>
To: Peter Jeremy <peterjeremy@optushome.com.au>
Cc: alc@freebsd.org, Mikhail Teterin <mi+mx@aldan.algebra.com>, stable@freebsd.org
Subject: Re: Reading via mmap stinks (Re: weird bugs with mmap-ing via NFS)
Message-ID: <200603251829.k2PITH5D014732@apollo.backplane.com>
References: <200603211607.30372.mi+mx@aldan.algebra.com> <200603231403.36136.mi+mx@aldan.algebra.com> <200603232048.k2NKm4QL067644@apollo.backplane.com> <200603231626.19102.mi+mx@aldan.algebra.com> <200603232316.k2NNGBka068754@apollo.backplane.com> <20060324084940.GA703@turion.vk2pj.dyndns.org> <200603241800.k2OI0KF8005579@apollo.backplane.com> <20060325094207.GD703@turion.vk2pj.dyndns.org>
:The results here are weird. With 1GB RAM and a 2GB dataset, the
:timings seem to depend on the sequence of operations: reading is
:significantly faster, but only when the data was mmap'd previously.
:There's one outlier that I can't easily explain.
:...
:Peter Jeremy
Really odd. Note that if your disk can only do 25 MBytes/sec, the
expected time for a full pass is 2052167894 bytes / 25 MBytes/sec =
~80 seconds, not the ~60 seconds your numbers show.
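To spell out the arithmetic (taking 1 MByte = 1048576 bytes):

    2052167894 bytes / (25 * 1048576 bytes/sec) =~ 78 seconds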
So that would imply that the 80 second numbers represent read-ahead,
and the 60 second numbers indicate that some of the data was retained
from a prior run (and not blown out by the sequential reading in the
later run).
This type of situation *IS* possible as a side effect of other
heuristics. It is particularly likely when you combine read() with
mmap(), because read() uses a different heuristic than mmap() to
implement the read-ahead. There is also code in there which depresses
the page priority of 'old' already-read pages in the sequential case.
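To give a feel for the kind of logic involved (this is only a toy
sketch I'm making up for illustration, not the actual FreeBSD or
DragonFly code; every name and number in it is invented), a sequential
heuristic of this general shape tracks the last faulted page, widens
the read-ahead window while accesses stay sequential, and depresses
the priority of a page trailing behind the scan point:

    #include <stdio.h>

    #define RA_MAX  16                      /* max read-ahead, in pages */

    /* Invented per-object sequential-access state. */
    struct seq_state {
            long last_pindex;               /* last page index faulted on */
            int  ra_window;                 /* read-ahead window, in pages */
    };

    /*
     * Toy heuristic: on a fault at last_pindex + 1, double the read-ahead
     * window (up to RA_MAX) and report a trailing page whose priority
     * should be depressed; on a non-sequential fault, reset the window.
     */
    static int
    seq_fault(struct seq_state *ss, long pindex, long *depress)
    {
            *depress = -1;
            if (pindex == ss->last_pindex + 1) {
                    if (ss->ra_window < RA_MAX)
                            ss->ra_window *= 2;
                    if (pindex >= ss->ra_window)
                            *depress = pindex - ss->ra_window;
            } else {
                    ss->ra_window = 1;      /* random access: no read-ahead */
            }
            ss->last_pindex = pindex;
            return (ss->ra_window);
    }

    int
    main(void)
    {
            struct seq_state ss = { -1, 1 };
            long depress;
            long p;

            for (p = 0; p < 8; p++) {
                    int ra = seq_fault(&ss, p, &depress);
                    printf("fault page %ld: read-ahead %d pages, depress %ld\n",
                        p, ra, depress);
            }
            return (0);
    }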
So, for example, if you do a linear grep of 2GB you might end up with
a cache state that looks like this:
l = low priority page
m = medium priority page
H = high priority page
FILE: [---------------------------mmmmmmmmmmmmm]
Then when you rescan using mmap():
FILE: [lllllllll------------------mmmmmmmmmmmmm]
[------lllllllll------------mmmmmmmmmmmmm]
[---------lllllllll---------mmmmmmmmmmmmm]
[------------lllllllll------mmmmmmmmmmmmm]
[---------------lllllllll---mmmmmmmmmmmmm]
[------------------lllllllllmmmmmmmmmmmmm]
[---------------------llllllHHHmmmmmmmmmm]
[------------------------lllHHHHHHmmmmmmm]
[---------------------------HHHHHHHHHmmmm]
[---------------------------mmmHHHHHHHHHm]
The low priority pages don't bump out the medium priority pages
left over from the previous scan, so the grep winds up doing
read-ahead from disk until it hits the large swath of pages already
cached by the previous scan, and leaves those pages intact.
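Here is a toy model of that replacement behavior (again invented for
illustration, not the real VM page queues): evict the lowest-priority
resident page first, and never let an incoming low priority page
displace a higher-priority one. Medium priority pages from an earlier
scan survive a low priority blow-through:

    #include <stdio.h>

    #define CACHE_PAGES     8

    /* Invented cached-page record: a page index plus a priority. */
    struct cpage {
            long pindex;
            int  pri;                       /* 0 = low, 1 = medium, 2 = high */
    };

    static struct cpage cache[CACHE_PAGES];
    static int ncached;

    /*
     * Insert a page; when the cache is full, evict the lowest-priority
     * entry, and only if its priority does not exceed the newcomer's.
     * A stream of low priority pages therefore recycles its own slots
     * and never displaces the medium priority pages of an earlier scan.
     */
    static void
    cache_insert(long pindex, int pri)
    {
            int i, victim;

            if (ncached < CACHE_PAGES) {
                    cache[ncached].pindex = pindex;
                    cache[ncached].pri = pri;
                    ncached++;
                    return;
            }
            victim = 0;
            for (i = 1; i < CACHE_PAGES; i++)
                    if (cache[i].pri < cache[victim].pri)
                            victim = i;
            if (cache[victim].pri <= pri) {
                    cache[victim].pindex = pindex;
                    cache[victim].pri = pri;
            }
    }

    int
    main(void)
    {
            long p;
            int i;

            for (p = 100; p < 104; p++)     /* medium pages from a prior scan */
                    cache_insert(p, 1);
            for (p = 0; p < 20; p++)        /* new low priority sequential scan */
                    cache_insert(p, 0);
            for (i = 0; i < ncached; i++)
                    printf("resident: page %ld pri %d\n",
                        cache[i].pindex, cache[i].pri);
            return (0);
    }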
There is also a heuristic in the system (both FreeBSD and DragonFly)
which tries to randomly retain pages. It clearly isn't working :-)
I need to change it to randomly retain swaths of pages instead, the
idea being that it should take repeated runs to rebalance the VM cache
rather than allowing a single run to blow it out, or allowing a
static set of pages to be retained indefinitely, which is what your
tests seem to show is occurring.
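Something along these lines (a rough sketch of the idea only;
SWATH_PAGES and RETAIN_PCT are numbers I'm pulling out of thin air):
make the keep/recycle decision once per swath rather than once per
page, so the survivors are contiguous runs and it takes several passes
to turn the whole cache over:

    #include <stdio.h>
    #include <stdlib.h>

    #define SWATH_PAGES     32              /* pages per retained swath */
    #define RETAIN_PCT      10              /* chance a swath survives, in % */

    /*
     * Toy swath retention: decide retention once per SWATH_PAGES-page
     * swath rather than once per page, so the survivors of a sequential
     * blow-through are contiguous runs of pages and several runs are
     * needed to turn the whole cache over.
     */
    static int
    swath_retained(long pindex)
    {
            static long cur_swath = -1;
            static int keep;

            if (pindex / SWATH_PAGES != cur_swath) {
                    cur_swath = pindex / SWATH_PAGES;
                    keep = (rand() % 100) < RETAIN_PCT;
            }
            return (keep);
    }

    int
    main(void)
    {
            long p, kept = 0;

            srand(1);
            for (p = 0; p < 4096; p++)
                    if (swath_retained(p))
                            kept++;
            printf("retained %ld of 4096 pages, in %d-page swaths\n",
                kept, SWATH_PAGES);
            return (0);
    }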
-Matt
Matthew Dillon
<dillon@backplane.com>
