Date:      Fri, 24 Feb 2012 16:39:11 -0500
From:      Ryan Stone <rysto32@gmail.com>
To:        freebsd-hackers@freebsd.org
Subject:   vm_pageout_page_stats() calling pmap_remove_all() on pages that it deactivates
Message-ID:  <CAFMmRNwe5MK5q46xsdRg=SE_7tuEneHv2Ad8zauSEDn3n9-eaQ@mail.gmail.com>

Near the end of vm_pageout_page_stats() there is the following code:

if (m->act_count == 0) {
	/*
	 * We turn off page access, so that we have
	 * more accurate RSS stats.  We don't do this
	 * in the normal page deactivation when the
	 * system is loaded VM wise, because the
	 * cost of the large number of page protect
	 * operations would be higher than the value
	 * of doing the operation.
	 */
	pmap_remove_all(m);
	vm_page_deactivate(m);
}

I question how useful it is to remove m from every pmap.  The stated
reasoning in the comment above is that it makes for more accurate RSS
statistics.  However, vm_pageout_page_stats() only does anything at
all when there is a shortage of inactive+cache+free pages, so I find
the assertion that this leads to a "more accurate" RSS accounting a
bit specious when the rest of the VM subsystem isn't trying to provide
accurate RSS stats at all.  Besides, the page is still resident in
memory if we've only deactivated it.  This code seems to be conflating
"resident set" with "working set".
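
To make the distinction concrete, here is a toy model in userland C.
It is only a sketch and not kernel code: struct toy_page and the toy_*
functions are hypothetical names invented for illustration, and it
assumes that the reported RSS counts only pages that are still mapped
in the process's pmap.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model, NOT kernel code: every name here is hypothetical. */
struct toy_page {
	bool resident;	/* page still occupies physical memory */
	bool mapped;	/* page has a mapping in the process's pmap */
	bool active;	/* page is on the active queue */
};

/* RSS as a pmap-based counter would report it: mapped pages only. */
size_t
toy_rss(const struct toy_page *pages, size_t n)
{
	size_t count = 0;

	for (size_t i = 0; i < n; i++)
		if (pages[i].mapped)
			count++;
	return (count);
}

/* Pages that actually hold physical memory, mapped or not. */
size_t
toy_resident(const struct toy_page *pages, size_t n)
{
	size_t count = 0;

	for (size_t i = 0; i < n; i++)
		if (pages[i].resident)
			count++;
	return (count);
}

/* Models the quoted code path: unmap, then move to the inactive queue. */
void
toy_deactivate(struct toy_page *p)
{
	p->mapped = false;	/* stands in for pmap_remove_all(m) */
	p->active = false;	/* stands in for vm_page_deactivate(m) */
	/* p->resident is left alone: no memory has been freed. */
}
```

Under those assumptions, deactivating a page lowers toy_rss() but
leaves toy_resident() unchanged, which is the mismatch described above.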

The page still being resident in memory is why I think removing the
page from the pmap is the wrong thing to do here.  The situation that
led me to look at this code was pretty simple: I had a daemon
running on a swapless system leaking memory.  I was running a script
that logged the output of top periodically.  What I saw was the VSS
and RSS of the daemon growing steadily over time until all of a
sudden, its RSS dropped dramatically and the system's inactive page
count dropped (this was vm_pageout_page_stats() kicking in due to the
memory shortage).  I was misled into thinking that the daemon had
just freed a lot of memory and that malloc had called madvise(...,
MADV_FREE) to free the pages back to the kernel.  Of course, with no
swap the newly deactivated pages could never be freed, and I spent a
day going in the entirely wrong direction before I figured out what
vm_pageout_page_stats() was doing and stopped looking for bugs in
madvise.

In investigating this I did stumble upon one situation where removing
the pages on deactivation led to very bad behaviour.  I had a system
with a lot of wired memory and about 200MB free.  I ran a test program
to allocate nearly all of the free memory and then sit there sleeping,
never touching it again.  vm_pageout_page_stats() duly kicked in and
deactivated the test program's memory, bringing its resident set down
to a couple of KB.  However, that didn't actually succeed in freeing
any memory, and so the next time I tried to ssh to it or something the
OOM killer ended up having to be invoked.  It went on a mass killing
spree, but never even considered killing my test program because its
RSS was so small (despite the fact that it was holding on to most of
the memory in the system).  It's a bit of a corner case: I think that
you have to have a ton of wired memory, no swap and just the right
amount of free memory (the fact that most of the unwired-but-allocated
memory on that system was allocated to daemons that restarted
themselves automatically probably didn't help the situation at all, as
they kept running the system back to the edge).
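
A test program of the kind described above could look roughly like the
following.  This is a minimal sketch and not the original program: the
function names hog_memory() and hog_and_sleep() are made up for
illustration, and the 4096-byte fallback page size is an assumption.

```c
#include <stddef.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Allocate `bytes` of memory and write to every page so that each one
 * is actually backed by physical memory.  Returns NULL on failure.
 */
char *
hog_memory(size_t bytes)
{
	char *buf = malloc(bytes);
	long pagesize = sysconf(_SC_PAGESIZE);

	if (buf == NULL)
		return (NULL);
	if (pagesize <= 0)
		pagesize = 4096;	/* assumed fallback */
	for (size_t off = 0; off < bytes; off += (size_t)pagesize)
		buf[off] = 1;	/* one write per page faults it in */
	return (buf);
}

/*
 * Grab the memory, then sleep forever without touching it again.
 * Returns only if the allocation fails.
 */
int
hog_and_sleep(size_t bytes)
{
	if (hog_memory(bytes) == NULL)
		return (-1);
	for (;;)
		pause();	/* wake only for signals, then sleep again */
}
```

A driver would call hog_and_sleep() with a size just under the amount
of free memory (about 200MB in the case above) and leave it running.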

Anyway, given that I can't see any value to removing a page from a
pmap just because we are deactivating it, and it seems to cause
confusion and even less-than-ideal (and arguably incorrect) behaviour
in certain corner cases, should the pmap_remove_all() call just be
removed?  Or is there some
subtlety to this that I'm missing?


