Date: Sun, 20 Apr 2003 12:24:59 -0700 (PDT)
From: Matthew Dillon <dillon@apollo.backplane.com>
To: David Schultz <das@freebsd.org>
Cc: freebsd-current@freebsd.org
Subject: Re: Broken memory management on system with no swap
Message-ID: <200304201924.h3KJOxWo090302@apollo.backplane.com>
References: <000501c30682$4e5e64b0$6601a8c0@VAIO650> <20030420002940.GB46590@HAL9000.homeunix.com> <20030420191744.G19683@gamplex.bde.org> <20030420101401.GA2821@HAL9000.homeunix.com> <20030420191029.GA4803@HAL9000.homeunix.com>
:Thanks for your analysis. I thought there might be a GBDE-related
:factor when Lucky mentioned that copying a file triggers the bug,
:since cp(1) turns off mmap() mode for files > 8 MB to avoid this
:sort of thing. But nevertheless, I can see how the situation you
:describe can occur, where the system realizes too late that all of
:the reclaimable pages are tied up in the active queue.
Yes, I can see that happening too. The inactive queue is scanned before
the active queue (again, the ordering is important for normal operation
and we wouldn't want to change it). But this also creates a situation
where moving pages from the active queue to the inactive queue and then
laundering or reclaiming them from the inactive queue requires two
passes before the system recognizes the newly available memory.
If some operation, say a copy, causes nearly all available pages to
be moved to the active queue, whether clean or dirty, it would require
two passes before any of those pages could be reused. In this case
we know that the use of write() will not create an excessive number of
dirty pages in the active queue due to (A) the limited size of the
buffer cache and (B) the write-behind clustering that occurs when
writing a file sequentially.
So it *must* simply be the fact that all the pages are made active
very quickly and the pageout code simply requires two passes to
get through the queues before it can reuse any of those pages.
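
The two-pass effect described above can be illustrated with a toy model. This is only a sketch under stated assumptions: the names (toy_queue, toy_pageout_pass, toy_passes_until_reclaim) are hypothetical and are not kernel symbols, and the real scan considers far more state (dirty pages, references, act_count). It keeps only the ordering: inactive queue first, then active queue.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: illustrative names, not FreeBSD kernel symbols. */
enum toy_queue { TOY_ACTIVE, TOY_INACTIVE, TOY_FREE };

/*
 * One pass of a simplified pageout scan. The inactive queue is scanned
 * first, so only pages that were already inactive when the pass began
 * can be freed. The active queue is scanned afterwards and its pages
 * are deactivated, which makes them candidates for the NEXT pass.
 * Returns the number of pages freed during this pass.
 */
static size_t
toy_pageout_pass(enum toy_queue *pages, size_t npages)
{
	size_t i, freed = 0;

	/* Step 1: reclaim from the inactive queue. */
	for (i = 0; i < npages; i++) {
		if (pages[i] == TOY_INACTIVE) {
			pages[i] = TOY_FREE;
			freed++;
		}
	}
	/* Step 2: move active pages to the inactive queue. */
	for (i = 0; i < npages; i++) {
		if (pages[i] == TOY_ACTIVE)
			pages[i] = TOY_INACTIVE;
	}
	return (freed);
}

/* Count the passes needed before any page is freed, or -1 if none is. */
static int
toy_passes_until_reclaim(enum toy_queue *pages, size_t npages, int max_passes)
{
	int pass;

	assert(max_passes > 0);
	for (pass = 0; pass < max_passes; pass++) {
		if (toy_pageout_pass(pages, npages) > 0)
			return (pass + 1);
	}
	return (-1);
}
```

In this model, if every page starts on the active queue, the first pass frees nothing and only the second pass reclaims memory, which is the situation a check made on the first pass would misread as "nothing reclaimable".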
The 'pass != 0' test should be able to handle both cases assuming
that the page's act_count does not get in the way. Whether or not
act_count gets in the way of us being able to reclaim a page in two
passes can be tested by setting vm.pageout_algorithm to 1 (which will
cause act_count to be ignored). If the problem still occurs with
the pass != 0 test and vm.pageout_algorithm set to 0 (the default),
but does not occur with vm.pageout_algorithm set to 1, then we know
the problem is due to pages not being moved out of the active queue
quickly enough (1) for this situation.
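
To make the act_count experiment above concrete, here is a rough model of how an unreferenced page might leave the active queue with and without act_count being honored. It is a sketch, not the kernel's code: toy_page, toy_scan_active and the decay step TOY_ACT_DECLINE are assumptions made for illustration, and the ignore_act_count flag merely stands in for vm.pageout_algorithm=1.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed decay step for this toy model; the kernel's constant may differ. */
#define TOY_ACT_DECLINE 1

struct toy_page {
	int  act_count;	/* activity counter, decays while unreferenced */
	bool active;	/* true while the page is on the active queue */
};

/*
 * One active-queue visit to an unreferenced page. The counter decays
 * first; the page is then deactivated if the counter reached zero, or
 * unconditionally when ignore_act_count is set.
 */
static void
toy_scan_active(struct toy_page *p, bool ignore_act_count)
{
	if (!p->active)
		return;
	/* Decay, never going below zero. */
	if (p->act_count >= TOY_ACT_DECLINE)
		p->act_count -= TOY_ACT_DECLINE;
	else
		p->act_count = 0;
	if (ignore_act_count || p->act_count == 0)
		p->active = false;
}

/* Number of scans before an unreferenced page leaves the active queue. */
static int
toy_scans_to_deactivate(int start_act_count, bool ignore_act_count)
{
	struct toy_page p = { start_act_count, true };
	int scans = 0;

	assert(start_act_count >= 0);
	while (p.active) {
		toy_scan_active(&p, ignore_act_count);
		scans++;
	}
	return (scans);
}
```

With the counter honored, a page that starts with a high act_count survives several scans; with it ignored, the page is deactivated on the first scan. That difference is what the comparison between the two vm.pageout_algorithm settings is meant to expose.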
Note (1): Normally act_count protects against thrashing. It is the
active queue's act_count algorithm that gives FreeBSD such a nice
smooth degradation curve when memory loads become extreme, by preventing
a frequently accessed page from being freed too early, so we don't
want to just turn it off. Maybe we need a test for 'too many active
pages', e.g. when > 80% of available pages are in the active queue,
to temporarily disable the act_count test.
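
One possible shape for such a 'too many active pages' test is sketched below. Everything here is hypothetical: the function names, the way the page counts are passed in, and the fixed 80% limit (taken from the figure floated above) are assumptions, and no such check is claimed to exist in the tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical threshold, taken from the "> 80%" figure in the note. */
#define TOY_ACTIVE_PCT_LIMIT 80

/* True when more than 80% of the available pages are on the active queue. */
static bool
toy_too_many_active(size_t active_pages, size_t available_pages)
{
	assert(active_pages <= available_pages);
	if (available_pages == 0)
		return (false);
	/* Integer form of active/available > 80%, avoiding floating point. */
	return (active_pages * 100 > available_pages * TOY_ACTIVE_PCT_LIMIT);
}

/*
 * Whether the scan should still honor act_count. It is honored in the
 * normal case and temporarily ignored while the active queue is overfull.
 */
static bool
toy_honor_act_count(size_t active_pages, size_t available_pages)
{
	return (!toy_too_many_active(active_pages, available_pages));
}
```

The idea is that act_count keeps protecting frequently used pages under ordinary load and is bypassed only in the degenerate case where nearly everything has been made active at once.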
-Matt
:> I suggest changing this:
:>
:>	if ((vm_swap_size < 64 && vm_page_count_min()) ||
:>	    (swap_pager_full && vm_paging_target() > 0)) {
:>
:> To this:
:>
:>	if (pass != 0 &&
:>	    ((vm_swap_size < 64 && vm_page_count_min()) ||
:>	    (swap_pager_full && vm_paging_target() > 0))) {
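
The effect of the quoted change can be checked in isolation with stand-in values. In this sketch, struct toy_vm_state and both functions are hypothetical; the fields only stand in for vm_swap_size, vm_page_count_min(), swap_pager_full and vm_paging_target(), whose real definitions are not reproduced here.

```c
#include <stdbool.h>

/* Stand-ins for the kernel state read by the quoted condition. */
struct toy_vm_state {
	int  swap_size;		/* stands in for vm_swap_size */
	bool page_count_min;	/* stands in for vm_page_count_min() */
	bool swap_pager_full;	/* stands in for swap_pager_full */
	int  paging_target;	/* stands in for vm_paging_target() */
};

/* The condition as originally written. */
static bool
toy_condition_old(const struct toy_vm_state *s)
{
	return ((s->swap_size < 64 && s->page_count_min) ||
	    (s->swap_pager_full && s->paging_target > 0));
}

/* The suggested condition: same test, but never true on the first pass. */
static bool
toy_condition_new(const struct toy_vm_state *s, int pass)
{
	return (pass != 0 && toy_condition_old(s));
}
```

For a swapless system that is momentarily short of free pages, the old condition is already true on pass 0, while the new one holds off until a later pass has had a chance to reclaim pages that were just moved off the active queue.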
