Date: Thu, 30 Nov 2017 13:09:19 -0800 From: Larry McVoy <lm@mcvoy.com> To: Mark Johnston <markj@FreeBSD.org> Cc: Larry McVoy <lm@mcvoy.com>, Warner Losh <imp@bsdimp.com>, Scott Long <scottl@netflix.com>, Kevin Bowling <kbowling@llnw.com>, Drew Gallatin <gallatin@netflix.com>, "freebsd-arch@freebsd.org" <freebsd-arch@freebsd.org> Subject: Re: small patch for pageout. Comments? Message-ID: <20171130210919.GB3924@mcvoy.com> In-Reply-To: <20171130210131.GC21606@raichu> References: <20171130173424.GA811@mcvoy.com> <CANCZdfqL9ZsKTfFi%2BvsCTh3yaNjtwaYYY3fvivdbNybBnujawg@mail.gmail.com> <20171130184923.GA30262@mcvoy.com> <20171130204750.GB21606@raichu> <20171130205041.GA3924@mcvoy.com> <20171130210131.GC21606@raichu>
next in thread | previous in thread | raw e-mail | index | archive | help
On Thu, Nov 30, 2017 at 04:01:31PM -0500, Mark Johnston wrote:
> On Thu, Nov 30, 2017 at 12:50:41PM -0800, Larry McVoy wrote:
> > On Thu, Nov 30, 2017 at 03:47:50PM -0500, Mark Johnston wrote:
> > > > I dunno if there is a "right amount". I could make it a little smarter by
> > > > keeping track of how many pages we freed and sleep if we freed none in a
> > > > scan (which seems really unlikely).
> > >
> > > This situation can happen if the inactive queue is full of dirty pages.
> > > A problem with your patch is that we might not give enough time to the
> > > laundry thread (the thread responsible for writing the contents of dirty
> > > pages to disk and returning them to inactive queue for the page daemon
> > > to free) to write out dirty pages. In this case we might trigger the OOM
> > > killer prematurely, and in fact this scenario is what motivated r300865.
> > > So I would argue that we do in fact need to sleep if the page daemon is
> > > failing to make progress, in order to give time for I/O to complete.
> >
> > OK, that sounds reasonable. So what defines progress? v_dfree not
> > increasing? Is one page freed progress?
>
> One page freed is progress. We currently invoke the OOM killer only when
> the page daemon is making no progress. This turns out to be too
> conservative, which is what kib's patch attempts to address. wrt
> your patch, I'm saying that I think we should still sleep after a scan
> that failed to free any pages.
Something like this?
--- a/sys/vm/vm_pageout.c
+++ b/sys/vm/vm_pageout.c
@@ -1752,6 +1752,7 @@ vm_pageout_worker(void *arg)
struct vm_domain *domain;
int domidx, pass;
bool target_met;
+ u_int dfree;
domidx = (uintptr_t)arg;
domain = &vm_dom[domidx];
@@ -1776,6 +1777,7 @@ vm_pageout_worker(void *arg)
*/
while (TRUE) {
mtx_lock(&vm_page_queue_free_mtx);
+ dfree = VM_CNT_FETCH(v_dfree);
/*
* Generally, after a level >= 1 scan, if there are enough
@@ -1815,10 +1817,22 @@ vm_pageout_worker(void *arg)
* (page reclamation) scan, then increase the level
* and scan again now. Otherwise, sleep a bit and
* try again later.
+ *
+ * If we have more than one CPU this pause is not
+ * helpful, it just decreases the rate at which we
+ * clean pages. On a uniprocessor we want to pause
+ * to let the user level processes get some time to
+ * run. We also don't keep banging on the page tables
+ * if we didn't manage to free any in the last pass.
*/
mtx_unlock(&vm_page_queue_free_mtx);
- if (pass >= 1)
- pause("psleep", hz / VM_INACT_SCAN_RATE);
+ if (pass >= 1) {
+ dfree = VM_CNT_FETCH(v_dfree) - dfree;
+ if ((dfree == 0) || (mp_ncpus < 2)) {
+if (!dfree) printf("Sleeping because pass %d didn't find anything\n", pass);
+ pause("psleep", hz / VM_INACT_SCAN_RATE);
+ }
+ }
pass++;
} else {
/*
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20171130210919.GB3924>
