Date:      Thu, 30 Nov 2017 10:49:23 -0800
From:      Larry McVoy <lm@mcvoy.com>
To:        Warner Losh <imp@bsdimp.com>
Cc:        Larry McVoy <lm@mcvoy.com>, "freebsd-arch@freebsd.org" <freebsd-arch@freebsd.org>, Scott Long <scottl@netflix.com>, Kevin Bowling <kbowling@llnw.com>, Drew Gallatin <gallatin@netflix.com>
Subject:   Re: small patch for pageout. Comments?
Message-ID:  <20171130184923.GA30262@mcvoy.com>
In-Reply-To: <CANCZdfqL9ZsKTfFi%2BvsCTh3yaNjtwaYYY3fvivdbNybBnujawg@mail.gmail.com>
References:  <20171130173424.GA811@mcvoy.com> <CANCZdfqL9ZsKTfFi%2BvsCTh3yaNjtwaYYY3fvivdbNybBnujawg@mail.gmail.com>

On Thu, Nov 30, 2017 at 11:37:35AM -0700, Warner Losh wrote:
> On Thu, Nov 30, 2017 at 10:34 AM, Larry McVoy <lm@mcvoy.com> wrote:
> 
> > In a recent numa meeting that Scott called, Jeff suggested a small
> > patch to the pageout daemon (included below).
> >
> > It's rather dramatic the difference it makes for me.  If I arrange to
> > thrash the crap out of memory, without this patch the kernel is so
> > borked with all the processes in disk wait that I can't kill them,
> > I can't reboot, my only option is to power off.
> >
> > With the patch there is still some borkage, the kernel is randomly
> > killing processes because of out of mem, it should kill one of my
> > processes that is causing the problem but it doesn't, it killed
> > random stuff like dhclient, getty (logged me out), etc.
> >
> > But the system is responsive.
> >
> > What the patch does is say "if we have more than one core, don't sleep
> > in pageout, just keep running until we freed enough mem".
> >
> > Comments?
> >
> 
> Just to confirm why this patch works.
> 
> For UP systems, we have to pause here to allow work to complete, otherwise
> we can't switch to their threads to complete the I/Os. For MP, however, we
> can continue to schedule more work because that work can be completed on
> other CPUs. This parallelism greatly increases the pageout rate, allowing
> the system to keep up better when some ass-hat process (or processes) is
> thrashing memory.

Yep.

> I'm pretty sure the UP case was also designed to not flood the lower layers
> with work, starving other consumers. Does this result in undue flooding, and
> would we get better results if we could schedule up to the right amount of
> work rather than flooding in the MP case?

I dunno if there is a "right amount".  I could make it a little smarter by
keeping track of how many pages we freed and sleeping if we freed none in a
scan (which seems really unlikely).
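
To make that concrete, here is a rough sketch of the decision, not the
actual pageout code.  The function name pageout_should_sleep and both
parameters are made up for illustration; the real daemon's variables and
control flow differ.  It only shows the rule: on UP always sleep so the
I/O can complete, on MP keep going unless the last scan freed nothing.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper, NOT the real FreeBSD pageout code.
 * Decide whether the pageout loop should sleep after one scan.
 *   ncpus       - number of CPUs in the system (assumed >= 1)
 *   pages_freed - pages reclaimed by the scan that just finished
 */
static bool
pageout_should_sleep(int ncpus, long pages_freed)
{
	assert(ncpus >= 1);
	assert(pages_freed >= 0);

	/* UP: must yield so the threads completing the I/O can run. */
	if (ncpus == 1)
		return (true);

	/* MP: keep scanning unless the last scan made no progress. */
	return (pages_freed == 0);
}
```

So a 4-core box that freed pages on the last pass keeps scanning, and
the same box backs off only when a full scan frees nothing.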

All I know for sure is that without this you can lock up the system to
the point it takes a power cycle to unwedge it.  With this the system
is responsive.

Rather than worrying about the smartness, I'd argue this is an improvement,
ship it, and then I can go look at how the system decides to kill processes
(because that's currently busted).
