Date:      Sat, 31 Mar 2018 18:54:32 -0400
From:      Mark Johnston <markj@FreeBSD.org>
To:        Tijl Coosemans <tijl@FreeBSD.org>
Cc:        src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject:   Re: svn commit: r331732 - head/sys/vm
Message-ID:  <20180331225432.GB1440@raichu>
In-Reply-To: <20180331202118.5401ed2a@kalimero.tijl.coosemans.org>
References:  <201803291427.w2TEReA3024929@repo.freebsd.org> <20180331202118.5401ed2a@kalimero.tijl.coosemans.org>

On Sat, Mar 31, 2018 at 08:21:18PM +0200, Tijl Coosemans wrote:
> On Thu, 29 Mar 2018 14:27:40 +0000 (UTC) Mark Johnston <markj@FreeBSD.org> wrote:
> > Author: markj
> > Date: Thu Mar 29 14:27:40 2018
> > New Revision: 331732
> > URL: https://svnweb.freebsd.org/changeset/base/331732
> > 
> > Log:
> >   Fix the background laundering mechanism after r329882.
> >   
> >   Rather than using the number of inactive queue scans as a metric for
> >   how many clean pages are being freed by the page daemon, have the
> >   page daemon keep a running counter of the number of pages it has freed,
> >   and have the laundry thread use that when computing the background
> >   laundering threshold.
> > [...]
> 
> I'm seeing big processes being killed with an "out of swap space" message
> even though there's still plenty of swap available.  It seems to be fixed
> by making this division round upwards:
> 
> 		if (target == 0 && ndirty * isqrt((nfreed +
> 		    (vmd->vmd_free_target - vmd->vmd_free_min) - 1) /
> 		    (vmd->vmd_free_target - vmd->vmd_free_min)) >= nclean) {
> 
> I don't know where this formula comes from, so I don't know if this
> change is correct.

Hm, that's somewhat surprising. This code shouldn't be executing in
situations where the OOM kill logic is invoked (i.e., memory pressure
plus a shortage of clean pages in the inactive queue).

How much RAM does the system have? Could you collect "sysctl vm" output
around the time of an OOM kill?

I'm wondering if the higher inactive queue scan frequency after r329882
might be responsible: OOM kills are performed after vm.pageout_oom_seq
back-to-back scans fail to reclaim any pages. Does your problem persist
if you increase the value of that sysctl, say to 60?
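
The "back-to-back failed scans" rule can be sketched roughly as follows. This is an illustration of the counting idea only, not the kernel's implementation: struct oom_tracker, oom_note_scan and the field names are hypothetical, and seq_limit merely plays the role of vm.pageout_oom_seq.

```c
/* Hypothetical state for counting consecutive unsuccessful scans. */
struct oom_tracker {
	int seq_limit;		/* plays the role of vm.pageout_oom_seq */
	int failed_scans;	/* consecutive scans that reclaimed nothing */
};

/*
 * Record the result of one inactive queue scan.  Returns 1 when
 * seq_limit consecutive scans have reclaimed no pages (the point at
 * which an OOM kill would be considered), and 0 otherwise.
 */
int
oom_note_scan(struct oom_tracker *t, int pages_reclaimed)
{
	if (pages_reclaimed > 0) {
		/* Any progress resets the run of failures. */
		t->failed_scans = 0;
		return (0);
	}
	if (++t->failed_scans >= t->seq_limit) {
		t->failed_scans = 0;
		return (1);
	}
	return (0);
}
```

Under this model the limit is counted in scans, not in seconds, so scanning more often reaches the limit in less wall-clock time. Raising the limit compensates for that.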


