Date: Wed, 8 Aug 2018 08:38:00 -0700
From: bob prohaska
To: Mark Johnston
Cc: Mark Millard, freebsd-arm@freebsd.org, bob prohaska
Subject: Re: RPI3 swap experiments ["was killed: out of swap space" with: "v_free_count: 5439, v_inactive_count: 1"]
Message-ID: <20180808153800.GF26133@www.zefox.net>
In-Reply-To: <20180806155837.GA6277@raichu>
List-Id: "Porting FreeBSD to ARM processors."

On Mon, Aug 06, 2018 at 11:58:37AM -0400, Mark Johnston wrote:
> On Wed, Aug 01, 2018 at 09:27:31PM -0700, Mark Millard wrote:
> > [I have a top-posted introduction here in reply
> > to a message listed at the bottom.]
> >
> > Bob P. meet Mark J. Mark J. meet Bob P. I'm
> > hoping you can help Bob P. use a patch that
> > you once published on the lists. This was from:
> >
> > https://lists.freebsd.org/pipermail/freebsd-current/2018-June/069835.html
> >
> > Bob P. has been having problems with an rpi3
> > based buildworld ending up with "was killed:
> > out of swap space" but when the swap partitions
> > do not seem to be heavily used (seen via swapinfo
> > or watching top).
> >
> > > The patch to report OOMA information did its job, very tersely. The console reported
> > > v_free_count: 5439, v_inactive_count: 1
> > > Aug 1 18:08:25 www kernel: pid 93301 (c++), uid 0, was killed: out of swap space
> > >
> > > The entire buildworld.log and gstat output are at
> > > http://www.zefox.net/~fbsd/rpi3/swaptests/r336877M/
> > >
> > > It appears that at 18:08:21 a write to the USB swap device took 530.5 ms,
> > > next top was killed and ten seconds later c++ was killed, _after_ da0b
> > > was no longer busy.
>
> My suspicion, based on the high latency, is that this is a consequence
> of r329882, which lowered the period of time that the page daemon will
> sleep while waiting for dirty pages to be cleaned. If a certain number
> of consecutive wakeups and queue scans occur without making progress,
> the OOM killer is triggered. That number is vm.pageout_oom_seq - could
> you try increasing it by a factor of 10 and retry your test?
>
> > > This buildworld stopped quite a bit earlier than usual; most of the time
> > > the buildworld.log file is close to 20 MB at the time OOMA acts. In this case
> > > it was around 13 MB. Not clear if that's of significance.
> > >
> > > If somebody would indicate whether this result is informative, and any possible
> > > improvements to the test, I'd be most grateful.
>
> If the above suggestion doesn't help, the next thing to try would be to
> revert the oom_seq value to the default, apply this patch, and see if
> the problem continues to occur. If this doesn't help, please try
> applying both measures, i.e., set oom_seq to 120 _and_ apply the patch.
>
> diff --git a/sys/vm/vm_pagequeue.h b/sys/vm/vm_pagequeue.h
> index fb56bdf2fdfc..29a16060253f 100644
> --- a/sys/vm/vm_pagequeue.h
> +++ b/sys/vm/vm_pagequeue.h
> @@ -74,7 +74,7 @@ struct vm_pagequeue {
>  } __aligned(CACHE_LINE_SIZE);
>  
>  #ifndef VM_BATCHQUEUE_SIZE
> -#define VM_BATCHQUEUE_SIZE 7
> +#define VM_BATCHQUEUE_SIZE 1
>  #endif
>  
>  struct vm_batchqueue {

The patched kernel ran longer than the default kernel, but OOMA still
halted buildworld with buildworld.log at around 13 MB. That's considerably
farther than a default buildworld would have run, but less than observed
when setting vm.pageout_oom_seq=120 alone.

Log files are at
http://www.zefox.net/~fbsd/rpi3/swaptests/r337226M/1gbsdflash_1gbusbflash/batchqueue/

Both changes are now in place and -j4 buildworld has been restarted.

Thanks for reading!

bob prohaska
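
For anyone repeating the test, the "increase vm.pageout_oom_seq by a factor of 10" step can be sketched as below. This is only a rough sketch: the helper name oom_seq_cmd is made up for illustration, and the fallback value of 12 is an assumption inferred from the thread ("a factor of 10" giving 120). It prints the command instead of running it, so the value can be checked before anything is changed.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): print the sysctl command
# that raises vm.pageout_oom_seq by a given multiplier.
oom_seq_cmd() {
    # $1 = current value, $2 = multiplier
    echo "sysctl vm.pageout_oom_seq=$(( $1 * $2 ))"
}

# Read the live value on FreeBSD; fall back to 12 (assumed default,
# inferred from the 120 figure in the thread) if the sysctl is absent.
current=$(sysctl -n vm.pageout_oom_seq 2>/dev/null || echo 12)

# Dry run: show the command, then run it by hand as root if it looks right.
oom_seq_cmd "$current" 10

# To keep the setting across reboots, the same assignment
# (vm.pageout_oom_seq=<value>) can be placed in /etc/sysctl.conf.
```

The setting takes effect at once when applied with sysctl, so it can be tried without rebuilding the kernel; the VM_BATCHQUEUE_SIZE change, by contrast, needs a kernel rebuild.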