From owner-freebsd-arm@freebsd.org Wed Mar 25 01:56:35 2020
Date: Tue, 24 Mar 2020 18:56:33 -0700
From: bob prohaska
To: Mark Millard
Cc: freebsd-arm@freebsd.org, bob prohaska
Subject: Re: Belated out of swap kill on rpi3 at r359216
Message-ID: <20200325015633.GA93057@www.zefox.net>
References: <20200324155753.GA91922@www.zefox.net> <83E41A13-6C24-4B56-A837-779044038FBC@yahoo.com> <20200324185518.GA92311@www.zefox.net> <75CE3C07-8A0A-4D32-84C6-24BEA967447E@yahoo.com> <20200324224658.GA92726@www.zefox.net> <764D5A86-6A42-44E0-A706-F1C49BB198DA@yahoo.com>
In-Reply-To: <764D5A86-6A42-44E0-A706-F1C49BB198DA@yahoo.com>
List-Id: "Porting FreeBSD to ARM processors."

[this thread needs a haircut, new comments start near line 288]

On Tue, Mar 24, 2020 at 05:21:47PM -0700, Mark Millard wrote:
> On 2020-Mar-24, at 15:46, bob prohaska wrote:
> 
> > The logfile being discussed is at
> > http://www.zefox.net/~fbsd/rpi3/swaptests/r359216/swapscript.osupdate.log
> > for convenient reference. More replies below.
> > 
> > 
> > On Tue, Mar 24, 2020 at 01:21:42PM -0700, Mark Millard wrote:
> >> 
> >> 
> >> On 2020-Mar-24, at 11:55, bob prohaska wrote:
> >> 
> >>> On Tue, Mar 24, 2020 at 10:09:28AM -0700, Mark Millard wrote:
> >>>> 
> >>>> 
> >>>> On 2020-Mar-24, at 08:57, bob prohaska wrote:
> >>>> 
> >>>>> An attempt to buildworld on an rpi3 running r359216 stopped with an
> >>>>> OOMA kill. Sources were at 359264, loader.conf contained
> >>>>> vm.pageout_oom_seq="4096" .
> >>>> 
> >>>> What of the value of vm.pfault_oom_attempts ?
> >>>> 
> >>>> If vm.pfault_oom_attempts was not -1,
> >>>> what of the value of vm.pfault_oom_wait ?
> >>>> 
> >>> bob@www:/usr/src % sysctl vm.pfault_oom_wait
> >>> vm.pfault_oom_wait: 10
> >>> I didn't change it, this must be the default.
> >>> Would it be useful to add something like
> >>> vm.pfault_oom_wait="20"
> >>> to /boot/loader.conf?
> >> 
> >> I assume that this answer means vm.pfault_oom_attempts
> >> was not -1. But you did not list what it was.
> >> 
> > 
> > Sorry, the variable names are running together in my head.
> > 
> > bob@www:/usr/src % sysctl vm.pfault_oom_attempts
> > vm.pfault_oom_attempts: 3
> > It's now reset to -1, hopefully it'll work better than last time 8-)
> > 
> >> vm.pfault_oom_attempts=-1 is the means of preventing
> >> the vm.pfault_oom_attempts / vm.pfault_oom_wait
> >> pair from causing OOM at all. (vm.pfault_oom_wait
> >> becomes irrelevant.)
> >> 
> > 
> >> When vm.pfault_oom_attempts != -1 , then there are
> >> potential time-outs that overall involve:
> >> 
> >> vm.pfault_oom_attempts * vm.pfault_oom_wait
> >> 
> >> but with some tradeoffs in the partitioning between
> >> the 2 factors:
> >> 
> >> # sysctl -d vm.pfault_oom_attempts
> >> vm.pfault_oom_attempts: Number of page allocation attempts in page fault handler before it triggers OOM handling
> >> 
> >> # sysctl -d vm.pfault_oom_wait
> >> vm.pfault_oom_wait: Number of seconds to wait for free pages before retrying the page fault handler
> >> 
> >> As I understand, the following cases could each have
> >> very different results depending on what the latencies
> >> are like and such:
> >> 
> >> vm.pfault_oom_attempts==20 && vm.pfault_oom_wait==1
> >> vs.
> >> vm.pfault_oom_attempts==1 && vm.pfault_oom_wait==20
> >> vs.
> >> vm.pfault_oom_attempts==4 && vm.pfault_oom_wait==5
> >> 
> >> As I remember, Konstantin Belousov once asked someone
> >> that was having a repeatable problem to try some
> >> alternatives that explored this but, as I remember,
> >> he got no reply to the request.
> >> 
> > Konstantin wrote to both me and the list in a thread on
> > Re: panic: non-current pmap. If it's relevant please indicate.
> > 
> > He also wrote to Don Lewis, in a thread on Re: spurious out of swap kills
> > but that was on a
> > "16-thread Ryzen machine, with 64 GB of RAM and 40 GB of swap"
> > which is surely unlike my predicament 8-)
> > 
> >> The person might have just used: vm.pfault_oom_attempts=-1
> >> and continued with their primary activity, for all I know.
> >> 
> >> vm.pfault_oom_attempts=-1 is only recommended when one
> >> is sure that they will not run out of swap/paging space,
> >> if I understand right.
> >> 
> >> For reference, for other settings:
> >> 
> >> # sysctl -d vm.pageout_oom_seq
> >> vm.pageout_oom_seq: back-to-back calls to oom detector to start OOM
> >> 
> >> There is also:
> >> 
> >> # sysctl -d vm.oom_pf_secs
> >> vm.oom_pf_secs: 
> >> 
> >> but it seems to have an empty description. Maybe
> >> it reports vm.pfault_oom_attempts * vm.pfault_oom_wait
> >> when vm.pfault_oom_attempts != -1 . (I've not looked.)
> >> 
> >>>>> A snippet of gstat log suggests the worst congestion in the storage I/O
> >>>>> happened at Tue Mar 24 04:52:26 PDT 2020 with an L(q) of 37, but the
> >>>>> OOMA kill happened at Tue Mar 24 04:53:04 PDT 2020, by which time the
> >>>>> L(q) had dropped to one, half a minute later.
> >>>>> 
> >>>>> Is the delay in OOMA action to be expected?
> >>>>> 
> >>>>> Here's the relevant part of the log, I hope the columns display readably:
> >>>>> 
> >>>>> 0/2/2/19177 mbuf clusters in use (current/cache/total/max)
> >>>>> procs memory page disks faults cpu
> >>>>> r b w avm fre flt re pi po fr sr mm0 da0 in sy cs us sy id
> >>>>> 0 0 29 1897320 47312 12851 9 4 5 13330 1669 0 0 14172 8020 3034 65 29 6
> >>>>> dT: 1.056s w: 1.000s
> >>>>> L(q) ops/s r/s kBps ms/r w/s kBps ms/w d/s kBps ms/d %busy Name
> >>>>> 37 367 323 5463 6.4 45 1511 76.8 0 0 0.0 91.7 mmcsd0
> >>>>> 37 367 323 5463 6.5 45 1511 76.9 0 0 0.0 93.3 mmcsd0s2
> >>>>> 34 133 111 3209 7.6 22 697 134.7 0 0 0.0 74.0 mmcsd0s2a
> >>>>> 3 235 212 2254 5.9 23 814 21.5 0 0 0.0 70.0 mmcsd0s2b
> >>>>> 34 133 111 3209 7.6 22 697 134.7 0 0 0.0 74.2 ufs/rootfs
> >>>>> Tue Mar 24 04:52:26 PDT 2020
> >>>>> Device 1K-blocks Used Avail Capacity
> >>>>> /dev/mmcsd0s2b 4404252 224484 4179768 5%
> >>>>> Mar 24 04:20:50 www kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 21842, size: 12288
> >>>>> Mar 24 04:20:50 www kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 25477, size: 4096
> >>>> 
> >>>> How bad were things back when the "indefinite wait buffer" notices
> >>>> were generated (Mar 24 04:20:50 time frame)?
> >>>> 
> >>> It looks like _new_ indefinite wait messages started coming at around Mon Mar 23 23:00:28 PDT 2020:
> >>> procs memory page disks faults cpu
> >>> r b w avm fre flt re pi po fr sr mm0 da0 in sy cs us sy id
> >>> 2 0 18 1588824 197676 14108 2 1 0 14759 238 0 0 14055 9263 2668 62 32 5
> >>> dT: 1.027s w: 1.000s
> >>> L(q) ops/s r/s kBps ms/r w/s kBps ms/w d/s kBps ms/d %busy Name
> >>> 9 1 0 0 0.0 1 31 13451 0 0 0.0 326.9 mmcsd0
> >>> 9 1 0 0 0.0 1 31 13451 0 0 0.0 326.9 mmcsd0s2
> >>> 6 1 0 0 0.0 1 31 13451 0 0 0.0 326.9 mmcsd0s2a
> >>> 6 1 0 0 0.0 1 31 13451 0 0 0.0 326.9 ufs/rootfs
> >>> Mon Mar 23 23:00:28 PDT 2020
> >>> Device 1K-blocks Used Avail Capacity
> >>> /dev/mmcsd0s2b 4404252 245288 4158964 6%
> >>> Mar 23 23:00:28 www kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 280324, size: 8192
> >>> Mar 23 23:00:28 www kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 285903, size: 16384
> >>> 
> >>> Admittedly, 300% busy looks pretty bad. Still the machine didn't quit.....
> >>> On the next sample %busy went down to 18% for swap, less for all else.
> >> 
> >> As I understand, the ms/w value 13451 means that there was
> >> for a time (a mean of ?) somewhat under 14 seconds from a
> >> write being queued to it being completed.
> >> 
> >> If a read could be blocked for such time frames by such
> >> writes, it suggests that vm.pfault_oom_wait might need to
> >> be larger than 13 if vm.pfault_oom_attempts=-1 is not in
> >> use.
> >> 
> > Ok, I'm starting to get it. On this machine it's 10.
> > 
> > But the 13 second delay appeared at Mon Mar 23 23:00:28 PDT 2020.
> > The kill occurred around Tue Mar 24 04:53:08 PDT 2020
> > under what look like much more benign circumstances.
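
[An aside, mostly to keep the knobs above straight in my own head: a
minimal sh sketch of how they could be inspected and set. The values are
only the examples already mentioned in this thread, not recommendations,
and where to make them persistent is an assumption I haven't tested.]

#!/bin/sh
# Show descriptions and current values of the OOM-related knobs above.
sysctl -d vm.pfault_oom_attempts vm.pfault_oom_wait vm.pageout_oom_seq
sysctl vm.pfault_oom_attempts vm.pfault_oom_wait vm.pageout_oom_seq

# Option A: take the page-fault path out of OOM decisions entirely
# (per Mark, only sensible if swap/paging space cannot run out).
sysctl vm.pfault_oom_attempts=-1

# Option B: keep the trigger, with an overall budget of roughly
# attempts * wait seconds (here 4 * 5 = about 20 s) before a stalled
# page fault can start an OOM kill.
#sysctl vm.pfault_oom_attempts=4
#sysctl vm.pfault_oom_wait=5

# For persistence across reboots, /etc/sysctl.conf (runtime sysctls) or
# /boot/loader.conf (already used here for vm.pageout_oom_seq="4096")
# would be the obvious places to try.
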
> 
> I had intended to ask about the text next-to/just-above,
> not much earlier [thus, the "(Mar 24 04:20:50 time frame)"].
> So I was asking about:
> 
> Mar 24 04:20:50 www kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 21842, size: 12288
> Mar 24 04:20:50 www kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 25477, size: 4096
> 
> and did not originally notice that you replied with
> information for "Mon Mar 23 23:00:28 PDT 2020".
> 
> Other than showing examples of what the delays can be,
> the "Mar 24 04:20:50" is of more direct relevance for
> the OOM kill that happened.
> 
> I'll show the text around/before 04:20:50 later in this
> note. It proves somewhat interesting.
> 
> >> Or vm.pfault_oom_attempts=-1 could be used to avoid large
> >> time frames from directly leading to OOM activity.
> >> 
> > Understood. I simply forgot to restore the setting after
> > the initial troubles with it.
> > 
> >>>> 
> >>>> Below I show code changes to be more explicit in the
> >>>> output messaging about what contributes to initiating
> >>>> OOM kills, without needing boot verbose or the like.
> >>>> There are also some messages from Mark J.'s old code
> >>>> for reporting on things related to initiating OOM
> >>>> kills, or some minor variations of them.
> >>>> 
> >>> 
> >>>> You may want to try such changes to provide more
> >>>> context for your OOM kills when they happen. Below
> >>>> the 4 reasons reported are (showing the most
> >>>> detailed of the related messages from different
> >>>> stages):
> >>>> 
> >>>> swp_pager_meta_build: swap blk uma zone exhausted
> >>>> 
> >>>> swp_pager_meta_build: swap pctrie uma zone exhausted
> >>>> 
> >>>> vm_fault_allocate: proc %d (%s) failed to alloc page on fault, starting OOM
> >>>> 
> >>>> vm_pageout_mightbe_oom: kill context: v_free_count: %u, v_inactive_count: %u
> >>>> 
> >>>> 
> >>> 
> >>> Could inquiries instead be added to the logging script?
> >>> Right now it's
> >>> #!/bin/sh
> >>> while true
> >>> do vmstat ; gstat -abd -I 1s ; date ; swapinfo ; tail -n 2 /var/log/messages ; netstat -m | grep "mbuf clusters"
> >>> done
> >>> 
> >>> I'd much rather tamper with the logging script than the kernel 8-)
> >> 
> >> Unfortunately, the reason I made the kernel messaging
> >> changes is that, as far as I know, the normal kernel
> >> simply does not publish the information anywhere when
> >> the internal criteria lead to OOM activity: no
> >> inquiry is available without kernel changes.
> > 
> > If all else fails I'll try to apply your patches to the kernel
> > and recompile.
> 
> Up to you, of course.
> 
> 
> The text before the "Mar 24 04:20:50" messages, with notes
> and blank lines mixed in to group things:
> 
> . . .
> Tue Mar 24 04:20:38 PDT 2020
> Device 1K-blocks Used Avail Capacity
> /dev/mmcsd0s2b 4404252 403784 4000468 9%
> Mar 24 02:50:01 www kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 623056, size: 16384
> Mar 24 04:17:17 www kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 1033854, size: 4096
> 3/773/776/19177 mbuf clusters in use (current/cache/total/max)
> 
> procs memory page disks faults cpu
> r b w avm fre flt re pi po fr sr mm0 da0 in sy cs us sy id
> 0 0 30 2034696 54180 12942 7 2 4 13432 1374 0 0 14074 8102 2849 65 29 5
> dT: 1.031s w: 1.000s
> L(q) ops/s r/s kBps ms/r w/s kBps ms/w d/s kBps ms/d %busy Name
> 18 1 0 0 0.0 1 31 26686 0 0 0.0 287.1 mmcsd0
> 18 1 0 0 0.0 1 31 26686 0 0 0.0 287.1 mmcsd0s2
> 10 1 0 0 0.0 1 31 26686 0 0 0.0 287.1 mmcsd0s2a
> 10 1 0 0 0.0 1 31 26686 0 0 0.0 287.1 ufs/rootfs
> Tue Mar 24 04:20:40 PDT 2020
> Device 1K-blocks Used Avail Capacity
> /dev/mmcsd0s2b 4404252 403784 4000468 9%
> Mar 24 02:50:01 www kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 623056, size: 16384
> Mar 24 04:17:17 www kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 1033854, size: 4096
> 6/770/776/19177 mbuf clusters in use (current/cache/total/max)
> 
> Note that the above has 26sec+ ms/w figures. No reads.
> No mmcsd0s2b.
> 
> 
> procs memory page disks faults cpu
> r b w avm fre flt re pi po fr sr mm0 da0 in sy cs us sy id
> 0 0 30 2034696 54252 12942 7 2 4 13432 1374 0 0 14074 8102 2849 65 29 5
> dT: 1.064s w: 1.000s
> L(q) ops/s r/s kBps ms/r w/s kBps ms/w d/s kBps ms/d %busy Name
> Tue Mar 24 04:20:42 PDT 2020
> Device 1K-blocks Used Avail Capacity
> /dev/mmcsd0s2b 4404252 403784 4000468 9%
> Mar 24 02:50:01 www kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 623056, size: 16384
> Mar 24 04:17:17 www kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 1033854, size: 4096
> 9/1023/1032/19177 mbuf clusters in use (current/cache/total/max)
> 
> For the above, note the lack of any lines with ms/r or ms/w figures,
> just the title rows show.

Same thing happens when gstat -a is run on an idle machine. I thought it
normal; the CPU had all the data it needed.

> procs memory page disks faults cpu
> r b w avm fre flt re pi po fr sr mm0 da0 in sy cs us sy id
> 0 0 30 2034696 53668 12942 7 2 4 13432 1374 0 0 14074 8102 2849 65 29 5
> dT: 1.046s w: 1.000s
> L(q) ops/s r/s kBps ms/r w/s kBps ms/w d/s kBps ms/d %busy Name
> 17 1 0 0 0.0 1 31 30042 0 0 0.0 320.7 mmcsd0
> 17 1 0 0 0.0 1 31 30042 0 0 0.0 320.7 mmcsd0s2
> 9 1 0 0 0.0 1 31 30042 0 0 0.0 320.7 mmcsd0s2a
> 9 1 0 0 0.0 1 31 30042 0 0 0.0 320.7 ufs/rootfs
> Tue Mar 24 04:20:43 PDT 2020
> Device 1K-blocks Used Avail Capacity
> /dev/mmcsd0s2b 4404252 403784 4000468 9%
> Mar 24 02:50:01 www kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 623056, size: 16384
> Mar 24 04:17:17 www kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 1033854, size: 4096
> 9/1023/1032/19177 mbuf clusters in use (current/cache/total/max)
> 
> Note that the above has 30sec+ ms/w figures. No reads.
> 
> Also note that the L(q) figures dropped by only 1 over
> the about 04:20:40 to 04:20:43 interval (18->17 and
> 10->9). (I only take these deltas as suggestive. They
> might be accidental near matches.) No mmcsd0s2b.
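
[To follow those ms/w excursions without reading every block by eye, an
untested awk sketch against the existing swapscript log; it assumes the
column layout shown above, with ms/r in field 5, ms/w in field 8 and
%busy in field 12, and pairs each mmcsd0s2b row with the date line that
follows it in the log:]

awk '$13 == "mmcsd0s2b" { rlat = $5; wlat = $8; busy = $12; seen = 1 }
     /PDT 2020/ && seen { print $0, "ms/r=" rlat, "ms/w=" wlat, "%busy=" busy; seen = 0 }' \
    swapscript.osupdate.log
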
> 
> 
> procs memory page disks faults cpu
> r b w avm fre flt re pi po fr sr mm0 da0 in sy cs us sy id
> 0 0 30 2034696 53624 12942 7 2 4 13432 1374 0 0 14074 8102 2849 65 29 5
> dT: 1.003s w: 1.000s
> L(q) ops/s r/s kBps ms/r w/s kBps ms/w d/s kBps ms/d %busy Name
> Tue Mar 24 04:20:45 PDT 2020
> Device 1K-blocks Used Avail Capacity
> /dev/mmcsd0s2b 4404252 403784 4000468 9%
> Mar 24 02:50:01 www kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 623056, size: 16384
> Mar 24 04:17:17 www kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 1033854, size: 4096
> 12/1020/1032/19177 mbuf clusters in use (current/cache/total/max)
> 
> For the above, note the lack of any lines with ms/r or ms/w figures,
> just the title rows show.

It's tempting, but maybe not correct, to think the I/O caught up.

> procs memory page disks faults cpu
> r b w avm fre flt re pi po fr sr mm0 da0 in sy cs us sy id
> 0 0 30 2034696 53600 12942 7 2 4 13432 1374 0 0 14074 8102 2849 65 29 5
> dT: 1.063s w: 1.000s
> L(q) ops/s r/s kBps ms/r w/s kBps ms/w d/s kBps ms/d %busy Name
> Tue Mar 24 04:20:47 PDT 2020
> Device 1K-blocks Used Avail Capacity
> /dev/mmcsd0s2b 4404252 403784 4000468 9%
> Mar 24 02:50:01 www kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 623056, size: 16384
> Mar 24 04:17:17 www kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 1033854, size: 4096
> 12/1020/1032/19177 mbuf clusters in use (current/cache/total/max)
> 
> For the above, note the lack of any lines with ms/r or ms/w figures,
> just the title rows show.

Can't that be interpreted as gstat -a having nothing to report? Just
_why_ it might have nothing to report never crossed my mind....

> 
> procs memory page disks faults cpu
> r b w avm fre flt re pi po fr sr mm0 da0 in sy cs us sy id
> 0 0 30 2034696 53592 12942 7 2 4 13432 1374 0 0 14074 8102 2849 65 29 5
> dT: 1.006s w: 1.000s
> L(q) ops/s r/s kBps ms/r w/s kBps ms/w d/s kBps ms/d %busy Name
> Tue Mar 24 04:20:48 PDT 2020
> Device 1K-blocks Used Avail Capacity
> /dev/mmcsd0s2b 4404252 403784 4000468 9%
> Mar 24 02:50:01 www kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 623056, size: 16384
> Mar 24 04:17:17 www kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 1033854, size: 4096
> 15/1017/1032/19177 mbuf clusters in use (current/cache/total/max)
> 
> For the above, note the lack of any lines with ms/r or ms/w figures,
> just the title rows show.
> 
> 
> 
> procs memory page disks faults cpu
> r b w avm fre flt re pi po fr sr mm0 da0 in sy cs us sy id
> 0 0 30 2034696 53672 12942 7 2 4 13432 1374 0 0 14074 8102 2849 65 29 5
> dT: 1.002s w: 1.000s
> L(q) ops/s r/s kBps ms/r w/s kBps ms/w d/s kBps ms/d %busy Name
> Tue Mar 24 04:20:50 PDT 2020
> Device 1K-blocks Used Avail Capacity
> /dev/mmcsd0s2b 4404252 403784 4000468 9%
> Mar 24 02:50:01 www kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 623056, size: 16384
> Mar 24 04:17:17 www kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 1033854, size: 4096
> 15/1017/1032/19177 mbuf clusters in use (current/cache/total/max)
> 
> For the above, note the lack of any lines with ms/r or ms/w figures,
> just the title rows.
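
[A thought on the empty gstat sections above: gstat -a only shows
providers that were at least a little busy during the interval, so an
idle moment leaves nothing but the title row. An untested tweak to the
logging script, dropping -a and filtering to the providers of interest,
might make the record easier to follow; the -f pattern below is a guess
at a workable filter, not something I've run:]

#!/bin/sh
# Same logging loop as before, but without -a idle providers still print
# a row of zeroes, and -f limits output to the mmcsd0s2 slice and its
# two partitions (root filesystem and swap).
while true
do
    vmstat
    gstat -bd -I 1s -f 'mmcsd0s2'
    date
    swapinfo
    tail -n 2 /var/log/messages
    netstat -m | grep "mbuf clusters"
done
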
> 
> 
> procs memory page disks faults cpu
> r b w avm fre flt re pi po fr sr mm0 da0 in sy cs us sy id
> 2 0 30 2045028 54160 12942 7 2 4 13432 1374 0 0 14074 8102 2849 65 29 5
> dT: 1.001s w: 1.000s
> L(q) ops/s r/s kBps ms/r w/s kBps ms/w d/s kBps ms/d %busy Name
> 2 329 236 3839 13.0 93 1262 18.6 0 0 0.0 99.4 mmcsd0
> 2 329 236 3839 13.2 93 1262 18.7 0 0 0.0 101.1 mmcsd0s2
> 1 153 152 2816 12.5 1 32 366.5 0 0 0.0 95.2 mmcsd0s2a
> 1 176 84 1023 14.5 92 1230 14.9 0 0 0.0 66.8 mmcsd0s2b
> 1 153 152 2816 12.5 1 32 366.6 0 0 0.0 95.3 ufs/rootfs
> Tue Mar 24 04:20:51 PDT 2020
> Device 1K-blocks Used Avail Capacity
> /dev/mmcsd0s2b 4404252 296976 4107276 7%
> Mar 24 04:20:50 www kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 21842, size: 12288
> Mar 24 04:20:50 www kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 25477, size: 4096
> 0/1032/1032/19177 mbuf clusters in use (current/cache/total/max)
> 
> For the above, the lines with ms/r and ms/w figures are back
> but the indefinite wait buffer for "Mar 24 04:20:50" happened
> too, before the "Tue Mar 24 04:20:51 PDT 2020" above. I'd guess
> they are from before the data that the gstat output is based on.
> 
> Also there are lots of reads and some writes in the above
> gstat output. mmcsd0s2b is again showing as well.
> 
> Note: It looks like ufs/rootfs might be a label
> identifying mmcsd0s2a, so they track up to slight
> time-of-information differences. I'm guessing that
> mmcsd0s2b is just a swap/paging partition.

That's correct.

> So it looks like activity for ufs/rootfs may have
> blocked/delayed activity for paging/swapping
> (mmcsd0s2b), at least for a time, even if it is
> not directly the cause of the specific OOM
> activity.

How would one distinguish slow swap i/o from slow filesystem i/o? Would
moving swap to a USB device avoid blocking by writes to microSD? That's
not hard to try, but it puts a whole new kettle of fish on the fire.

It should be added that the original buildworld has been restarted with
vm.pfault_oom_attempts: -1 and so far, though it still generates the
"indefinite wait" warnings, the ms/w times look better. It's not done
yet, though.

How much overhead comes with OOMA?

Many thanks for your insights,

bob prohaska
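
P.S. Regarding my own question about moving swap to a USB device: a
rough, untested sketch of what that might look like is below. The da0s1b
partition name is purely hypothetical (whatever the USB drive's swap
partition really is would go there), and swapoff on a busy system can
fail if the swapped-out pages no longer fit in RAM.

# /etc/fstab entry for the hypothetical USB swap partition:
#   /dev/da0s1b   none   swap   sw   0   0
swapon /dev/da0s1b        # start using the USB swap partition
swapoff /dev/mmcsd0s2b    # then try to retire the microSD swap partition
swapinfo                  # confirm which swap devices are active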