From owner-freebsd-arm@freebsd.org Wed Mar 25 01:56:35 2020
Date: Tue, 24 Mar 2020 18:56:33 -0700
From: bob prohaska
To: Mark Millard
Cc: freebsd-arm@freebsd.org, bob prohaska
Subject: Re: Belated out of swap kill on rpi3 at r359216
Message-ID: <20200325015633.GA93057@www.zefox.net>
References: <20200324155753.GA91922@www.zefox.net> <83E41A13-6C24-4B56-A837-779044038FBC@yahoo.com> <20200324185518.GA92311@www.zefox.net> <75CE3C07-8A0A-4D32-84C6-24BEA967447E@yahoo.com> <20200324224658.GA92726@www.zefox.net> <764D5A86-6A42-44E0-A706-F1C49BB198DA@yahoo.com>
In-Reply-To: <764D5A86-6A42-44E0-A706-F1C49BB198DA@yahoo.com>
List-Id: "Porting FreeBSD to ARM processors."

[this thread needs a haircut, new comments start near line 288]

On Tue, Mar 24, 2020 at 05:21:47PM -0700, Mark Millard wrote:
> On 2020-Mar-24, at 15:46, bob prohaska wrote:
> 
> > The logfile being discussed is at
> > http://www.zefox.net/~fbsd/rpi3/swaptests/r359216/swapscript.osupdate.log
> > for convenient reference. More replies below.
> > 
> > 
> > On Tue, Mar 24, 2020 at 01:21:42PM -0700, Mark Millard wrote:
> >> 
> >> 
> >> On 2020-Mar-24, at 11:55, bob prohaska wrote:
> >> 
> >>> On Tue, Mar 24, 2020 at 10:09:28AM -0700, Mark Millard wrote:
> >>>> 
> >>>> 
> >>>> On 2020-Mar-24, at 08:57, bob prohaska wrote:
> >>>> 
> >>>>> An attempt to buildworld on an rpi3 running r359216 stopped with an
> >>>>> OOMA kill. Sources were at 359264, loader.conf contained
> >>>>> vm.pageout_oom_seq="4096" .
> >>>> 
> >>>> What of the value of vm.pfault_oom_attempts ?
> >>>> 
> >>>> If vm.pfault_oom_attempts was not -1,
> >>>> what of the value of vm.pfault_oom_wait ?
> >>>> 
> >>> bob@www:/usr/src % sysctl vm.pfault_oom_wait
> >>> vm.pfault_oom_wait: 10
> >>> I didn't change it, this must be the default.
> >>> Would it be useful to add something like
> >>> vm.pfault_oom_wait="20"
> >>> to /boot/loader.conf?
> >> 
> >> I assume that this answer means vm.pfault_oom_attempts
> >> was not -1. But you did not list what it was.
> >> 
> > 
> > Sorry, the variable names are running together in my head.
> > 
> > bob@www:/usr/src % sysctl vm.pfault_oom_attempts
> > vm.pfault_oom_attempts: 3
> > It's now reset to -1, hopefully it'll work better than last time 8-)
> > 
> >> vm.pfault_oom_attempts=-1 is the means of preventing
> >> the vm.pfault_oom_attempts / vm.pfault_oom_wait
> >> pair from causing OOM at all. (vm.pfault_oom_wait
> >> becomes irrelevant.)
> >> 
> > 
> >> When vm.pfault_oom_attempts != -1 , then there are
> >> potential time-outs that overall involve:
> >> 
> >> vm.pfault_oom_attempts * vm.pfault_oom_wait
> >> 
> >> but with some tradeoffs in the partitioning between
> >> the 2 factors:
> >> 
> >> # sysctl -d vm.pfault_oom_attempts
> >> vm.pfault_oom_attempts: Number of page allocation attempts in page fault handler before it triggers OOM handling
> >> 
> >> # sysctl -d vm.pfault_oom_wait
> >> vm.pfault_oom_wait: Number of seconds to wait for free pages before retrying the page fault handler
> >> 
> >> As I understand, the following cases could each have
> >> very different results depending on what the latencies
> >> are like and such:
> >> 
> >> vm.pfault_oom_attempts==20 && vm.pfault_oom_wait==1
> >> vs.
> >> vm.pfault_oom_attempts==1 && vm.pfault_oom_wait==20
> >> vs.
> >> vm.pfault_oom_attempts==4 && vm.pfault_oom_wait==5
> >> 
> >> As I remember, Konstantin Belousov once asked someone
> >> that was having a repeatable problem to try some
> >> alternatives that explored this but, as I remember,
> >> he got no reply to the request.
> >> 
> > Konstantin wrote to both me and the list in a thread on
> > Re: panic: non-current pmap. If it's relevant please indicate.
> > 
> > He also wrote to Don Lewis, in a thread on Re: spurious out of swap kills
> > but that was on a
> > "16-thread Ryzen machine, with 64 GB of RAM and 40 GB of swap"
> > which is surely unlike my predicament 8-)
> > 
> >> The person might have just used: vm.pfault_oom_attempts=-1
> >> and continued with their primary activity, for all I know.
> >> 
> >> vm.pfault_oom_attempts=-1 is only recommended when one
> >> is sure that they will not run out of swap/paging space,
> >> if I understand right.
> >> 
> >> For reference, for other settings:
> >> 
> >> # sysctl -d vm.pageout_oom_seq
> >> vm.pageout_oom_seq: back-to-back calls to oom detector to start OOM
> >> 
> >> There is also:
> >> 
> >> # sysctl -d vm.oom_pf_secs
> >> vm.oom_pf_secs: 
> >> 
> >> but it seems to have an empty description. Maybe
> >> it reports vm.pfault_oom_attempts * vm.pfault_oom_wait
> >> when vm.pfault_oom_attempts != -1 . (I've not looked.)
> >> 
> >>>>> A snippet of gstat log suggests the worst congestion in the storage I/O
> >>>>> happened at Tue Mar 24 04:52:26 PDT 2020 with an L(q) of 37, but the
> >>>>> OOMA kill happened at Tue Mar 24 04:53:04 PDT 2020, by which time the
> >>>>> L(q) had dropped to one, half a minute later.
> >>>>> 
> >>>>> Is the delay in OOMA action to be expected?
> >>>>> 
> >>>>> Here's the relevant part of the log, I hope the columns display readably:
> >>>>> 
> >>>>> 0/2/2/19177 mbuf clusters in use (current/cache/total/max)
> >>>>> procs memory page disks faults cpu
> >>>>> r b w avm fre flt re pi po fr sr mm0 da0 in sy cs us sy id
> >>>>> 0 0 29 1897320 47312 12851 9 4 5 13330 1669 0 0 14172 8020 3034 65 29 6
> >>>>> dT: 1.056s w: 1.000s
> >>>>> L(q) ops/s r/s kBps ms/r w/s kBps ms/w d/s kBps ms/d %busy Name
> >>>>> 37 367 323 5463 6.4 45 1511 76.8 0 0 0.0 91.7 mmcsd0
> >>>>> 37 367 323 5463 6.5 45 1511 76.9 0 0 0.0 93.3 mmcsd0s2
> >>>>> 34 133 111 3209 7.6 22 697 134.7 0 0 0.0 74.0 mmcsd0s2a
> >>>>> 3 235 212 2254 5.9 23 814 21.5 0 0 0.0 70.0 mmcsd0s2b
> >>>>> 34 133 111 3209 7.6 22 697 134.7 0 0 0.0 74.2 ufs/rootfs
> >>>>> Tue Mar 24 04:52:26 PDT 2020
> >>>>> Device 1K-blocks Used Avail Capacity
> >>>>> /dev/mmcsd0s2b 4404252 224484 4179768 5%
> >>>>> Mar 24 04:20:50 www kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 21842, size: 12288
> >>>>> Mar 24 04:20:50 www kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 25477, size: 4096
> >>>> 
> >>>> How bad were things back when the "indefinite wait buffer" notices
> >>>> were generated (Mar 24 04:20:50 time frame)?
> >>>> 
> >>> It looks like _new_ indefinite wait messages started coming at around Mon Mar 23 23:00:28 PDT 2020:
> >>> procs memory page disks faults cpu
> >>> r b w avm fre flt re pi po fr sr mm0 da0 in sy cs us sy id
> >>> 2 0 18 1588824 197676 14108 2 1 0 14759 238 0 0 14055 9263 2668 62 32 5
> >>> dT: 1.027s w: 1.000s
> >>> L(q) ops/s r/s kBps ms/r w/s kBps ms/w d/s kBps ms/d %busy Name
> >>> 9 1 0 0 0.0 1 31 13451 0 0 0.0 326.9 mmcsd0
> >>> 9 1 0 0 0.0 1 31 13451 0 0 0.0 326.9 mmcsd0s2
> >>> 6 1 0 0 0.0 1 31 13451 0 0 0.0 326.9 mmcsd0s2a
> >>> 6 1 0 0 0.0 1 31 13451 0 0 0.0 326.9 ufs/rootfs
> >>> Mon Mar 23 23:00:28 PDT 2020
> >>> Device 1K-blocks Used Avail Capacity
> >>> /dev/mmcsd0s2b 4404252 245288 4158964 6%
> >>> Mar 23 23:00:28 www kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 280324, size: 8192
> >>> Mar 23 23:00:28 www kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 285903, size: 16384
> >>> 
> >>> Admittedly, 300% busy looks pretty bad. Still the machine didn't quit.....
> >>> On the next sample %busy went down to 18% for swap, less for all else.
> >> 
> >> As I understand, the ms/w value 13451 means that there was
> >> for a time (a mean of ?) somewhat under 14 seconds from a
> >> write being queued to it being completed.
> >> 
> >> If a read could be blocked for such time frames by such
> >> writes, it suggests that vm.pfault_oom_wait might need to
> >> be larger than 13 if vm.pfault_oom_attempts=-1 is not in
> >> use.
> >> 
> > Ok, I'm starting to get it. On this machine it's 10.
> > 
> > But the 13 second delay appeared at Mon Mar 23 23:00:28 PDT 2020.
> > The kill occurred around Tue Mar 24 04:53:08 PDT 2020
> > under what look like much more benign circumstances.
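
[An aside, mostly to keep the knobs above straight in my own head: a
minimal sh sketch of how they could be inspected and set. The values are
only the examples already mentioned in this thread, not recommendations,
and where to make them persistent is an assumption I haven't tested.]

#!/bin/sh
# Show descriptions and current values of the OOM-related knobs above.
sysctl -d vm.pfault_oom_attempts vm.pfault_oom_wait vm.pageout_oom_seq
sysctl vm.pfault_oom_attempts vm.pfault_oom_wait vm.pageout_oom_seq

# Option A: take the page-fault path out of OOM decisions entirely
# (per Mark, only sensible if swap/paging space cannot run out).
sysctl vm.pfault_oom_attempts=-1

# Option B: keep the trigger, with an overall budget of roughly
# attempts * wait seconds (here 4 * 5 = about 20 s) before a stalled
# page fault can start an OOM kill.
#sysctl vm.pfault_oom_attempts=4
#sysctl vm.pfault_oom_wait=5

# For persistence across reboots, /etc/sysctl.conf (runtime sysctls) or
# /boot/loader.conf (already used here for vm.pageout_oom_seq="4096")
# would be the obvious places to try.
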
> 
> I had intended to ask about the text next-to/just-above,
> not much earlier [thus, the "(Mar 24 04:20:50 time frame)"].
> So I was asking about:
> 
> Mar 24 04:20:50 www kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 21842, size: 12288
> Mar 24 04:20:50 www kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 25477, size: 4096
> 
> and did not originally notice that you replied with
> information for "Mon Mar 23 23:00:28 PDT 2020".
> 
> Other than showing examples of what the delays can be,
> the "Mar 24 04:20:50" is of more direct relevance for
> the OOM kill that happened.
> 
> I'll show the text around/before 04:20:50 later in this
> note. It proves somewhat interesting.
> 
> >> Or vm.pfault_oom_attempts=-1 could be used to avoid large
> >> time frames from directly leading to OOM activity.
> >> 
> > Understood. I simply forgot to restore the setting after
> > the initial troubles with it.
> > 
> >>>> 
> >>>> Below I show code changes to be more explicit in the
> >>>> output messaging about what contributes to initiating
> >>>> OOM kills, without needing boot verbose or the like.
> >>>> There are also some messages from Mark J.'s old code
> >>>> for reporting on things related to initiating OOM
> >>>> kills, or some minor variations of them.
> >>>> 
> >>> 
> >>>> You may want to try such changes to provide more
> >>>> context for your OOM kills when they happen. Below
> >>>> the 4 reasons reported are (showing the most
> >>>> detailed of the related messages from different
> >>>> stages):
> >>>> 
> >>>> swp_pager_meta_build: swap blk uma zone exhausted
> >>>> 
> >>>> swp_pager_meta_build: swap pctrie uma zone exhausted
> >>>> 
> >>>> vm_fault_allocate: proc %d (%s) failed to alloc page on fault, starting OOM
> >>>> 
> >>>> vm_pageout_mightbe_oom: kill context: v_free_count: %u, v_inactive_count: %u
> >>>> 
> >>>> 
> >>> 
> >>> Could inquiries instead be added to the logging script?
> >>> Right now it's
> >>> #!/bin/sh
> >>> while true
> >>> do vmstat ; gstat -abd -I 1s ; date ; swapinfo ; tail -n 2 /var/log/messages ; netstat -m | grep "mbuf clusters"
> >>> done
> >>> 
> >>> I'd much rather tamper with the logging script than the kernel 8-)
> >> 
> >> Unfortunately, the reason I made the kernel messaging
> >> changes is that, as far as I know, the normal kernel
> >> simply does not publish the information anywhere when
> >> the internal criteria lead to OOM activity: no
> >> inquiry is available without kernel changes.
> > 
> > If all else fails I'll try to apply your patches to the kernel
> > and recompile.
> 
> Up to you, of course.
> 
> 
> The text before the "Mar 24 04:20:50" messages, with notes
> and blank lines mixed in to group things:
> 
> . . .
> Tue Mar 24 04:20:38 PDT 2020
> Device 1K-blocks Used Avail Capacity
> /dev/mmcsd0s2b 4404252 403784 4000468 9%
> Mar 24 02:50:01 www kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 623056, size: 16384
> Mar 24 04:17:17 www kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 1033854, size: 4096
> 3/773/776/19177 mbuf clusters in use (current/cache/total/max)
> 
> procs memory page disks faults cpu
> r b w avm fre flt re pi po fr sr mm0 da0 in sy cs us sy id
> 0 0 30 2034696 54180 12942 7 2 4 13432 1374 0 0 14074 8102 2849 65 29 5
> dT: 1.031s w: 1.000s
> L(q) ops/s r/s kBps ms/r w/s kBps ms/w d/s kBps ms/d %busy Name
> 18 1 0 0 0.0 1 31 26686 0 0 0.0 287.1 mmcsd0
> 18 1 0 0 0.0 1 31 26686 0 0 0.0 287.1 mmcsd0s2
> 10 1 0 0 0.0 1 31 26686 0 0 0.0 287.1 mmcsd0s2a
> 10 1 0 0 0.0 1 31 26686 0 0 0.0 287.1 ufs/rootfs
> Tue Mar 24 04:20:40 PDT 2020
> Device 1K-blocks Used Avail Capacity
> /dev/mmcsd0s2b 4404252 403784 4000468 9%
> Mar 24 02:50:01 www kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 623056, size: 16384
> Mar 24 04:17:17 www kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 1033854, size: 4096
> 6/770/776/19177 mbuf clusters in use (current/cache/total/max)
> 
> Note that the above has 26sec+ ms/w figures. No reads.
> No mmcsd0s2b.
> 
> 
> procs memory page disks faults cpu
> r b w avm fre flt re pi po fr sr mm0 da0 in sy cs us sy id
> 0 0 30 2034696 54252 12942 7 2 4 13432 1374 0 0 14074 8102 2849 65 29 5
> dT: 1.064s w: 1.000s
> L(q) ops/s r/s kBps ms/r w/s kBps ms/w d/s kBps ms/d %busy Name
> Tue Mar 24 04:20:42 PDT 2020
> Device 1K-blocks Used Avail Capacity
> /dev/mmcsd0s2b 4404252 403784 4000468 9%
> Mar 24 02:50:01 www kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 623056, size: 16384
> Mar 24 04:17:17 www kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 1033854, size: 4096
> 9/1023/1032/19177 mbuf clusters in use (current/cache/total/max)
> 
> For the above, note the lack of any lines with ms/r or ms/w figures,
> just the title rows show.

Same thing happens when gstat -a is run on an idle machine. I thought it
normal; the CPU had all the data it needed.

> procs memory page disks faults cpu
> r b w avm fre flt re pi po fr sr mm0 da0 in sy cs us sy id
> 0 0 30 2034696 53668 12942 7 2 4 13432 1374 0 0 14074 8102 2849 65 29 5
> dT: 1.046s w: 1.000s
> L(q) ops/s r/s kBps ms/r w/s kBps ms/w d/s kBps ms/d %busy Name
> 17 1 0 0 0.0 1 31 30042 0 0 0.0 320.7 mmcsd0
> 17 1 0 0 0.0 1 31 30042 0 0 0.0 320.7 mmcsd0s2
> 9 1 0 0 0.0 1 31 30042 0 0 0.0 320.7 mmcsd0s2a
> 9 1 0 0 0.0 1 31 30042 0 0 0.0 320.7 ufs/rootfs
> Tue Mar 24 04:20:43 PDT 2020
> Device 1K-blocks Used Avail Capacity
> /dev/mmcsd0s2b 4404252 403784 4000468 9%
> Mar 24 02:50:01 www kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 623056, size: 16384
> Mar 24 04:17:17 www kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 1033854, size: 4096
> 9/1023/1032/19177 mbuf clusters in use (current/cache/total/max)
> 
> Note that the above has 30sec+ ms/w figures. No reads.
> 
> Also note that the L(q) figures dropped by only 1 over
> the about 04:20:40 to 04:20:43 interval (18->17 and
> 10->9). (I only take these deltas as suggestive. They
> might be accidental near matches.) No mmcsd0s2b.
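
[To follow those ms/w excursions without reading every block by eye, an
untested awk sketch against the existing swapscript log; it assumes the
column layout shown above, with ms/r in field 5, ms/w in field 8 and
%busy in field 12, and pairs each mmcsd0s2b row with the date line that
follows it in the log:]

awk '$13 == "mmcsd0s2b" { rlat = $5; wlat = $8; busy = $12; seen = 1 }
     /PDT 2020/ && seen { print $0, "ms/r=" rlat, "ms/w=" wlat, "%busy=" busy; seen = 0 }' \
    swapscript.osupdate.log
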
> 
> 
> procs memory page disks faults cpu
> r b w avm fre flt re pi po fr sr mm0 da0 in sy cs us sy id
> 0 0 30 2034696 53624 12942 7 2 4 13432 1374 0 0 14074 8102 2849 65 29 5
> dT: 1.003s w: 1.000s
> L(q) ops/s r/s kBps ms/r w/s kBps ms/w d/s kBps ms/d %busy Name
> Tue Mar 24 04:20:45 PDT 2020
> Device 1K-blocks Used Avail Capacity
> /dev/mmcsd0s2b 4404252 403784 4000468 9%
> Mar 24 02:50:01 www kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 623056, size: 16384
> Mar 24 04:17:17 www kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 1033854, size: 4096
> 12/1020/1032/19177 mbuf clusters in use (current/cache/total/max)
> 
> For the above, note the lack of any lines with ms/r or ms/w figures,
> just the title rows show.

It's tempting, but maybe not correct, to think the I/O caught up.

> procs memory page disks faults cpu
> r b w avm fre flt re pi po fr sr mm0 da0 in sy cs us sy id
> 0 0 30 2034696 53600 12942 7 2 4 13432 1374 0 0 14074 8102 2849 65 29 5
> dT: 1.063s w: 1.000s
> L(q) ops/s r/s kBps ms/r w/s kBps ms/w d/s kBps ms/d %busy Name
> Tue Mar 24 04:20:47 PDT 2020
> Device 1K-blocks Used Avail Capacity
> /dev/mmcsd0s2b 4404252 403784 4000468 9%
> Mar 24 02:50:01 www kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 623056, size: 16384
> Mar 24 04:17:17 www kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 1033854, size: 4096
> 12/1020/1032/19177 mbuf clusters in use (current/cache/total/max)
> 
> For the above, note the lack of any lines with ms/r or ms/w figures,
> just the title rows show.

Can't that be interpreted as gstat -a having nothing to report? Just
_why_ it might have nothing to report never crossed my mind....

> 
> procs memory page disks faults cpu
> r b w avm fre flt re pi po fr sr mm0 da0 in sy cs us sy id
> 0 0 30 2034696 53592 12942 7 2 4 13432 1374 0 0 14074 8102 2849 65 29 5
> dT: 1.006s w: 1.000s
> L(q) ops/s r/s kBps ms/r w/s kBps ms/w d/s kBps ms/d %busy Name
> Tue Mar 24 04:20:48 PDT 2020
> Device 1K-blocks Used Avail Capacity
> /dev/mmcsd0s2b 4404252 403784 4000468 9%
> Mar 24 02:50:01 www kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 623056, size: 16384
> Mar 24 04:17:17 www kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 1033854, size: 4096
> 15/1017/1032/19177 mbuf clusters in use (current/cache/total/max)
> 
> For the above, note the lack of any lines with ms/r or ms/w figures,
> just the title rows show.
> 
> 
> 
> procs memory page disks faults cpu
> r b w avm fre flt re pi po fr sr mm0 da0 in sy cs us sy id
> 0 0 30 2034696 53672 12942 7 2 4 13432 1374 0 0 14074 8102 2849 65 29 5
> dT: 1.002s w: 1.000s
> L(q) ops/s r/s kBps ms/r w/s kBps ms/w d/s kBps ms/d %busy Name
> Tue Mar 24 04:20:50 PDT 2020
> Device 1K-blocks Used Avail Capacity
> /dev/mmcsd0s2b 4404252 403784 4000468 9%
> Mar 24 02:50:01 www kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 623056, size: 16384
> Mar 24 04:17:17 www kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 1033854, size: 4096
> 15/1017/1032/19177 mbuf clusters in use (current/cache/total/max)
> 
> For the above, note the lack of any lines with ms/r or ms/w figures,
> just the title rows.
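
[A thought on the empty gstat sections above: gstat -a only shows
providers that were at least a little busy during the interval, so an
idle moment leaves nothing but the title row. An untested tweak to the
logging script, dropping -a and filtering to the providers of interest,
might make the record easier to follow; the -f pattern below is a guess
at a workable filter, not something I've run:]

#!/bin/sh
# Same logging loop as before, but without -a idle providers still print
# a row of zeroes, and -f limits output to the mmcsd0s2 slice and its
# two partitions (root filesystem and swap).
while true
do
    vmstat
    gstat -bd -I 1s -f 'mmcsd0s2'
    date
    swapinfo
    tail -n 2 /var/log/messages
    netstat -m | grep "mbuf clusters"
done
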
> 
> 
> procs memory page disks faults cpu
> r b w avm fre flt re pi po fr sr mm0 da0 in sy cs us sy id
> 2 0 30 2045028 54160 12942 7 2 4 13432 1374 0 0 14074 8102 2849 65 29 5
> dT: 1.001s w: 1.000s
> L(q) ops/s r/s kBps ms/r w/s kBps ms/w d/s kBps ms/d %busy Name
> 2 329 236 3839 13.0 93 1262 18.6 0 0 0.0 99.4 mmcsd0
> 2 329 236 3839 13.2 93 1262 18.7 0 0 0.0 101.1 mmcsd0s2
> 1 153 152 2816 12.5 1 32 366.5 0 0 0.0 95.2 mmcsd0s2a
> 1 176 84 1023 14.5 92 1230 14.9 0 0 0.0 66.8 mmcsd0s2b
> 1 153 152 2816 12.5 1 32 366.6 0 0 0.0 95.3 ufs/rootfs
> Tue Mar 24 04:20:51 PDT 2020
> Device 1K-blocks Used Avail Capacity
> /dev/mmcsd0s2b 4404252 296976 4107276 7%
> Mar 24 04:20:50 www kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 21842, size: 12288
> Mar 24 04:20:50 www kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 25477, size: 4096
> 0/1032/1032/19177 mbuf clusters in use (current/cache/total/max)
> 
> For the above, the lines with ms/r and ms/w figures are back
> but the indefinite wait buffer for "Mar 24 04:20:50" happened
> too, before the "Tue Mar 24 04:20:51 PDT 2020" above. I'd guess
> they are from before the data that the gstat output is based on.
> 
> Also there are lots of reads and some writes in the above
> gstat output. mmcsd0s2b is again showing as well.
> 
> Note: It looks like ufs/rootfs might be a label
> identifying mmcsd0s2a, so they track up to slight
> time-of-information differences. I'm guessing that
> mmcsd0s2b is just a swap/paging partition.

That's correct.

> So it looks like activity for ufs/rootfs may have
> blocked/delayed activity for paging/swapping
> (mmcsd0s2b), at least for a time, even if it is
> not directly the cause of the specific OOM
> activity.

How would one distinguish slow swap i/o from slow filesystem i/o? Would
moving swap to a USB device avoid blocking by writes to microSD? That's
not hard to try, but it puts a whole new kettle of fish on the fire.

It should be added that the original buildworld has been restarted with
vm.pfault_oom_attempts: -1 and so far, though it still generates the
"indefinite wait" warnings, the ms/w times look better. It's not done
yet, though.

How much overhead comes with OOMA?

Many thanks for your insights,

bob prohaska
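
P.S. Regarding my own question about moving swap to a USB device: a
rough, untested sketch of what that might look like is below. The da0s1b
partition name is purely hypothetical (whatever the USB drive's swap
partition really is would go there), and swapoff on a busy system can
fail if the swapped-out pages no longer fit in RAM.

# /etc/fstab entry for the hypothetical USB swap partition:
#   /dev/da0s1b   none   swap   sw   0   0
swapon /dev/da0s1b        # start using the USB swap partition
swapoff /dev/mmcsd0s2b    # then try to retire the microSD swap partition
swapinfo                  # confirm which swap devices are active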