Date: Tue, 24 Mar 2020 10:09:28 -0700
From: Mark Millard <marklmi@yahoo.com>
To: bob prohaska <fbsd@www.zefox.net>
Cc: freebsd-arm@freebsd.org
Subject: Re: Belated out of swap kill on rpi3 at r359216
Message-ID: <83E41A13-6C24-4B56-A837-779044038FBC@yahoo.com>
In-Reply-To: <20200324155753.GA91922@www.zefox.net>
References: <20200324155753.GA91922@www.zefox.net>
On 2020-Mar-24, at 08:57, bob prohaska <fbsd at www.zefox.net> wrote:

> An attempt to buildworld on an rpi3 running r359216 stopped with an
> OOMA kill. Sources were at 359264, loader.conf contained
> vm.pageout_oom_seq="4096".

What of the value of vm.pfault_oom_attempts ?

If vm.pfault_oom_attempts was not -1, what of
the value of vm.pfault_oom_wait ?

> A snippet of gstat log suggests the worst congestion in the storage I/O
> happened at Tue Mar 24 04:52:26 PDT 2020 with an L(q) of 37, but the
> OOMA kill happened at Tue Mar 24 04:53:04 PDT 2020, by which time the
> L(q) had dropped to one, half a minute later.
>
> Is the delay in OOMA action to be expected?
>
> Here's the relevant part of the log, I hope the columns display readably:
>
> 0/2/2/19177 mbuf clusters in use (current/cache/total/max)
> procs    memory        page                      disks     faults       cpu
> r b w      avm    fre   flt  re  pi  po    fr   sr mm0 da0    in    sy   cs us sy id
> 0 0 29 1897320  47312 12851   9   4   5 13330 1669   0   0 14172  8020 3034 65 29  6
> dT: 1.056s  w: 1.000s
>  L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s   kBps   ms/d   %busy Name
>    37    367    323   5463    6.4     45   1511   76.8      0      0    0.0    91.7 mmcsd0
>    37    367    323   5463    6.5     45   1511   76.9      0      0    0.0    93.3 mmcsd0s2
>    34    133    111   3209    7.6     22    697  134.7      0      0    0.0    74.0 mmcsd0s2a
>     3    235    212   2254    5.9     23    814   21.5      0      0    0.0    70.0 mmcsd0s2b
>    34    133    111   3209    7.6     22    697  134.7      0      0    0.0    74.2 ufs/rootfs
> Tue Mar 24 04:52:26 PDT 2020
> Device          1K-blocks     Used    Avail Capacity
> /dev/mmcsd0s2b    4404252   224484  4179768     5%
> Mar 24 04:20:50 www kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 21842, size: 12288
> Mar 24 04:20:50 www kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 25477, size: 4096

How bad were things back when the "indefinite wait buffer" notices
were generated (Mar 24 04:20:50 time frame)?
> 0/0/0/19177 mbuf clusters in use (current/cache/total/max)
> procs    memory        page                      disks     faults       cpu
> r b w      avm    fre   flt  re  pi  po    fr   sr mm0 da0    in    sy   cs us sy id
> 3 0 29 1936132  19232 12849   9   4   5 13328 1671   0   0 14173  8020 3036 65 29  6
> dT: 1.010s  w: 1.000s
>  L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s   kBps   ms/d   %busy Name
>     9    323    148   2531   20.7    175   3327   14.6      0      0    0.0    97.2 mmcsd0
>     9    323    148   2531   20.8    175   3327   14.6      0      0    0.0    98.4 mmcsd0s2
>     2     96     96   1869   16.9      0      0    0.0      0      0    0.0    82.9 mmcsd0s2a
>     7    227     51    661   28.3    175   3327   14.6      0      0    0.0    84.0 mmcsd0s2b
>     2     96     96   1869   16.9      0      0    0.0      0      0    0.0    83.0 ufs/rootfs
> Tue Mar 24 04:52:41 PDT 2020
> Device          1K-blocks     Used    Avail Capacity
> /dev/mmcsd0s2b    4404252   272928  4131324     6%
> Mar 24 04:20:50 www kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 21842, size: 12288
> Mar 24 04:20:50 www kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 25477, size: 4096
> 0/0/0/19177 mbuf clusters in use (current/cache/total/max)
> procs    memory        page                      disks     faults       cpu
> r b w      avm    fre   flt  re  pi  po    fr   sr mm0 da0    in    sy   cs us sy id
> 0 0 29 1939940  48016 12849   9   4   5 13328 1676 948   0 14174  8019 3037 65 29  6
> dT: 1.001s  w: 1.000s
>  L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s   kBps   ms/d   %busy Name
>     3    371    151   1989   91.0    221   3847   14.0      0      0    0.0   100.1 mmcsd0
>     3    371    151   1989   91.1    221   3847   14.0      0      0    0.0   100.1 mmcsd0s2
>     1     41     41    415   74.5      0      0    0.0      0      0    0.0   304.8 mmcsd0s2a
>     2    331    110   1574   97.3    221   3847   14.0      0      0    0.0    99.3 mmcsd0s2b
>     1     41     41    415   74.5      0      0    0.0      0      0    0.0   304.9 ufs/rootfs
> Tue Mar 24 04:52:45 PDT 2020
> Device          1K-blocks     Used    Avail Capacity
> /dev/mmcsd0s2b    4404252   282480  4121772     6%
> Mar 24 04:20:50 www kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 21842, size: 12288
> Mar 24 04:20:50 www kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 25477, size: 4096
> 0/2/2/19177 mbuf clusters in use (current/cache/total/max)
> procs    memory        page                      disks     faults       cpu
> r b w      avm    fre   flt  re  pi  po    fr   sr mm0 da0    in    sy   cs us sy id
> 2 0 29 1944652  42768 12849   9   4   5 13328 1678   0   0 14174  8019 3037 65 29  6
> dT: 1.010s  w: 1.000s
>  L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s   kBps   ms/d   %busy Name
>     2    498    498   6345    3.5      0      0    0.0      0      0    0.0    88.8 mmcsd0
>     2    498    498   6345    3.9      0      0    0.0      0      0    0.0    98.3 mmcsd0s2
>     1    152    152   1501    4.1      0      0    0.0      0      0    0.0    62.1 mmcsd0s2a
>     1    346    346   4844    3.9      0      0    0.0      0      0    0.0    88.8 mmcsd0s2b
>     1    152    152   1501    4.2      0      0    0.0      0      0    0.0    62.5 ufs/rootfs
> Tue Mar 24 04:52:47 PDT 2020
> Device          1K-blocks     Used    Avail Capacity
> /dev/mmcsd0s2b    4404252   282396  4121856     6%
> Mar 24 04:20:50 www kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 21842, size: 12288
> Mar 24 04:20:50 www kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 25477, size: 4096
> 0/0/0/19177 mbuf clusters in use (current/cache/total/max)
> procs    memory        page                      disks     faults       cpu
> r b w      avm    fre   flt  re  pi  po    fr   sr mm0 da0    in    sy   cs us sy id
> 5 0 28 1980200  19676 12848   9   4   5 13327 1684 714   0 14175  8019 3039 65 29  6
> dT: 1.002s  w: 1.000s
>  L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s   kBps   ms/d   %busy Name
>     9    487    174   1449   26.2    313   2427    9.1      0      0    0.0    98.4 mmcsd0
>     9    487    174   1449   26.4    313   2427    9.1      0      0    0.0   100.0 mmcsd0s2
>     1     45     45    267   22.9      0      0    0.0      0      0    0.0    80.3 mmcsd0s2a
>     9    442    129   1182   27.7    313   2427    9.1      0      0    0.0    97.2 mmcsd0s2b
>     1     45     45    267   22.9      0      0    0.0      0      0    0.0    80.4 ufs/rootfs
> Tue Mar 24 04:53:01 PDT 2020
> Device          1K-blocks     Used    Avail Capacity
> /dev/mmcsd0s2b    4404252   309996  4094256     7%
> Mar 24 04:20:50 www kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 21842, size: 12288
> Mar 24 04:20:50 www kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 25477, size: 4096
> 0/2/2/19177 mbuf clusters in use (current/cache/total/max)
> procs    memory        page                      disks     faults       cpu
> r b w      avm    fre   flt  re  pi  po    fr   sr mm0 da0    in    sy   cs us sy id
> 0 0 29 1983636  44400 12848   9   4   5 13326 1689   0   0 14175  8018 3040 65 29  6
> dT: 1.002s  w: 1.000s
>  L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s   kBps   ms/d   %busy Name
>    12    528      0      0    0.0    528   3797    7.5      0      0    0.0   100.0 mmcsd0
>    12    528      0      0    0.0    528   3797    7.5      0      0    0.0   100.0 mmcsd0s2
>    11    528      0      0    0.0    528   3797    7.5      0      0    0.0   100.0 mmcsd0s2b
> Tue Mar 24 04:53:05 PDT 2020
> Device          1K-blocks     Used    Avail Capacity
> /dev/mmcsd0s2b    4404252   322360  4081892     7%
> Mar 24 04:20:50 www kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 21842, size: 12288
> Mar 24 04:20:50 www kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 25477, size: 4096
> 3/769/772/19177 mbuf clusters in use (current/cache/total/max)
> procs    memory        page                      disks     faults       cpu
> r b w      avm    fre   flt  re  pi  po    fr   sr mm0 da0    in    sy   cs us sy id
> 1 0 29 1826272 165204 12848   9   4   5 13326 1689 971   0 14176  8018 3040 65 29  6
> dT: 1.001s  w: 1.000s
>  L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s   kBps   ms/d   %busy Name
>     1    614    604   4661    4.9     10    168   40.6      0      0    0.0    93.8 mmcsd0
>     1    614    604   4661    4.9     10    168   40.6      0      0    0.0    94.5 mmcsd0s2
>     0    210    209   1243    4.1      1     32  181.0      0      0    0.0    83.1 mmcsd0s2a
>     1    404    395   3418    5.4      9    136   25.1      0      0    0.0    81.0 mmcsd0s2b
>     0    210    209   1243    4.1      1     32  181.0      0      0    0.0    83.6 ufs/rootfs
> Tue Mar 24 04:53:08 PDT 2020
> Device          1K-blocks     Used    Avail Capacity
> /dev/mmcsd0s2b    4404252   230064  4174188     5%
> Mar 24 04:20:50 www kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 25477, size: 4096
> Mar 24 04:53:04 www kernel: pid 96708 (c++), jid 0, uid 0, was killed: out of swap space
> 0/772/772/19177 mbuf clusters in use (current/cache/total/max)
> procs    memory        page                      disks     faults       cpu
> r b w      avm    fre   flt  re  pi  po    fr   sr mm0 da0    in    sy   cs us sy id
> 2 0 28 1705884 221144 12848   9   4   5 13326 1689 1109  0 14176  8018 3041 65 29  6
> dT: 1.002s  w: 1.000s
>  L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s   kBps   ms/d   %busy Name
>     0    545    545   3580    2.1      0      0    0.0      0      0    0.0    74.0 mmcsd0
>     0    545    545   3580    2.1      0      0    0.0      0      0    0.0    75.7 mmcsd0s2
>     0    288    288   1517    2.0      0      0    0.0      0      0    0.0    56.4 mmcsd0s2a
>     0    256    256   2064    2.4      0      0    0.0      0      0    0.0    49.0 mmcsd0s2b
>     0    288    288   1517    2.0      0      0    0.0      0      0    0.0    57.1 ufs/rootfs
> Tue Mar 24 04:53:10 PDT 2020
> Device          1K-blocks     Used    Avail Capacity
> /dev/mmcsd0s2b    4404252   230064  4174188     5%
> Mar 24 04:20:50 www kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 25477, size: 4096
> Mar 24 04:53:04 www kernel: pid 96708 (c++), jid 0, uid 0, was killed: out of swap space
>

Below I show code changes that make the output messaging more explicit
about what contributes to initiating OOM kills, without needing a
verbose boot or the like. Also included are some messages from Mark J.'s
old code for reporting on things related to initiating OOM kills, or
minor variations of them. You may want to try such changes to provide
more context for your OOM kills when they happen.

The 4 reasons reported are (showing the most detailed of the related
messages from the different stages):

vm_fault_allocate: proc %d (%s) failed to alloc page on fault, starting OOM
vm_pageout_mightbe_oom: kill context: v_free_count: %u, v_inactive_count: %u
swp_pager_meta_build: swap blk uma zone exhausted
swp_pager_meta_build: swap pctrie uma zone exhausted

The detailed diff below is based on my head -r358966 context.
# svnlite diff /usr/src/sys/vm/ | more
Index: /usr/src/sys/vm/swap_pager.c
===================================================================
--- /usr/src/sys/vm/swap_pager.c	(revision 358966)
+++ /usr/src/sys/vm/swap_pager.c	(working copy)
@@ -1998,6 +1998,7 @@
 				    0, 1))
 					printf("swap blk zone exhausted, "
 					    "increase kern.maxswzone\n");
+				printf("swp_pager_meta_build: swap blk uma zone exhausted\n");
 				vm_pageout_oom(VM_OOM_SWAPZ);
 				pause("swzonxb", 10);
 			} else
@@ -2028,6 +2029,7 @@
 				    0, 1))
 					printf("swap pctrie zone exhausted, "
 					    "increase kern.maxswzone\n");
+				printf("swp_pager_meta_build: swap pctrie uma zone exhausted\n");
 				vm_pageout_oom(VM_OOM_SWAPZ);
 				pause("swzonxp", 10);
 			} else
Index: /usr/src/sys/vm/vm_fault.c
===================================================================
--- /usr/src/sys/vm/vm_fault.c	(revision 358966)
+++ /usr/src/sys/vm/vm_fault.c	(working copy)
@@ -1074,9 +1074,8 @@
 		fs->oom++;
 		vm_waitpfault(dset, vm_pfault_oom_wait * hz);
 	} else {
-		if (bootverbose)
-			printf(
-	"proc %d (%s) failed to alloc page on fault, starting OOM\n",
+		printf(
+	"vm_fault_allocate: proc %d (%s) failed to alloc page on fault, starting OOM\n",
 		    curproc->p_pid, curproc->p_comm);
 		vm_pageout_oom(VM_OOM_MEM_PF);
 		fs->oom = 0;
Index: /usr/src/sys/vm/vm_page.c
===================================================================
--- /usr/src/sys/vm/vm_page.c	(revision 358966)
+++ /usr/src/sys/vm/vm_page.c	(working copy)
@@ -3134,6 +3134,7 @@
 	 * race-free vm_wait_domain().
 	 */
 	if (curproc == pageproc) {
+		printf("thread %d waiting for memory\n", curthread->td_tid);
 		mtx_lock(&vm_domainset_lock);
 		vm_pageproc_waiters++;
 		msleep(&vm_pageproc_waiters, &vm_domainset_lock, PVM | PDROP,
Index: /usr/src/sys/vm/vm_pageout.c
===================================================================
--- /usr/src/sys/vm/vm_pageout.c	(revision 358966)
+++ /usr/src/sys/vm/vm_pageout.c	(working copy)
@@ -1741,6 +1741,8 @@
 		 * start OOM.  Initiate the selection and signaling of the
 		 * victim.
 		 */
+		printf("vm_pageout_mightbe_oom: kill context: v_free_count: %u, v_inactive_count: %u\n",
+		    vmd->vmd_free_count, vmd->vmd_pagequeues[PQ_INACTIVE].pq_cnt);
 		vm_pageout_oom(VM_OOM_MEM);
 
 		/*
@@ -1933,10 +1935,24 @@
 	}
 	sx_sunlock(&allproc_lock);
 	if (bigproc != NULL) {
+		char *reason_text;
+		switch (shortage) {
+		case VM_OOM_MEM_PF:
+			reason_text = "fault's page allocation failed";
+			break;
+		case VM_OOM_MEM:
+			reason_text = "free RAM stayed below threshold";
+			break;
+		case VM_OOM_SWAPZ:
+			reason_text = "swblk or swpctrie zone exhausted";
+			break;
+		default:
+			reason_text = "Unknown Reason";
+		}
 		if (vm_panic_on_oom != 0 && --vm_panic_on_oom == 0)
-			panic("out of swap space");
+			panic("%s", reason_text);
 		PROC_LOCK(bigproc);
-		killproc(bigproc, "out of swap space");
+		killproc(bigproc, reason_text);
 		sched_nice(bigproc, PRIO_MIN);
 		_PRELE(bigproc);
 		PROC_UNLOCK(bigproc);

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away in early
2018-Mar)