Date: Sat, 29 Jan 2022 03:59:40 -0800
From: Mark Millard <marklmi@yahoo.com>
To: bob prohaska <fbsd@www.zefox.net>
Cc: Free BSD <freebsd-arm@freebsd.org>
Subject: Re: devel/llvm13 failed to reclaim memory on 8 GB Pi4 running -current [ZFS context: similar to UFS]
Message-ID: <BFEC2EC0-3127-49A2-93FD-F059AF7842A7@yahoo.com>
In-Reply-To: <6D67BFDF-D786-4BB7-BF2D-CE4D5532D452@yahoo.com>
References: <20220127164512.GA51200@www.zefox.net> <C8BDF77F-5144-4234-A453-8DEC9EA9E227@yahoo.com> <2C7E741F-4703-4E41-93FE-72E1F16B60E2@yahoo.com> <20220127214801.GA51710@www.zefox.net> <5E861D46-128A-4E09-A3CF-736195163B17@yahoo.com> <20220127233048.GA51951@www.zefox.net> <6528ED25-A3C6-4277-B951-1F58ADA2D803@yahoo.com> <10B4E2F0-6219-4674-875F-A7B01CA6671C@yahoo.com> <54CD0806-3902-4B9C-AA30-5ED003DE4D41@yahoo.com> <A4FA4E8B-635B-454E-87D1-C36A84E2C3BA@yahoo.com> <9771EB33-037E-403E-8A77-7E8E98DCF375@yahoo.com> <B12D2AB9-147E-49EF-854F-A3B999ADDECC@yahoo.com> <BA25F969-4DAC-4E5D-88EF-9475139B6B8A@yahoo.com> <6D67BFDF-D786-4BB7-BF2D-CE4D5532D452@yahoo.com>
On 2022-Jan-28, at 19:20, Mark Millard <marklmi@yahoo.com> wrote:
> On 2022-Jan-28, at 15:05, Mark Millard <marklmi@yahoo.com> wrote:
>
>> On 2022-Jan-28, at 00:31, Mark Millard <marklmi@yahoo.com> wrote:
>>
>>>> . . .
>>>
>>> UFS context:
>>>
>>> . . .; load averages: . . . MaxObs: 5.47, 4.99, 4.82
>>> . . . threads: . . ., 14 MaxObsRunning
>>> . . .
>>> Mem: . . ., 6457Mi MaxObsActive, 1263Mi MaxObsWired, 7830Mi MaxObs(Act+Wir+Lndry)
>>> Swap: 8192Mi Total, 8192Mi Used, K Free, 100% Inuse, 8192Mi MaxObsUsed, 14758Mi MaxObs(Act+Lndry+SwapUsed), 16017Mi MaxObs(Act+Wir+Lndry+SwapUsed)
>>>
>>>
>>> Console:
>>>
>>> swap_pager: out of swap space
>>> swp_pager_getswapspace(4): failed
>>> swp_pager_getswapspace(1): failed
>>> swp_pager_getswapspace(1): failed
>>> swp_pager_getswapspace(2): failed
>>> swp_pager_getswapspace(2): failed
>>> swp_pager_getswapspace(4): failed
>>> swp_pager_getswapspace(1): failed
>>> swp_pager_getswapspace(9): failed
>>> swp_pager_getswapspace(4): failed
>>> swp_pager_getswapspace(7): failed
>>> swp_pager_getswapspace(29): failed
>>> swp_pager_getswapspace(9): failed
>>> swp_pager_getswapspace(1): failed
>>> swp_pager_getswapspace(2): failed
>>> swp_pager_getswapspace(1): failed
>>> swp_pager_getswapspace(4): failed
>>> swp_pager_getswapspace(1): failed
>>> swp_pager_getswapspace(10): failed
>>>
>>> . . . Then some time with no messages . . .
>>>
>>> vm_pageout_mightbe_oom: kill context: v_free_count: 7740, v_inactive_count: 1
>>> Jan 27 23:01:07 CA72_UFS kernel: pid 57238 (c++), jid 3, uid 0, was killed: failed to reclaim memory
>>> swp_pager_getswapspace(2): failed
>>>
>>>
>>> Note: The "vm_pageout_mightbe_oom: kill context:"
>>> notice is one of the few parts of an old reporting
>>> patch Mark J. had supplied (long ago) that still
>>> fits in the modern code (or that I was able to keep
>>> updated enough to fit, anyway). It is another of the
>>> personal updates that I keep in my source trees,
>>> such as in /usr/main-src/ .
>>>
>>> diff --git a/sys/vm/vm_pageout.c b/sys/vm/vm_pageout.c
>>> index 36d5f3275800..f345e2d4a2d4 100644
>>> --- a/sys/vm/vm_pageout.c
>>> +++ b/sys/vm/vm_pageout.c
>>> @@ -1828,6 +1828,8 @@ vm_pageout_mightbe_oom(struct vm_domain *vmd, int page_shortage,
>>>   * start OOM. Initiate the selection and signaling of the
>>>   * victim.
>>>   */
>>> + printf("vm_pageout_mightbe_oom: kill context: v_free_count: %u, v_inactive_count: %u\n",
>>> +     vmd->vmd_free_count, vmd->vmd_pagequeues[PQ_INACTIVE].pq_cnt);
>>>   vm_pageout_oom(VM_OOM_MEM);
>>>
>>>   /*
>>>
>>>
>>> Again, I'd used vm.pfault_oom_attempts inappropriately
>>> for running out of swap (although with UFS it did do
>>> a kill fairly soon):
>>>
>>> # Delay when persistent low free RAM leads to
>>> # Out Of Memory killing of processes:
>>> vm.pageout_oom_seq=120
>>> #
>>> # For plenty of swap/paging space (will not
>>> # run out), avoid pageout delays leading to
>>> # Out Of Memory killing of processes:
>>> vm.pfault_oom_attempts=-1
>>> #
>>> # For possibly insufficient swap/paging space
>>> # (might run out), increase the pageout delay
>>> # that leads to Out Of Memory killing of
>>> # processes (showing defaults at the time):
>>> #vm.pfault_oom_attempts= 3
>>> #vm.pfault_oom_wait= 10
>>> # (The multiplication is the total but there
>>> # are other potential tradeoffs in the factors
>>> # multiplied, even for nearly the same total.)
>>>
>>> I'll change:
>>>
>>> vm.pfault_oom_attempts
>>> vm.pfault_oom_wait
>>>
>>> and reboot --and start the bulk somewhat before
>>> going to bed.
>>>
>>>
>>> For reference:
>>>
>>> [00:02:13] [01] [00:00:00] Building devel/llvm13 | llvm13-13.0.0_3
>>> [07:37:05] [01] [07:34:52] Finished devel/llvm13 | llvm13-13.0.0_3: Failed: build
>>>
>>>
>>> [ 65% 4728/7265] . . . flang/lib/Evaluate/fold-designator.cpp
>>> [ 65% 4729/7265] . . . flang/lib/Evaluate/fold-integer.cpp
>>> FAILED: tools/flang/lib/Evaluate/CMakeFiles/obj.FortranEvaluate.dir/fold-integer.cpp.o
>>> [ 65% 4729/7265] . . . flang/lib/Evaluate/fold-logical.cpp
>>> [ 65% 4729/7265] . . . flang/lib/Evaluate/fold-complex.cpp
>>> [ 65% 4729/7265] . . . flang/lib/Evaluate/fold-real.cpp
>>>
>>> So the flang/lib/Evaluate/fold-integer.cpp one was the one killed.
>>>
>>> Notably, the specific sources being compiled are different
>>> than in the ZFS context report. But this might be because
>>> of my killing ninja explicitly in the ZFS context, before
>>> killing the running compilers.
>>>
>>> Again, using the options to avoid building the Fortran
>>> compiler probably avoids such memory use --if you do not
>>> need the Fortran compiler.
>>
>>
>> UFS based on instead using (not vm.pfault_oom_attempts=-1):
>>
>> vm.pfault_oom_attempts= 3
>> vm.pfault_oom_wait= 10
>>
>> It reached swap-space-full:
>>
>> . . .; load averages: . . . MaxObs: 5.42, 4.98, 4.80
>> . . . threads: . . ., 11 MaxObsRunning
>> . . .
>> Mem: . . ., 6482Mi MaxObsActive, 1275Mi MaxObsWired, 7832Mi MaxObs(Act+Wir+Lndry)
>> Swap: 8192Mi Total, 8192Mi Used, K Free, 100% Inuse, 4096B In, 81920B Out, 8192Mi MaxObsUsed, 14733Mi MaxObs(Act+Lndry+SwapUsed), 16007Mi MaxObs(Act+Wir+Lndry+SwapUsed)
>>
>>
>> swap_pager: out of swap space
>> swp_pager_getswapspace(5): failed
>> swp_pager_getswapspace(25): failed
>> swp_pager_getswapspace(1): failed
>> swp_pager_getswapspace(31): failed
>> swp_pager_getswapspace(6): failed
>> swp_pager_getswapspace(1): failed
>> swp_pager_getswapspace(25): failed
>> swp_pager_getswapspace(10): failed
>> swp_pager_getswapspace(17): failed
>> swp_pager_getswapspace(27): failed
>> swp_pager_getswapspace(5): failed
>> swp_pager_getswapspace(11): failed
>> swp_pager_getswapspace(9): failed
>> swp_pager_getswapspace(29): failed
>> swp_pager_getswapspace(2): failed
>> swp_pager_getswapspace(1): failed
>> swp_pager_getswapspace(9): failed
>> swp_pager_getswapspace(20): failed
>> swp_pager_getswapspace(4): failed
>> swp_pager_getswapspace(21): failed
>> swp_pager_getswapspace(11): failed
>> swp_pager_getswapspace(2): failed
>> swp_pager_getswapspace(21): failed
>> swp_pager_getswapspace(2): failed
>> swp_pager_getswapspace(1): failed
>> swp_pager_getswapspace(2): failed
>> swp_pager_getswapspace(3): failed
>> swp_pager_getswapspace(3): failed
>> swp_pager_getswapspace(2): failed
>> swp_pager_getswapspace(1): failed
>> swp_pager_getswapspace(20): failed
>> swp_pager_getswapspace(2): failed
>> swp_pager_getswapspace(1): failed
>> swp_pager_getswapspace(16): failed
>> swp_pager_getswapspace(6): failed
>> swap_pager: out of swap space
>> swp_pager_getswapspace(4): failed
>> swp_pager_getswapspace(9): failed
>> swp_pager_getswapspace(17): failed
>> swp_pager_getswapspace(30): failed
>> swp_pager_getswapspace(1): failed
>>
>> . . . Then some time with no messages . . .
>>
>> vm_pageout_mightbe_oom: kill context: v_free_count: 7875, v_inactive_count: 1
>> Jan 28 14:36:44 CA72_UFS kernel: pid 55178 (c++), jid 3, uid 0, was killed: failed to reclaim memory
>> swp_pager_getswapspace(11): failed
>>
>>
>> So, not all that much different from how the
>> vm.pfault_oom_attempts=-1 example looked.
>>
>>
>> [00:01:00] [01] [00:00:00] Building devel/llvm13 | llvm13-13.0.0_3
>> [07:41:39] [01] [07:40:39] Finished devel/llvm13 | llvm13-13.0.0_3: Failed: build
>>
>> Again it killed:
>>
>> FAILED: tools/flang/lib/Evaluate/CMakeFiles/obj.FortranEvaluate.dir/fold-integer.cpp.o
>>
>> So, basically the same stopping area as for the
>> vm.pfault_oom_attempts=-1 example.
>>
>>
>> I'll set things up for swap totaling to 30 GiBytes, reboot,
>> and start it again. This will hopefully let me see and
>> report MaxObs??? figures for a successful build when there
>> is RAM+SWAP: 38 GiBytes. So: more than 9 GiBytes per compiler
>> instance (mean).
>
> The analogous ZFS test with:
>
> vm.pfault_oom_attempts= 3
> vm.pfault_oom_wait= 10
>
> got:
>
> . . .; load averages: . . . MaxObs: 5.90, 5.07, 4.80
> . . . threads: . . ., 11 MaxObsRunning
> . . .
> Mem: . . ., 6006Mi MaxObsActive
> . . .
> Swap: 8192Mi Total, 8192Mi Used, 32768B Free, 99% Inuse, 28984Ki In, 4792Ki Out, 8192Mi MaxObsUsed, 14282Mi MaxObs(Act+Lndry+SwapUsed), 16009Mi MaxObs(Act+Wir+Lndry+SwapUsed)
>
> (I got that slightly early, before the 100% showed up.)
>
>
> swap_pager: out of swap space
> swp_pager_getswapspace(10): failed
> swp_pager_getswapspace(1): failed
> swp_pager_getswapspace(4): failed
> swp_pager_getswapspace(16): failed
> swp_pager_getswapspace(5): failed
> swp_pager_getswapspace(2): failed
> swp_pager_getswapspace(8): failed
> swp_pager_getswapspace(12): failed
> swp_pager_getswapspace(1): failed
> swp_pager_getswapspace(32): failed
> swp_pager_getswapspace(4): failed
> swp_pager_getswapspace(9): failed
> swp_pager_getswapspace(4): failed
> swp_pager_getswapspace(17): failed
> swp_pager_getswapspace(21): failed
> swp_pager_getswapspace(10): failed
> swp_pager_getswapspace(18): failed
> swp_pager_getswapspace(6): failed
> swp_pager_getswapspace(2): failed
> swp_pager_getswapspace(14): failed
> swp_pager_getswapspace(1): failed
> swp_pager_getswapspace(5): failed
> swp_pager_getswapspace(25): failed
> swp_pager_getswapspace(12): failed
> swp_pager_getswapspace(5): failed
> swp_pager_getswapspace(7): failed
> swp_pager_getswapspace(10): failed
> swp_pager_getswapspace(3): failed
> swp_pager_getswapspace(24): failed
> swap_pager: out of swap space
> swp_pager_getswapspace(11): failed
> swap_pager: out of swap space
> swp_pager_getswapspace(17): failed
> swp_pager_getswapspace(5): failed
> swp_pager_getswapspace(1): failed
> swp_pager_getswapspace(32): failed
> swp_pager_getswapspace(15): failed
> swp_pager_getswapspace(19): failed
> swp_pager_getswapspace(1): failed
> swp_pager_getswapspace(25): failed
> swp_pager_getswapspace(11): failed
> swp_pager_getswapspace(1): failed
> swp_pager_getswapspace(15): failed
> swp_pager_getswapspace(1): failed
> swp_pager_getswapspace(8): failed
> swp_pager_getswapspace(31): failed
> swp_pager_getswapspace(26): failed
> swp_pager_getswapspace(1): failed
> swp_pager_getswapspace(20): failed
> swp_pager_getswapspace(4): failed
> swp_pager_getswapspace(3): failed
> swp_pager_getswapspace(3): failed
> swp_pager_getswapspace(9): failed
> swp_pager_getswapspace(1): failed
> swp_pager_getswapspace(15): failed
> swp_pager_getswapspace(3): failed
> swp_pager_getswapspace(7): failed
> swp_pager_getswapspace(8): failed
> swp_pager_getswapspace(17): failed
> swp_pager_getswapspace(2): failed
> swp_pager_getswapspace(10): failed
> swp_pager_getswapspace(6): failed
> swp_pager_getswapspace(2): failed
> swp_pager_getswapspace(11): failed
> swp_pager_getswapspace(21): failed
> swp_pager_getswapspace(1): failed
> swp_pager_getswapspace(9): failed
> swp_pager_getswapspace(32): failed
> swp_pager_getswapspace(2): failed
> swp_pager_getswapspace(32): failed
> swp_pager_getswapspace(25): failed
> swp_pager_getswapspace(21): failed
> swp_pager_getswapspace(22): failed
> swp_pager_getswapspace(14): failed
> swp_pager_getswapspace(10): failed
> swap_pager: out of swap space
> swp_pager_getswapspace(1): failed
> swp_pager_getswapspace(28): failed
> swp_pager_getswapspace(2): failed
> swp_pager_getswapspace(13): failed
> swp_pager_getswapspace(3): failed
> swp_pager_getswapspace(31): failed
> swp_pager_getswapspace(20): failed
> swp_pager_getswapspace(2): failed
> vm_pageout_mightbe_oom: kill context: v_free_count: 8186, v_inactive_count: 1
> Jan 28 18:42:42 CA72_4c8G_ZFS kernel: pid 98734 (c++), jid 3, uid 0, was killed: failed to reclaim memory
>
> [00:00:49] [01] [00:00:00] Building devel/llvm13 | llvm13-13.0.0_3
> [08:06:09] [01] [08:05:20] Finished devel/llvm13 | llvm13-13.0.0_3: Failed: build
>
> FAILED: tools/flang/lib/Evaluate/CMakeFiles/obj.FortranEvaluate.dir/fold-complex.cpp.o
>
> and flang/lib/Evaluate/fold-integer.cpp was one of the compiles going on.

Finally, what a successful build of devel/llvm13 on UFS
was like on the 8 GiByte RPi4B (overclocked, USB3 NVMe
based SSD):

[00:00:57] [01] [00:00:00] Building devel/llvm13 | llvm13-13.0.0_3
[12:25:40] [01] [12:24:43] Finished devel/llvm13 | llvm13-13.0.0_3: Success

where its Maximum Observed figures were:

. . .; load averages: . . . MaxObs: 6.15, 5.71, 5.31
. . . threads: . . ., 11 MaxObsRunning
. . .
Mem: . . ., 6465Mi MaxObsActive, 1355Mi MaxObsWired, 7832Mi MaxObs(Act+Wir+Lndry)
Swap: . . ., 10429Mi MaxObsUsed, 16799Mi MaxObs(Act+Lndry+SwapUsed), 18072Mi MaxObs(Act+Wir+Lndry+SwapUsed)

But 18072Mi MaxObs(Act+Wir+Lndry+SwapUsed) == 17.6484375 GiByte,
so more than 17.6484375 GiByte for RAM+SWAP, depending on how
much room for inactive and margin one chooses. Probably
20+ GiBytes, so 12+ GiBytes of swap for 8 GiBytes of RAM.

(Reminder: maximum of sum <= sum of maximums.)

===
Mark Millard
marklmi at yahoo.com
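
For anyone wanting to redo the sizing arithmetic above with their
own MaxObs figures, here is a minimal sketch in Python. The
18072Mi figure, the 8 GiBytes of RAM, and the "20+" target come
from the message; the variable names are just for illustration, and
the three sample rows used to show "maximum of sum <= sum of
maximums" are made-up (hypothetical) values, not measurements.

```python
# Sketch of the RAM+SWAP sizing arithmetic from the message.
# Names are illustrative; only the 18072 MiB, 8 GiB and 20 GiB
# figures are taken from the message itself.

MIB_PER_GIB = 1024


def mib_to_gib(mib):
    """Convert MiBytes to GiBytes."""
    return mib / MIB_PER_GIB


# MaxObs(Act+Wir+Lndry+SwapUsed) from the successful UFS build.
max_obs_total_mib = 18072
ram_gib = 8

observed_gib = mib_to_gib(max_obs_total_mib)
print(observed_gib)  # 17.6484375

# Assumption: round up to 20 GiBytes total to leave room for
# inactive pages and some margin (the "20+" in the message).
target_total_gib = 20
swap_gib = target_total_gib - ram_gib
print(swap_gib)  # 12

# "maximum of sum <= sum of maximums", on HYPOTHETICAL samples of
# (Active, Wired, SwapUsed) in MiBytes taken at three moments.
samples = [
    (6000, 1200, 9000),
    (6400, 1300, 10000),
    (5000, 1355, 10429),
]
max_of_sum = max(sum(row) for row in samples)
sum_of_max = sum(max(col) for col in zip(*samples))
print(max_of_sum, sum_of_max)
assert max_of_sum <= sum_of_max
```

The point of the last part: adding up separately observed maximums
(for example MaxObsActive + MaxObsWired + MaxObsUsed) can overstate
what was needed at any single moment, which is why the combined
MaxObs(...) figures are the ones to size from.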