Date: Tue, 28 Jan 2020 11:33:30 -0800
From: Cy Schubert <Cy.Schubert@cschubert.com>
To: Mark Millard <marklmi@yahoo.com>
Cc: "Rodney W. Grimes" <freebsd-rwg@gndrsh.dnsmgr.net>, sgk@troutmask.apl.washington.edu, freebsd-current@freebsd.org, yasu@utahime.org
Subject: Re: After update to r357104 build of poudriere jail fails with 'out of swap space'
Message-ID: <4682F012-E3C5-4B49-8099-659EBCB7B585@cschubert.com>
In-Reply-To: <C37361F8-FA0A-472F-B4DC-4D963B2515EF@yahoo.com>
References: <202001261745.00QHjkuW044006@gndrsh.dnsmgr.net> <202001271309.00RD96nr005876@slippy.cwsent.com> <A0E565B0-52A1-41CE-915F-35B8E0F9394F@cschubert.com> <BA0CE7D8-CFA1-40A3-BEFA-21D0C230B082@yahoo.com> <202001272048.00RKmiZs006726@slippy.cwsent.com> <C37361F8-FA0A-472F-B4DC-4D963B2515EF@yahoo.com>
On January 27, 2020 2:25:59 PM PST, Mark Millard <marklmi@yahoo.com> wrote:
>On 2020-Jan-27, at 12:48, Cy Schubert <Cy.Schubert at cschubert.com> wrote:
>
>> In message <BA0CE7D8-CFA1-40A3-BEFA-21D0C230B082@yahoo.com>, Mark Millard
>> writes:
>>>
>>> On 2020-Jan-27, at 10:20, Cy Schubert <Cy.Schubert at cschubert.com> wrote:
>>>
>>>> On January 27, 2020 5:09:06 AM PST, Cy Schubert
>>>> <Cy.Schubert@cschubert.com> wrote:
>>>>>> . . .
>>>>>
>>>>> Setting a lower arc_max at boot is unlikely to help. Rust was building
>>>>> on the 8 GB and 5 GB 4-core machines last night. It completed
>>>>> successfully on the 8 GB machine, while using 12 MB of swap. ARC was
>>>>> at 1307 MB.
>>>>>
>>>>> On the 5 GB 4-core machine the rust build died of OOM. 328 KB of swap
>>>>> was used. ARC was reported at 941 MB. arc_min on this machine is
>>>>> 489.2 MB.
>>>>
>>>> MAKE_JOBS_NUMBER=3 worked building rust on the 5 GB 4-core machine.
>>>> ARC is at 534 MB with 12 MB swap used.
>>>
>>> If you increase vm.pageout_oom_seq to, say, 10 times what you now use,
>>> does MAKE_JOBS_NUMBER=4 complete --or at least go notably longer before
>>> getting OOM behavior from the system? (The default is 12, last I
>>> checked, so that might be what you are now using.)
>>
>> It's already 4096 (the default is 12).
>
>Wow. Then the count of tries to get free RAM above the threshold
>does not seem likely to be the source of the OOM kills.
>
>>> Have you tried also having vm.pfault_oom_attempts="-1"? (Presuming
>>> you are not worried about actually running out of swap/page space,
>>> or can tolerate a deadlock if it does run out.) This setting presumes
>>> head, not release or stable. (Last I checked, anyway.)
>>
>> Already there.
>
>Then page-out delay does not seem likely to be the source of the OOM
>kills.
>
>> The box is a sandbox with remote serial console access, so deadlocks
>> are OK.
>>
>>> It would be interesting to know what difference those two settings
>>> together might make for your context: it seems to be a good context
>>> for testing in this area. (But you might already have set them.
>>> If so, it would be good to report the figures in use.)
>>>
>>> Of course, my experiment ideas need not be your actions.
>>
>> It's a sandbox machine. We already know 8 GB works with 4 threads on
>> as many cores. And 5 GB works with 3 threads on 4 cores.
>
>It would be nice to find out what category of issue in the kernel is
>driving the OOM kills for your 5 GB context with MAKE_JOBS_NUMBER=4.
>Too bad the first kill does not report a backtrace spanning the code
>choosing to do the kill (or otherwise report the type of issue leading
>to the kill).
>
>Yours is consistent with what the small arm board folks have recently
>reported: contexts that were doing buildworld and the like fine under
>somewhat older kernels have started getting OOM kills, despite the two
>settings.
>
>At the moment I'm not sure how to find the category(s) of issue(s)
>that is(are) driving these OOM kills.
>
>Thanks for reporting what settings you were using.
>
>===
>Mark Millard
>marklmi at yahoo.com
>( dsl-only.net went
>away in early 2018-Mar)

I've been able to reproduce the problem at $JOB in a VirtualBox VM with
1 vCPU, 1.5 GB vRAM, and 2 GB of swap, building graphics/graphviz: cc
was killed "out of swap space". The killed cc had an address space of
about 500 MB, yet was using only 43 MB of the 2 GB of swap. Free memory
is exhausted, but swap used never exceeds tens of MB. Doubling the swap
to 4 GB had no effect. The VM doesn't use ZFS. This appears to be a
recent regression.
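For anyone who wants to try the same knobs, here is roughly how the
settings discussed above land in the stock configuration files. This is
a sketch: the values are the ones reported in this thread, and the file
placement follows the usual FreeBSD convention rather than anything
prescribed here.

    # /boot/loader.conf
    # -1 disables the page-fault OOM kill path entirely; only do this
    # if running out of swap (and a possible deadlock) is tolerable.
    # As noted above, this knob presumes head, not release or stable.
    vm.pfault_oom_attempts="-1"

    # /etc/sysctl.conf (vm.pageout_oom_seq can also be changed at
    # runtime with sysctl(8)). This is the number of consecutive
    # passes in which the page daemon misses its free-page target
    # before the OOM killer fires; the default is 12.
    vm.pageout_oom_seq=4096

    # /etc/make.conf, or the poudriere jail's make.conf: cap per-port
    # build parallelism. 3 jobs built rust on the 5 GB / 4-core
    # machine where 4 jobs was OOM-killed.
    MAKE_JOBS_NUMBER=3

Raising vm.pageout_oom_seq trades earlier kills for longer stalls under
memory pressure, which is why it is being treated here as a diagnostic
lever rather than a fix.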
-- 
Pardon the typos and autocorrect, small keyboard in use.

Cy Schubert <Cy.Schubert@cschubert.com>
FreeBSD UNIX: <cy@FreeBSD.org>   Web: https://www.FreeBSD.org

    The need of the many outweighs the greed of the few.

Sent from my Android device with K-9 Mail. Please excuse my brevity.
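For readers trying to reproduce the figures quoted in this thread (swap
used, ARC size), they can be sampled with stock tools while a build
runs. A sketch; the arcstats sysctl assumes ZFS is loaded:

    # Swap devices and how much of each is in use:
    swapinfo -h

    # Current ZFS ARC size, in bytes:
    sysctl kstat.zfs.misc.arcstats.size

    # Batch-mode snapshot of the memory/swap summary lines and the
    # ten largest processes, sorted by resident size:
    top -b -o res 10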