From owner-freebsd-current@freebsd.org Mon Jan 27 19:09:54 2020
Date: Mon, 27 Jan 2020 11:09:24 -0800
In-Reply-To: <202001271819.00RIJo3e056049@gndrsh.dnsmgr.net>
References: <202001271819.00RIJo3e056049@gndrsh.dnsmgr.net>
Subject: Re: After update to r357104 build of poudriere jail fails with 'out of swap space'
To: "Rodney W. Grimes"
Grimes" CC: sgk@troutmask.apl.washington.edu, freebsd-current@freebsd.org, Mark Millard , yasu@utahime.org From: Cy Schubert Message-ID: <6EE01D8C-FD95-4C69-A8E6-AAA619135E5A@cschubert.com> X-CMAE-Envelope: MS4wfCjY1AFwju0JxBkF53HW5iQqtYZro9qDuEkMFvSsA0b1NgonQShjnKKKotCdZNl0wxYLdvpU5IdDzRBbp5SUcBWJuZd2kArJRW7MkeKEfy5YxwLwO/OT mVBL19xc9BMZf3v5oxpVj9oY27F1kSBlWwSV94r7aTkoDZxEwYY2d9cZV9uZp0FqPr4/QTQgQmRvRYtChkp9WKGd7unVPxixOEsi/sBt8BMYY5pU7VeiFxL0 l0VGGdy3kOFmwy1SD8OKnM9qQaLlHwYVRUY5IVXV8tW5bMwOVdZQoCCHIkJx0WlxV2T2IXc2CODlbIiz3EVzNNvGFslowk/xQjemwacGpms= X-Rspamd-Queue-Id: 485zp94b22z3Frs X-Spamd-Bar: ---- Authentication-Results: mx1.freebsd.org; dkim=none; spf=none (mx1.freebsd.org: domain of cy.schubert@cschubert.com has no SPF policy when checking 64.59.134.12) smtp.mailfrom=cy.schubert@cschubert.com X-Spamd-Result: default: False [-4.67 / 15.00]; ARC_NA(0.00)[]; RCVD_VIA_SMTP_AUTH(0.00)[]; RECEIVED_SPAMHAUS_PBL(0.00)[233.154.66.70.khpj7ygk5idzvmvt5x4ziurxhy.zen.dq.spamhaus.net : 127.0.0.11,17.125.67.70.khpj7ygk5idzvmvt5x4ziurxhy.zen.dq.spamhaus.net : 127.0.0.11]; FROM_HAS_DN(0.00)[]; TO_DN_SOME(0.00)[]; RWL_MAILSPIKE_GOOD(0.00)[12.134.59.64.rep.mailspike.net : 127.0.0.18]; NEURAL_HAM_LONG(-1.00)[-1.000,0]; MIME_GOOD(-0.10)[text/plain]; RCVD_TLS_LAST(0.00)[]; NEURAL_HAM_MEDIUM(-1.00)[-1.000,0]; RCPT_COUNT_FIVE(0.00)[5]; RCVD_COUNT_THREE(0.00)[3]; TO_MATCH_ENVRCPT_SOME(0.00)[]; R_SPF_NA(0.00)[]; RCVD_IN_DNSWL_LOW(-0.10)[12.134.59.64.list.dnswl.org : 127.0.5.1]; R_DKIM_NA(0.00)[]; MIME_TRACE(0.00)[0:+]; ASN(0.00)[asn:6327, ipnet:64.59.128.0/20, country:CA]; MID_RHS_MATCH_FROM(0.00)[]; IP_SCORE(-2.47)[ip: (-6.60), ipnet: 64.59.128.0/20(-3.18), asn: 6327(-2.47), country: CA(-0.09)]; FROM_EQ_ENVFROM(0.00)[] X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 27 Jan 2020 19:09:54 -0000 On January 27, 2020 10:19:50 AM PST, "Rodney W=2E Grimes" wrote: >> In message <202001261745=2E00QHjkuW044006@gndrsh=2Ednsmgr=2Enet>, "Rodn= ey >W=2E=20 >> Grimes" >> writes: >> > > In message <20200125233116=2EGA49916@troutmask=2Eapl=2Ewashington= =2Eedu>, >Steve=20 >> > > Kargl w >> > > rites: >> > > > On Sat, Jan 25, 2020 at 02:09:29PM -0800, Cy Schubert wrote: >> > > > > On January 25, 2020 1:52:03 PM PST, Steve Kargl >> > ingt >> > > > on=2Eedu> wrote: >> > > > > >On Sat, Jan 25, 2020 at 01:41:16PM -0800, Cy Schubert wrote: >> > > > > >>=20 >> > > > > >> It's not just poudeiere=2E Standard port builds of chromium, >rust >> > > > > >> and thunderbird also fail on my machines with less than 8 >GB=2E >> > > > > >> >> > > > > > >> > > > > >Interesting=2E I routinely build chromium, rust, firefox, >> > > > > >llvm and few other resource-hunger ports on a i386-freebsd >> > > > > >laptop with 3=2E4 GB available memory=2E This is done with >> > > > > >chrome running with a few tabs swallowing a 1-1=2E5 GB of >> > > > > >memory=2E No issues=2E =20 >> > > > >=20 >> > > > > Number of threads makes a difference too=2E How many >core/threads does yo >> > ur l >> > > > aptop have? 
>> > > >
>> > > > 2 cores.
>> > >
>> > > This is why.
>> > >
>> > > >
>> > > > > Reducing the number of concurrent threads allowed my builds to complete
>> > > > > on the 5 GB machine. My build machines have 4 cores, 1 thread per
>> > > > > core. Reducing concurrent threads circumvented the issue.
>> > > >
>> > > > I use portmaster, and AFAICT, it uses 'make -j 2' for the build.
>> > > > Laptop isn't doing too much, but an update and browsing. It does
>> > > > take a long time, especially if building llvm is required.
>> > >
>> > > I use portmaster as well (for quick incidental builds). It uses
>> > > MAKE_JOBS_NUMBER=4 (which is equivalent to make -j 4). I suppose machines
>> > > with not enough memory to support their cores with certain builds might
>> > > have a better chance of having this problem.
>> > >
>> > > MAKE_JOBS_NUMBER_LIMIT to limit a 4-core machine with less than 2 GB per
>> > > core might be an option. Looking at it this way, instead of an extra 3 GB,
>> > > the 60% more memory in the other machine makes a big difference. A
>> > > rule of thumb would probably be: have ~ 2 GB RAM for every core or thread
>> > > when doing large parallel builds.
>> >
>> > Perhaps we need to redo some boot time calculations; for one, the
>> > ZFS ARC cache, IMHO, is just silly at a fixed percent of total
>> > memory. A high percentage at that.
>> >
>> > One idea based on what you just said might be:
>> >
>> > percore_memory_reserve = 2G (your number, I personally would use 1G here)
>> > arc_max = MAX(memory_size - (cores * percore_memory_reserve), 512 MB)
>> >
>> > I think that simple change would go a long way toward cutting down the
>> > number of OOM reports we see. Also, IMHO, there should be a way for
>> > subsystems to easily tell ZFS they are memory pigs too and need to
>> > share the space. I.e., bhyve is horrible if you do not tune the ZFS ARC
>> > based on how much memory you're using up for VMs.
>> >
>> > Another formulation might be:
>> >
>> > percore_memory_reserve = alpha * memory_size / cores
>> >
>> > Alpha would most likely fall in the 0.25 to 0.5 range. I think this one
>> > would have better scalability; would need to run some numbers. It
>> > probably needs to become non-linear above some core count.
>>
>> Setting a lower arc_max at boot is unlikely to help. Rust was building on
>> the 8 GB and 5 GB 4-core machines last night. It completed successfully on
>> the 8 GB machine, while using 12 MB of swap. ARC was at 1307 MB.
>>
>> On the 5 GB 4-core machine the rust build died of OOM. 328 KB of swap was
>> used. ARC was reported at 941 MB. arc_min on this machine is 489.2 MB.
>
>What is arc_max?
>
>> Cy Schubert

3.8 GB. It never exceeds 1.5 to 2 GB when doing a NO_CLEAN buildworld, where it gets a 95-99% hit ratio with 8 threads.

There are a couple of things going on here. First, four large multithreaded rust compiles were in memory simultaneously. Second, a reluctance to use swap. My guess is that the working set of each of the four compiles was large enough to trigger the OOM. I haven't had time to look at this seriously, but I'm guessing that the locality of reference was strong enough to keep much of the memory in RAM, so here we are.

A rough sketch of the two arc_max formulations quoted above is appended after the signature.

-- 
Pardon the typos and autocorrect, small keyboard in use.

Cy Schubert <cy.schubert@cschubert.com>
FreeBSD UNIX:   Web: https://www.FreeBSD.org

The need of the many outweighs the greed of the few.

Sent from my Android device with K-9 Mail. Please excuse my brevity.
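For reference, here is a minimal, standalone sketch of the arithmetic behind the two arc_max formulations quoted above. The 512 MB floor, the 2 GB/1 GB per-core reserve, and the 0.25-0.5 alpha range are the numbers from the quoted proposal; the function names, the underflow handling, and the example machine sizes (the 5 GB and 8 GB 4-core boxes discussed in this thread) are only illustrative assumptions, not anything taken from the FreeBSD source tree.

/*
 * Sketch of the two proposed arc_max formulations quoted above.
 * Plain userland arithmetic, not FreeBSD kernel code; names and the
 * example machines are illustrative only.
 */
#include <stdio.h>
#include <stdint.h>

#define MB (1024ULL * 1024ULL)
#define GB (1024ULL * MB)

/* Formulation 1: arc_max = MAX(mem - cores * percore_reserve, 512 MB). */
static uint64_t
arc_max_fixed_reserve(uint64_t mem, unsigned cores, uint64_t percore_reserve)
{
	const uint64_t arc_floor = 512 * MB;
	uint64_t reserve = (uint64_t)cores * percore_reserve;

	if (mem <= reserve || mem - reserve < arc_floor)
		return (arc_floor);
	return (mem - reserve);
}

/* Formulation 2: percore_reserve = alpha * mem / cores, same 512 MB floor. */
static uint64_t
arc_max_alpha_reserve(uint64_t mem, unsigned cores, double alpha)
{
	uint64_t percore_reserve = (uint64_t)(alpha * (double)mem / cores);

	return (arc_max_fixed_reserve(mem, cores, percore_reserve));
}

int
main(void)
{
	/* The two machines discussed in this thread: 5 GB and 8 GB, 4 cores. */
	const uint64_t mems[] = { 5 * GB, 8 * GB };
	const unsigned cores = 4;

	for (size_t i = 0; i < sizeof(mems) / sizeof(mems[0]); i++) {
		uint64_t mem = mems[i];

		printf("%llu GB RAM, %u cores:\n",
		    (unsigned long long)(mem / GB), cores);
		printf("  2 GB/core reserve: arc_max = %5llu MB\n",
		    (unsigned long long)(arc_max_fixed_reserve(mem, cores, 2 * GB) / MB));
		printf("  1 GB/core reserve: arc_max = %5llu MB\n",
		    (unsigned long long)(arc_max_fixed_reserve(mem, cores, 1 * GB) / MB));
		printf("  alpha = 0.25:      arc_max = %5llu MB\n",
		    (unsigned long long)(arc_max_alpha_reserve(mem, cores, 0.25) / MB));
	}
	return (0);
}

On the 5 GB, 4-core box this works out to roughly a 512 MB cap with the 2 GB per-core reserve, about 1 GB with the 1 GB reserve, and about 3.75 GB with alpha = 0.25, which shows how sensitive the resulting cap is to the chosen reserve.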