From: Cy Schubert
Date: Sat, 25 Jan 2020 13:58:54 -0800
To: freebsd-current@freebsd.org, "Rodney W. Grimes", Mark Millard
CC: yasu@utahime.org, FreeBSD Current
Subject: Re: After update to r357104 build of poudriere jail fails with 'out of swap space'
Message-ID: <99F7A865-53E1-46D6-8C07-10DE4560C766@cschubert.com>
In-Reply-To: <202001252050.00PKoxth040249@gndrsh.dnsmgr.net>
References: <202001252050.00PKoxth040249@gndrsh.dnsmgr.net>
List-Id: Discussions about the use of FreeBSD-current

On January 25, 2020 12:50:59 PM PST, "Rodney W. Grimes" wrote:
>> Yasuhiro KIMURA yasu at utahime.org wrote on
>> Sat Jan 25 14:45:13 UTC 2020 :
>>
>> > I
>> > use VirtualBox to run 13-CURRENT. Host is 64bit Windows 10 1909 and
>> > spec of VM is as following.
>> >
>> > * 4 CPU
>> > * 8GB memory
>> > * 100GB disk
>> >   - 92GB ZFS pool (zroot)
>> >   - 8GB swap
>> >
>> > Today I updated this VM to r357104. And after that I tried to update
>> > poudriere jail with `poudriere jail -u -j jailname -b`. But it failed
>> > at install stage. After the failure I found following message is
>> > written to syslog.
>> >
>> > Jan 25 19:18:25 rolling-vm-freebsd1 kernel: pid 7963 (strip), jid 0, uid 0, was killed: out of swap space
>>
>> This message text's detailed wording is a misnomer.
>> Do you also have any messages of the form:
>>
>> . . . sentinel kernel: swap_pager_getswapspace(32): failed
>>
>> If yes: you really were out of swap space.
>> If no: you were not out of swap space,
>> or at least it is highly unlikely that you were.
>>
>> FreeBSD kills processes for multiple potential reasons.
>> For example:
>>
>> a) Still low on free RAM after a number of tries to increase it above a threshold.
>> b) Slow paging I/O.
>> c) . . . (I do not know the full list) . . .
>>
>> Unfortunately, FreeBSD is not explicit about the category
>> of problem that leads to the kill activity that happens.
>>
>> You might learn more by watching how things are going
>> via top or some such program or other way of monitoring.
>>
>>
>> Below are some notes about specific tunables that might
>> or might not be of help. (There may be more tunables
>> that can help that I do not know about.)
>>
>> For (a) there is a way to test if it is the issue by
>> adding to the number of tries before it gives up and
>> starts killing things. That will either:
>>
>> 1) let it get more done before kills start
>> 2) let it complete before the count is reached
>> 3) make no significant difference
>>
>> (3) would imply that (b) or (c) are involved instead.
>>
>> (1)
>> might be handled by having it do even more tries.
>>
>> For delaying how long free RAM staying low is
>> tolerated, one can increase vm.pageout_oom_seq from
>> 12 to a larger value. The management of slow paging I've
>> less experience with but do have some notes about
>> below.
>>
>> Examples follow that I use in contexts with
>> sufficient RAM that I do not have to worry about
>> out of swap/page space. These I've set in
>> /etc/sysctl.conf . (Of course, I'm not trying to
>> deliberately run out of RAM.)
>>
>> #
>> # Delay when persistent low free RAM leads to
>> # Out Of Memory killing of processes:
>> vm.pageout_oom_seq=120
>>
>> I'll note that figures like 1024 or 1200 or
>> even more are possible. This is controlling how
>> many tries at regaining sufficient free RAM
>> that that level would be tolerated long-term.
>> After that it starts Out Of Memory kills to get
>> some free RAM.
>>
>> No figure is designed to make the delay
>> unbounded. There may be large enough figures to
>> effectively be bounded beyond any reasonable
>> time to wait.
>>
>>
>> As for paging I/O (this is specific to 13,
>> or was last I checked):
>>
>> #
>> # For plenty of swap/paging space (will not
>> # run out), avoid pageout delays leading to
>> # Out Of Memory killing of processes:
>> vm.pfault_oom_attempts=-1
>>
>> (Note: In my context "plenty" really means
>> sufficient RAM that paging is rare. But
>> others have reported on using the -1 in
>> contexts where paging was heavy at times and
>> OOM kills had been happening that were
>> eliminated by the assignment.)
>>
>> I've no experience with the below alternative
>> to that -1 use:
>>
>> #
>> # For possibly insufficient swap/paging space
>> # (might run out), increase the pageout delay
>> # that leads to Out Of Memory killing of
>> # processes:
>> #vm.pfault_oom_attempts= ???
>> #vm.pfault_oom_wait= ???
>> # (The multiplication is the total but there
>> # are other potential tradeoffs in the factors
>> # multiplied, even for nearly the same total.)
>>
>>
>> I'm not claiming that these 3 vm.???_oom_???
>> figures are always sufficient. Nor am I
>> claiming that tunables are always available
>> that would be sufficient. Nor that it is easy
>> to find the ones that do exist that might
>> help for specific OOM kill issues.
>>
>> I have seen reports of OOM kills for other
>> reasons when both vm.pageout_oom_seq and
>> vm.pfault_oom_attempts=-1 were in use.
>> As I understand, FreeBSD did not report
>> what kind of condition led to the
>> decision to do an OOM kill.
>>
>> So the above notes may or may-not help you.
>
>All the advice by Mark above is very sound and solid, however my
>first step would be to cut back the memory pig that is ZFS with:
>vfs.zfs.arc_max=4294967296
>added to loader.conf
>
>>
>> > To make sure I shutdown both VM and host, restarted them and tried
>> > update of jail again. Then the problem was reproduced.
>>
>>
>> ===
>> Mark Millard
>> marklmi at yahoo.com
>> ( dsl-only.net went
>> away in early 2018-Mar)
>>
>> _______________________________________________
>> freebsd-current@freebsd.org mailing list
>> https://lists.freebsd.org/mailman/listinfo/freebsd-current
>> To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org"
>>

arc_max can also be reduced by sysctl to get immediate relief.

In the case when I had the problem it was on a machine with 5 GB RAM and arc was less than 500 MB. The problem was due to 4 rust compiles building rust or thunderbird on an amd64 kernel within an i386 jail. No such problems were encountered on a machine with 8 GB with 2 GB arc.

-- 
Pardon the typos and autocorrect, small keyboard in use.
Cy Schubert
FreeBSD UNIX: Web: https://www.FreeBSD.org

The need of the many outweighs the greed of the few.