From: "Rodney W. Grimes"
Message-Id: <202001252050.00PKoxth040249@gndrsh.dnsmgr.net>
Subject: Re: After update to r357104 build of poudriere jail fails with 'out of swap space'
To: Mark Millard
Date: Sat, 25 Jan 2020 12:50:59 -0800 (PST)
CC: yasu@utahime.org, FreeBSD Current
List-Id: Discussions about the use of FreeBSD-current

> Yasuhiro KIMURA yasu at utahime.org wrote on
> Sat Jan 25 14:45:13 UTC 2020 :
>
> > I use VirtualBox to run 13-CURRENT. Host is 64bit Windows 10 1909 and
> > spec of VM is as following.
> >
> > * 4 CPU
> > * 8GB memory
> > * 100GB disk
> >   - 92GB ZFS pool (zroot)
> >   - 8GB swap
> >
> > Today I updated this VM to r357104.
> > And after that I tried to update
> > poudriere jail with `poudriere jail -u -j jailname -b`. But it failed
> > at install stage. After the failure I found following message is
> > written to syslog.
> >
> > Jan 25 19:18:25 rolling-vm-freebsd1 kernel: pid 7963 (strip), jid 0, uid 0, was killed: out of swap space
>
> This message text's detailed wording is a misnomer.
> Do you also have any messages of the form:
>
> . . . sentinel kernel: swap_pager_getswapspace(32): failed
>
> If yes: you really were out of swap space.
> If no: you were not out of swap space,
> or at least it is highly unlikely that you were.
>
> FreeBSD kills processes for multiple potential reasons.
> For example:
>
> a) Still low on free RAM after a number of tries to increase it above a threshold.
> b) Slow paging I/O.
> c) . . . (I do not know the full list) . . .
>
> Unfortunately, FreeBSD is not explicit about the category
> of problem that leads to the kill activity that happens.
>
> You might learn more by watching how things are going
> via top or some such program or other way of monitoring.
>
>
> Below are some notes about specific tunables that might
> or might not be of help. (There may be more tunables
> that can help that I do not know about.)
>
> For (a) there is a way to test if it is the issue by
> adding to the number of tries before it gives up and
> starts killing things. That will either:
>
> 1) let it get more done before kills start
> 2) let it complete before the count is reached
> 3) make no significant difference
>
> (3) would imply that (b) or (c) are involved instead.
>
> (1) might be handled by having it do even more tries.
>
> For delaying how long free RAM staying low is
> tolerated, one can increase vm.pageout_oom_seq from
> 12 to larger. The management of slow paging I've
> less experience with but do have some notes about
> below.
>
> Examples follow that I use in contexts with
> sufficient RAM that I do not have to worry about
> out of swap/page space.
> These I've set in
> /etc/sysctl.conf . (Of course, I'm not trying to
> deliberately run out of RAM.)
>
> #
> # Delay when persistent low free RAM leads to
> # Out Of Memory killing of processes:
> vm.pageout_oom_seq=120
>
> I'll note that figures like 1024 or 1200 or
> even more are possible. This is controlling how
> many tries at regaining sufficient free RAM
> that that level would be tolerated long-term.
> After that it starts Out Of Memory kills to get
> some free RAM.
>
> No figure is designed to make the delay
> unbounded. There may be large enough figures to
> effectively be bounded beyond any reasonable
> time to wait.
>
>
> As for paging I/O (this is specific to 13,
> or was last I checked):
>
> #
> # For plenty of swap/paging space (will not
> # run out), avoid pageout delays leading to
> # Out Of Memory killing of processes:
> vm.pfault_oom_attempts=-1
>
> (Note: In my context "plenty" really means
> sufficient RAM that paging is rare. But
> others have reported on using the -1 in
> contexts where paging was heavy at times and
> OOM kills had been happening that were
> eliminated by the assignment.)
>
> I've no experience with the below alternative
> to that -1 use:
>
> #
> # For possibly insufficient swap/paging space
> # (might run out), increase the pageout delay
> # that leads to Out Of Memory killing of
> # processes:
> #vm.pfault_oom_attempts= ???
> #vm.pfault_oom_wait= ???
> # (The multiplication is the total but there
> # are other potential tradeoffs in the factors
> # multiplied, even for nearly the same total.)
>
>
> I'm not claiming that these 3 vm.???_oom_???
> figures are always sufficient. Nor am I
> claiming that tunables are always available
> that would be sufficient. Nor that it is easy
> to find the ones that do exist that might
> help for specific OOM kill issues.
>
> I have seen reports of OOM kills for other
> reasons when both vm.pageout_oom_seq and
> vm.pfault_oom_attempts=-1 were in use.
> As I understand, FreeBSD did not report
> what kind of condition led to the
> decision to do an OOM kill.
>
> So the above notes may or may not help you.

All the advice by Mark above is very sound and solid; however, my first
step would be to cut back the memory pig that is ZFS with:
vfs.zfs.arc_max=4294967296
added to loader.conf

> > To make sure I shutdown both VM and host, restarted them and tried
> > update of jail again. Then the problem was reproduced.
>
> ===
> Mark Millard
> marklmi at yahoo.com
> ( dsl-only.net went
> away in early 2018-Mar)

-- 
Rod Grimes                                                 rgrimes@freebsd.org
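
For anyone wanting to apply Mark's yes/no check to their own logs, here
is a minimal sketch in Python. It only looks for the two message forms
quoted in this thread; the function names summarize() and verdict() are
made up for illustration, and the wording of the verdicts paraphrases
Mark's reasoning rather than anything the kernel reports.

```python
import re

# Both patterns are taken from the messages quoted in this thread.
KILL_RE = re.compile(r"pid (\d+) \(([^)]*)\).*was killed: out of swap space")
SWAP_FAIL_RE = re.compile(r"swap_pager_getswapspace\(\d+\): failed")


def summarize(lines):
    """Return (kills, swap_failures) found in an iterable of syslog lines.

    kills is a list of (pid, command) tuples; swap_failures is a count.
    """
    kills = []
    swap_failures = 0
    for line in lines:
        match = KILL_RE.search(line)
        if match:
            kills.append((int(match.group(1)), match.group(2)))
        elif SWAP_FAIL_RE.search(line):
            swap_failures += 1
    return kills, swap_failures


def verdict(kills, swap_failures):
    """Hypothetical helper: turn the counts into Mark's yes/no reading."""
    if not kills:
        return "no OOM kills logged"
    if swap_failures:
        return "swap space really was exhausted"
    return "killed, but probably not for lack of swap"
```

One way to use it, assuming syslog goes to /var/log/messages on the
machine in question, is to open that file and pass the file object to
summarize(), then hand both results to verdict(). A "probably not for
lack of swap" answer points back at the vm.pageout_oom_seq and
vm.pfault_oom_attempts notes above.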