Date: Sat, 25 Jan 2020 12:50:59 -0800 (PST)
From: "Rodney W. Grimes" <freebsd-rwg@gndrsh.dnsmgr.net>
To: Mark Millard <marklmi@yahoo.com>
Cc: yasu@utahime.org, FreeBSD Current <freebsd-current@freebsd.org>
Subject: Re: After update to r357104 build of poudriere jail fails with 'out of swap space'
Message-ID: <202001252050.00PKoxth040249@gndrsh.dnsmgr.net>
In-Reply-To: <DAA3A910-25F2-447A-B540-C35985DA822E@yahoo.com>

> Yasuhiro KIMURA yasu at utahime.org wrote on
> Sat Jan 25 14:45:13 UTC 2020 :
>
> > I use VirtualBox to run 13-CURRENT. Host is 64bit Windows 10 1909 and
> > spec of VM is as follows.
> >
> > * 4 CPU
> > * 8GB memory
> > * 100GB disk
> >   - 92GB ZFS pool (zroot)
> >   - 8GB swap
> >
> > Today I updated this VM to r357104. And after that I tried to update
> > poudriere jail with `poudriere jail -u -j jailname -b`. But it failed
> > at install stage. After the failure I found the following message
> > written to syslog.
> >
> > Jan 25 19:18:25 rolling-vm-freebsd1 kernel: pid 7963 (strip), jid 0, uid 0, was killed: out of swap space
>
> This message text's detailed wording is a misnomer.
> Do you also have any messages of the form:
>
> . . . sentinel kernel: swap_pager_getswapspace(32): failed
>
> If yes: you really were out of swap space.
> If no: you were not out of swap space,
> or at least it is highly unlikely that you were.
>
> FreeBSD kills processes for multiple potential reasons.
> For example:
>
> a) Still low on free RAM after a number of tries to increase it above a threshold.
> b) Slow paging I/O.
> c) . . . (I do not know the full list) . . .
>
> Unfortunately, FreeBSD is not explicit about the category
> of problem that leads to the kill activity that happens.
>
> You might learn more by watching how things are going
> via top or some such program or other way of monitoring.
>
>
> Below are some notes about specific tunables that might
> or might not be of help. (There may be more tunables
> that can help that I do not know about.)
>
> For (a) there is a way to test if it is the issue by
> adding to the number of tries before it gives up and
> starts killing things. That will either:
>
> 1) let it get more done before kills start
> 2) let it complete before the count is reached
> 3) make no significant difference
>
> (3) would imply that (b) or (c) are involved instead.
>
> (1) might be handled by having it do even more tries.
>
> For delaying how long free RAM staying low is
> tolerated, one can increase vm.pageout_oom_seq from
> 12 to a larger value. The management of slow paging I've
> less experience with but do have some notes about
> below.
>
> Examples follow that I use in contexts with
> sufficient RAM that I do not have to worry about
> running out of swap/page space. These I've set in
> /etc/sysctl.conf . (Of course, I'm not trying to
> deliberately run out of RAM.)
>
> #
> # Delay when persistent low free RAM leads to
> # Out Of Memory killing of processes:
> vm.pageout_oom_seq=120
>
> I'll note that figures like 1024 or 1200 or
> even more are possible. This is controlling how
> many tries at regaining sufficient free RAM
> are made while that low level is tolerated.
> After that it starts Out Of Memory kills to get
> some free RAM.
>
> No figure is designed to make the delay
> unbounded. There may be large enough figures to
> effectively be bounded beyond any reasonable
> time to wait.
>
>
> As for paging I/O (this is specific to 13,
> or was last I checked):
>
> #
> # For plenty of swap/paging space (will not
> # run out), avoid pageout delays leading to
> # Out Of Memory killing of processes:
> vm.pfault_oom_attempts=-1
>
> (Note: In my context "plenty" really means
> sufficient RAM that paging is rare. But
> others have reported on using the -1 in
> contexts where paging was heavy at times and
> OOM kills had been happening that were
> eliminated by the assignment.)
>
> I've no experience with the below alternative
> to that -1 use:
>
> #
> # For possibly insufficient swap/paging space
> # (might run out), increase the pageout delay
> # that leads to Out Of Memory killing of
> # processes:
> #vm.pfault_oom_attempts= ???
> #vm.pfault_oom_wait= ???
> # (The multiplication is the total but there
> # are other potential tradeoffs in the factors
> # multiplied, even for nearly the same total.)
>
>
> I'm not claiming that these 3 vm.???_oom_???
> figures are always sufficient.
> Nor am I claiming that tunables are always available
> that would be sufficient. Nor that it is easy
> to find the ones that do exist that might
> help for specific OOM kill issues.
>
> I have seen reports of OOM kills for other
> reasons when both vm.pageout_oom_seq and
> vm.pfault_oom_attempts=-1 were in use.
> As I understand, FreeBSD did not report
> what kind of condition led to the
> decision to do an OOM kill.
>
> So the above notes may or may not help you.

All the advice by Mark above is very sound and solid; however, my first
step would be to cut back the memory pig that is ZFS with:
vfs.zfs.arc_max=4294967296
added to loader.conf

>
> > To make sure I shutdown both VM and host, restarted them and tried
> > update of jail again. Then the problem was reproduced.
>
>
> ===
> Mark Millard
> marklmi at yahoo.com
> ( dsl-only.net went
> away in early 2018-Mar)
>
> _______________________________________________
> freebsd-current@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-current
> To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org"
>

-- 
Rod Grimes                                                 rgrimes@freebsd.org
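
For reference, the 4294967296 figure in the vfs.zfs.arc_max line is 4 GiB
expressed in bytes. Below is a minimal sh sketch of that arithmetic,
assuming a POSIX shell with 64-bit arithmetic. The helper names
gib_to_bytes and arc_max_line are made up for illustration and are not
part of FreeBSD. Capping the ARC at 4 GiB on the 8GB VM is the suggestion
from this message, not a general rule.

```shell
#!/bin/sh
# Hypothetical helpers (not part of FreeBSD) that show how the
# byte value for vfs.zfs.arc_max is derived from a size in GiB.

# Convert a whole number of GiB to bytes (1 GiB = 1024^3 bytes).
gib_to_bytes() {
    echo $(( $1 * 1024 * 1024 * 1024 ))
}

# Print the tunable line in the form suggested for loader.conf.
arc_max_line() {
    echo "vfs.zfs.arc_max=$(gib_to_bytes "$1")"
}

arc_max_line 4    # prints: vfs.zfs.arc_max=4294967296
```

The printed line is what would go into /boot/loader.conf. Since
loader.conf is read at boot, a reboot is needed before the cap applies.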