Date: Sat, 25 Jan 2020 12:02:07 -0800
From: Mark Millard <marklmi@yahoo.com>
To: yasu@utahime.org, FreeBSD Current <freebsd-current@freebsd.org>
Subject: Re: After update to r357104 build of poudriere jail fails with 'out of swap space'
Message-ID: <DAA3A910-25F2-447A-B540-C35985DA822E@yahoo.com>
References: <DAA3A910-25F2-447A-B540-C35985DA822E.ref@yahoo.com>

Yasuhiro KIMURA yasu at utahime.org wrote on
Sat Jan 25 14:45:13 UTC 2020 :

> I use VirtualBox to run 13-CURRENT. Host is 64bit Windows 10 1909 and
> spec of VM is as following.
>
> * 4 CPU
> * 8GB memory
> * 100GB disk
>   - 92GB ZFS pool (zroot)
>   - 8GB swap
>
> Today I updated this VM to r357104. And after that I tried to update
> poudriere jail with `poudriere jail -u -j jailname -b`. But it failed
> at install stage. After the failure I found following message is
> written to syslog.
>
> Jan 25 19:18:25 rolling-vm-freebsd1 kernel: pid 7963 (strip), jid 0, uid 0, was killed: out of swap space

The detailed wording of this message is a misnomer.
Do you also have any messages of the form:

. . . sentinel kernel: swap_pager_getswapspace(32): failed

If yes: you really were out of swap space.
If no:  you were not out of swap space,
        or at least it is highly unlikely that you were.

FreeBSD kills processes for multiple potential reasons.
For example:

a) Still low on free RAM after a number of tries to increase it above a threshold.
b) Slow paging I/O.
c) . . . (I do not know the full list) . . .

Unfortunately, FreeBSD is not explicit about the category
of problem that leads to the kill activity that happens.

You might learn more by watching how things are going
via top or some such program or other way of monitoring.

Below are some notes about specific tunables that might
or might not be of help. (There may be more tunables
that can help that I do not know about.)

For (a) there is a way to test if it is the issue by
adding to the number of tries before it gives up and
starts killing things. That will either:

1) let it get more done before kills start
2) let it complete before the count is reached
3) make no significant difference

(3) would imply that (b) or (c) are involved instead.

(1) might be handled by having it do even more tries.

For delaying how long free RAM staying low is
tolerated, one can increase vm.pageout_oom_seq from
its default of 12 to a larger value.
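
The log check described earlier (does a swap_pager_getswapspace
failure accompany the "out of swap space" kill?) can be sketched as
a small sh function. This is only an illustration: the function name
classify_oom_log and its three result strings are hypothetical, and
the default path /var/log/messages is an assumption about where
syslog writes kernel messages on the system in question.

```shell
#!/bin/sh
# Sketch only: classify_oom_log is a hypothetical helper, not a FreeBSD tool.
# It reports whether a log file suggests real swap exhaustion.
#   $1: log file to inspect (assumed default: /var/log/messages)
classify_oom_log() {
    log=${1:-/var/log/messages}
    if ! grep -q 'was killed: out of swap space' "$log"; then
        # No kill message of this form in the log at all.
        echo "no-oom-kill"
    elif grep -q 'swap_pager_getswapspace([0-9]*): failed' "$log"; then
        # The swap pager reported a failed allocation: swap really ran out.
        echo "swap-exhausted"
    else
        # Kill happened without a swap allocation failure: some other
        # trigger, such as persistent low free RAM or slow paging I/O.
        echo "other-cause"
    fi
}
```

Calling classify_oom_log with no argument inspects the assumed
default log; a result of "other-cause" is the case where the
tunables discussed in this message are worth trying.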

The management of slow paging I've less experience
with but do have some notes about below.

Examples follow that I use in contexts with
sufficient RAM that I do not have to worry about
running out of swap/page space. These I've set in
/etc/sysctl.conf . (Of course, I'm not trying to
deliberately run out of RAM.)

#
# Delay when persistent low free RAM leads to
# Out Of Memory killing of processes:
vm.pageout_oom_seq=120

I'll note that figures like 1024 or 1200 or
even more are possible. This controls how
many tries at regaining sufficient free RAM
are made while the low level is tolerated.
After that it starts Out Of Memory kills to
get some free RAM.

No figure is designed to make the delay
unbounded. There may be figures large enough
that the bound is effectively beyond any
reasonable time to wait.

As for paging I/O (this is specific to 13,
or was last I checked):

#
# For plenty of swap/paging space (will not
# run out), avoid pageout delays leading to
# Out Of Memory killing of processes:
vm.pfault_oom_attempts=-1

(Note: In my context "plenty" really means
sufficient RAM that paging is rare. But
others have reported on using the -1 in
contexts where paging was heavy at times and
OOM kills had been happening that were
eliminated by the assignment.)

I've no experience with the below alternative
to that -1 use:

#
# For possibly insufficient swap/paging space
# (might run out), increase the pageout delay
# that leads to Out Of Memory killing of
# processes:
#vm.pfault_oom_attempts= ???
#vm.pfault_oom_wait= ???
# (The multiplication is the total but there
# are other potential tradeoffs in the factors
# multiplied, even for nearly the same total.)

I'm not claiming that these 3 vm.???_oom_???
figures are always sufficient. Nor am I
claiming that tunables are always available
that would be sufficient. Nor that it is easy
to find the ones that do exist that might
help for specific OOM kill issues.
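
The remark above that "the multiplication is the total" for
vm.pfault_oom_attempts and vm.pfault_oom_wait can be illustrated
with a little sh arithmetic. This is a sketch under assumptions:
pfault_total_wait is a hypothetical helper name, the wait value is
assumed to be in seconds, and the numbers in the usage note are
illustrative values only, not recommendations or defaults.

```shell
#!/bin/sh
# Sketch only: pfault_total_wait is a hypothetical helper.
#   $1: candidate value for vm.pfault_oom_attempts (-1 disables the limit)
#   $2: candidate value for vm.pfault_oom_wait (assumed to be seconds)
# Prints the total delay before a page-fault-driven OOM kill,
# or "unbounded" when attempts is -1.
pfault_total_wait() {
    attempts=$1
    wait_secs=$2
    if [ "$attempts" -eq -1 ]; then
        echo "unbounded"
    else
        echo $((attempts * wait_secs))
    fi
}
```

For example, pfault_total_wait 3 10 and pfault_total_wait 6 5 give
the same total, which is the situation where the other tradeoffs
between the two factors would have to decide the choice.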

I have seen reports of OOM kills for other reasons when
both vm.pageout_oom_seq and vm.pfault_oom_attempts=-1 were
in use. As I understand it, FreeBSD did not report what kind
of condition led to the decision to do an OOM kill.

So the above notes may or may not help you.

> To make sure I shutdown both VM and host, restarted them and tried
> update of jail again. Then the problem was reproduced.

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away in early 2018-Mar)