From: Mark Millard <marklmi@yahoo.com>
To: bob prohaska <fbsd@www.zefox.net>, freebsd-current@freebsd.org
Subject: Re: Buildworld finishes despite swap exhaustion
In-Reply-To: <aieNCJnCU3QfyJDV@www.zefox.net>
On 6/8/26 20:48, bob prohaska wrote:
> Lately a Pi2B running buildworld reported an
> exhaustion of swap, but buildworld kept running
> and seemingly finished successfully.
>
> The report came on the serial console, I didn't
> find anything in the buildworld log.
>
> This seems a very great improvement. Swap exhaustion
> differs from other sorts of failure, in that one can
> simply re-try the job with some hope of success when
> the workload is lighter.
>
> Am I interpreting this correctly?

[Because the actual messages are not reported, I'm making
some assumptions about the exact messages that you got.]

Remember vm.pageout_oom_seq ? The larger the value used,
the longer the system operates with the amount of free RAM
below the target threshold: in other words, it makes more
tries at getting to the threshold before giving up and
starting to kill processes to get the free RAM.

Running out of swap of itself just means that SWAP can not
be used to gain free RAM when such is not essential.
RAM+SWAP can still be (marginally) sufficient over such a
time if no memory allocations actually fail. If sufficient
RAM/SWAP ends up being freed before vm.pageout_oom_seq
related kills happen, no overall failure happens.

As for the messages as I understand them:

kernel: swap_pager: out of swap space

does not report a failure, just a limiting condition.
By contrast:

kernel: swp_pager_getswapspace(2): failed

reports a failure: the swap space allocation was
necessary. It normally leads to the likes of:

kernel: pid ??? (???), jid ???, uid ???, was killed: failed to reclaim memory

It is possible to fail to reclaim memory despite swap
space being available. Even just one always-active
process can keep so much memory in the active category
over such a duration that the vm.pageout_oom_seq related
process kills can start because the free RAM threshold
was not met.

Overall: you were lucky in a marginal context. It is
not some sort of new guarantee of avoiding oom kills.
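
As a rough illustration of the distinction drawn above, here is a minimal Python sketch that sorts console lines into the three kinds of message just discussed. The function name classify and the category labels are hypothetical, chosen only for this example; the patterns are taken from the message text quoted above and may not match every kernel version's exact wording.

```python
import re

# Patterns copied from the console messages quoted above.
# The labels are illustrative names, not kernel terminology.
PATTERNS = [
    (re.compile(r"swap_pager: out of swap space"), "limiting-condition"),
    (re.compile(r"swp_pager_getswapspace\(\d+\): failed"), "swap-allocation-failure"),
    (re.compile(r"was killed: failed to reclaim memory"), "oom-kill"),
]


def classify(line):
    """Return the label of the first matching pattern, or None if none match."""
    for pattern, label in PATTERNS:
        if pattern.search(line):
            return label
    return None


if __name__ == "__main__":
    sample = [
        "kernel: swap_pager: out of swap space",
        "kernel: swp_pager_getswapspace(2): failed",
        "buildworld: unrelated output",
    ]
    for entry in sample:
        print(classify(entry), "<-", entry)
```

Run over a saved serial console log, something like this would show whether only the limiting-condition message appeared (as seems to have happened here) or whether an actual failure or kill was reported as well.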

There are also the messages:

proc ??? (???) failed to alloc page on fault, starting OOM

pid ??? (???), jid ???, uid ???, was killed: a thread waited too long to allocate a page

-- 
===
Mark Millard
marklmi at yahoo.com
