From:      Mark Millard <marklmi@yahoo.com>
To:        bob prohaska <fbsd@www.zefox.net>, freebsd-current@freebsd.org
Subject:   Re: Buildworld finishes despite swap exhaustion
In-Reply-To: <aieNCJnCU3QfyJDV@www.zefox.net>


On 6/8/26 20:48, bob prohaska wrote:
> Lately a Pi2B running buildworld reported an
> exhaustion of swap, but buildworld kept running
> and seemingly finished successfully.
> 
> The report came on the serial console, I didn't
> find anything in the buildworld log. 
> 
> This seems a very great improvement. Swap exhaustion
> differs from other sorts of failure, in that one can 
> simply re-try the job with some hope of success when 
> the workload is lighter.
> 
> Am I interpreting this correctly?

[Because the actual messages are not reported, I'm making some
assumptions about the exact messages that you got.]


Remember vm.pageout_oom_seq ?

The larger the value used, the longer the system is allowed to operate
with the amount of free RAM below the target threshold: in other words,
it makes more tries at getting to the threshold before giving up and
starting to kill processes to get the free RAM.

Running out of swap by itself just means that swap can not be used to
gain free RAM. That does no harm as long as gaining free RAM that way
is not essential. RAM+SWAP can still be (marginally) sufficient over
such a time if no memory allocations actually fail. If sufficient
RAM/SWAP ends up being freed before vm.pageout_oom_seq related kills
happen, no overall failure happens.
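
As a toy illustration of the two paragraphs above (not the kernel's
actual logic): the Python sketch below counts consecutive pageout passes
that end with free RAM below the target and reports when kills would
start. The function name, the numbers, and the simplification that every
short pass counts as one failed try are all assumptions made for
illustration.

```python
def passes_until_oom(free_after_each_pass, free_target, oom_seq):
    """Return the 1-based pass at which kills would start, or None.

    Toy model (an assumption, not kernel code): each pass that ends
    with free pages below free_target counts as one failed try;
    reaching the target resets the count. Kills start once oom_seq
    consecutive tries have failed.
    """
    failed = 0
    for i, free in enumerate(free_after_each_pass, start=1):
        if free >= free_target:
            failed = 0          # threshold met: start counting afresh
        else:
            failed += 1         # another try that fell short
            if failed >= oom_seq:
                return i        # gave up: process kills would begin
    return None                 # memory was freed in time


# Hypothetical free-page counts: short for 5 passes, then memory is freed.
history = [10, 12, 9, 11, 10, 50, 60]

print(passes_until_oom(history, free_target=40, oom_seq=3))   # 3
print(passes_until_oom(history, free_target=40, oom_seq=12))  # None
```

With the small value the kills start on the third short pass. With the
larger value the same workload gets through, because enough memory is
freed before the count is reached.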


As for the messages as I understand them:

kernel: swap_pager: out of swap space

does not report a failure, just a limiting condition.

By contrast:

kernel: swp_pager_getswapspace(2): failed

reports a failure: the swap space allocation was necessary. It normally
leads to the likes of:

kernel: pid ??? (???), jid ???, uid ???, was killed: failed to reclaim
memory


It is possible to fail to reclaim memory despite swap space being
available. Even just one always-active process can keep so much memory
in the active category over such a duration that the vm.pageout_oom_seq
related process kills can start because the free RAM threshold was not met.


Overall: you were lucky in a marginal context. It is not some sort of
new guarantee of avoiding oom kills.


There are also the messages:

proc ??? (???) failed to alloc page on fault, starting OOM

pid ??? (???), jid ???, uid ???, was killed: a thread waited too long to
allocate a page
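
To tell these messages apart when going through a saved console log,
something like the Python sketch below could be used. The patterns are
taken from the messages quoted in this message. The category labels, the
function name, and the sample pid/command values are illustrative
assumptions, and the exact wording of the messages may differ between
FreeBSD versions.

```python
import re

# (pattern, label) pairs. The labels are illustrative shorthand only,
# not kernel terminology.
PATTERNS = [
    (re.compile(r"swap_pager: out of swap space"), "limit"),
    (re.compile(r"swp_pager_getswapspace\(\d+\): failed"), "swap-alloc-failure"),
    (re.compile(r"was killed: failed to reclaim memory"), "oom-kill"),
    (re.compile(r"was killed: a thread waited too long to allocate a page"),
     "oom-kill"),
    (re.compile(r"failed to alloc page on fault, starting OOM"), "oom-start"),
]


def classify(line):
    """Return the label of the first matching pattern, or None."""
    for pattern, label in PATTERNS:
        if pattern.search(line):
            return label
    return None


# Sample lines; the pid, command, jid and uid values are made up.
samples = [
    "kernel: swap_pager: out of swap space",
    "kernel: swp_pager_getswapspace(2): failed",
    "kernel: pid 1234 (c++), jid 0, uid 0, was killed: failed to reclaim memory",
]
for line in samples:
    print(classify(line), "<-", line)
```

Only the "oom-kill" lines mean that a process was actually lost. A log
showing nothing but "limit" lines is consistent with a build that
finished.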


-- 
===
Mark Millard
marklmi at yahoo.com

