Date:      Thu, 11 Jun 2026 08:51:17 -0700
From:      bob prohaska <fbsd@www.zefox.net>
To:        Mark Millard <marklmi@yahoo.com>
Cc:        freebsd-current@freebsd.org
Subject:   Re: Buildworld finishes despite swap exhaustion
Message-ID:  <airZdW54ZsQ8tjPR@www.zefox.net>
In-Reply-To: <040df279-5f61-4f4f-ae4a-79bd44797b53@yahoo.com>
References:  <aieNCJnCU3QfyJDV@www.zefox.net> <040df279-5f61-4f4f-ae4a-79bd44797b53@yahoo.com>


On Tue, Jun 09, 2026 at 08:22:02AM -0700, Mark Millard wrote:
> On 6/8/26 20:48, bob prohaska wrote:
> > Lately a Pi2B running buildworld reported an
> > exhaustion of swap, but buildworld kept running
> > and seemingly finished successfully.
> > 
> > The report came on the serial console, I didn't
> > find anything in the buildworld log. 
> > 
> > This seems a very great improvement. Swap exhaustion
> > differs from other sorts of failure, in that one can 
> > simply re-try the job with some hope of success when 
> > the workload is lighter.
> > 
> > Am I interpreting this correctly?
> 
> [Because the actual messages are not reported, I'm making some
> assumptions about the exact messages that you got.]
> 
> 
> Remember vm.pageout_oom_seq ?
> 

Yes, /boot/loader.conf contains:
vm.pageout_oom_seq="4096"
vm.pfault_oom_attempts="3"
#vm.pfault_oom_attempts="120"
vm.pfault_oom_wait="20"

I'll admit to not remembering how 4096 was chosen....
probably just a wild guess.
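
For reference, a minimal Python sketch of how the active (uncommented)
settings could be read back out of a loader.conf-style file. The function
name parse_loader_conf and the SAMPLE text are illustrative assumptions,
not part of any FreeBSD tool; the sample simply repeats the four lines
quoted above.

```python
def parse_loader_conf(text):
    """Return {name: value} for active name="value" lines.

    Assumption for this sketch: '#' always starts a comment, so values
    that themselves contain '#' are not handled.
    """
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        name, value = line.split("=", 1)
        settings[name.strip()] = value.strip().strip('"')
    return settings


# The four lines quoted from /boot/loader.conf above.
SAMPLE = '''\
vm.pageout_oom_seq="4096"
vm.pfault_oom_attempts="3"
#vm.pfault_oom_attempts="120"
vm.pfault_oom_wait="20"
'''

if __name__ == "__main__":
    for name, value in parse_loader_conf(SAMPLE).items():
        print(f"{name} = {value}")
```

The commented-out vm.pfault_oom_attempts="120" line is skipped, so only
the value 3 is reported for that tunable.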

> The larger that value used, the longer the system operates with the
> amount of free RAM below the target threshold: in other words, it makes
> more tries at getting to the threshold before giving up and starting to
> kill processes to get the free RAM.
> 
> Running out of swap of itself just means that SWAP can not be used to
> gain free RAM when such is not essential. RAM+SWAP can still be
> (marginally) sufficient over such a time if no memory allocations
> actually fail. If sufficient RAM/SWAP ends up being freed before
> vm.pageout_oom_seq related kills happen, no overall failure happens.
> 
> 
> As for the messages as I understand them:
> 
> kernel: swap_pager: out of swap space
> 
> does not report a failure, just a limiting condition.
> 
> By contrast:
> 
> kernel: swp_pager_getswapspace(2): failed
> 
> reports a failure: the swap space allocation was necessary. It normally
> leads to the likes of:
> 
> kernel: pid ??? (???), jid ???, uid ???, was killed: failed to reclaim
> memory

A more recent incident reported in /var/log/messages:

Jun  4 12:34:39 www kernel: swap_pager: out of swap space
Jun  4 12:34:39 www kernel: swp_pager_getswapspace(12): failed

but wasn't followed by a "...was killed..." message.
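
To check a log for the three kinds of message discussed above, something
like the following Python sketch might do. The labels (out_of_swap,
alloc_failed, kill) and the function names are my own assumptions; the
patterns are taken from the message texts quoted in this thread.

```python
import re

# Patterns copied from the message texts quoted in this thread.
PATTERNS = [
    ("kill", re.compile(r"was killed: failed to reclaim memory")),
    ("alloc_failed", re.compile(r"swp_pager_getswapspace\(\d+\): failed")),
    ("out_of_swap", re.compile(r"swap_pager: out of swap space")),
]


def classify(line):
    """Return a label for a swap-related kernel message, or None."""
    for label, pattern in PATTERNS:
        if pattern.search(line):
            return label
    return None


def summarize(lines):
    """Count how many lines fall under each label."""
    counts = {}
    for line in lines:
        label = classify(line)
        if label is not None:
            counts[label] = counts.get(label, 0) + 1
    return counts


# The two lines from /var/log/messages quoted above.
SAMPLE = [
    "Jun  4 12:34:39 www kernel: swap_pager: out of swap space",
    "Jun  4 12:34:39 www kernel: swp_pager_getswapspace(12): failed",
]

if __name__ == "__main__":
    print(summarize(SAMPLE))
```

On the two sample lines this counts one out_of_swap and one alloc_failed
entry and no kill entry, matching what was seen in the log.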

Eventually there appeared what look like repeated disk errors, ending with:

Jun 11 02:04:59 www kernel: (da0:umass-sim0:0:0:0): Info: 0
Jun 11 02:04:59 www kernel: (da0:umass-sim0:0:0:0): Retrying command (per sense data)
Jun 11 02:04:59 www kernel: (da0:umass-sim0:0:0:0): READ(10). CDB: 28 00 04 63 34 50 00 00 18 00
Jun 11 02:04:59 www kernel: (da0:umass-sim0:0:0:0): CAM status: SCSI Status Error
Jun 11 02:04:59 www kernel: (da0:umass-sim0:0:0:0): SCSI status: Check Condition
Jun 11 02:04:59 www kernel: (da0:umass-sim0:0:0:0): SCSI sense: MEDIUM ERROR asc:10,0 (ID CRC or ECC error)
Jun 11 02:04:59 www kernel: (da0:umass-sim0:0:0:0): Info: 0
Jun 11 02:04:59 www kernel: (da0:umass-sim0:0:0:0): Retrying command (per sense data)

which ended in a debugger prompt on the console. 
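
For anyone wanting to see which blocks the failing read touched, here is
a minimal Python sketch that decodes the READ(10) CDB from the log above.
It assumes the standard READ(10) layout (opcode 0x28 in byte 0, a
big-endian logical block address in bytes 2-5, a big-endian transfer
length in bytes 7-8); the function name decode_read10 is hypothetical.

```python
def decode_read10(cdb_hex):
    """Decode a READ(10) CDB given as a hex string.

    Returns (lba, blocks): the starting logical block address and the
    number of blocks requested.
    """
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a 10-byte READ(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")      # bytes 2..5
    blocks = int.from_bytes(cdb[7:9], "big")   # bytes 7..8
    return lba, blocks


if __name__ == "__main__":
    # CDB bytes as reported in the kernel message above.
    lba, blocks = decode_read10("28 00 04 63 34 50 00 00 18 00")
    print(f"LBA {lba} (0x{lba:08x}), {blocks} blocks")
```

For the CDB in the log this gives LBA 73610320 (0x04633450) and a
transfer length of 24 blocks.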

There was considerable network activity around the same time
which resembled an ssh attack.

The machine rebooted without incident; buildworld has been resumed with -j3.

If it happens again, I'll save a backtrace in case it's of interest.

Thanks for writing!

bob prohaska



