Date: Thu, 11 Jun 2026 11:57:07 -0700
From: Mark Millard <marklmi@yahoo.com>
To: bob prohaska <fbsd@www.zefox.net>
Cc: freebsd-current@freebsd.org
Subject: Re: Buildworld finishes despite swap exhaustion
Message-ID: <cedb9383-0b9f-4ad9-9953-e426d78f0483@yahoo.com>
In-Reply-To: <airZdW54ZsQ8tjPR@www.zefox.net>
References: <aieNCJnCU3QfyJDV@www.zefox.net> <040df279-5f61-4f4f-ae4a-79bd44797b53@yahoo.com> <airZdW54ZsQ8tjPR@www.zefox.net>

On 6/11/26 08:51, bob prohaska wrote:
> On Tue, Jun 09, 2026 at 08:22:02AM -0700, Mark Millard wrote:
>> On 6/8/26 20:48, bob prohaska wrote:
>>> Lately a Pi2B running buildworld reported an
>>> exhaustion of swap, but buildworld kept running
>>> and seemingly finished successfully.
>>>
>>> The report came on the serial console, I didn't
>>> find anything in the buildworld log.
>>>
>>> This seems a very great improvement. Swap exhaustion
>>> differs from other sorts of failure, in that one can
>>> simply re-try the job with some hope of success when
>>> the workload is lighter.
>>>
>>> Am I interpreting this correctly?
>>
>> [Because the actual messages are not reported, I'm making some
>> assumptions about the exact messages that you got.]
>>
>>
>> Remember vm.pageout_oom_seq ?
>>
>
> Yes, /boot/loader.conf contains:
> vm.pageout_oom_seq="4096"
> vm.pfault_oom_attempts="3"
> #vm.pfault_oom_attempts="120"
> vm.pfault_oom_wait="20"
>
> I'll admit to not remembering how 4096 was chosen....
> probably just a wild guess.
>
>> The larger that value used, the longer the system operates with the
>> amount of free RAM below the target threshold: in other words, it makes
>> more tries at getting to the threshold before giving up and starting to
>> kill processes to get the free RAM.
>>
>> Running out of swap of itself just means that SWAP can not be used to
>> gain free RAM when such is not essential. RAM+SWAP can still be
>> (marginally) sufficient over such a time if no memory allocations
>> actually fail. If sufficient RAM/SWAP ends up being freed before
>> vm.pageout_oom_seq related kills happen, no overall failure happens.
>>
>>
>> As for the messages as I understand them:
>>
>> kernel: swap_pager: out of swap space
>>
>> does not report a failure, just a limiting condition.
>>
>> By contrast:
>>
>> kernel: swp_pager_getswapspace(2): failed
>>
>> reports a failure: the swap space allocation was necessary. It normally
>> leads to the likes of:
>>
>> kernel: pid ??? (???), jid ???, uid ???, was killed: failed to reclaim
>> memory
>
> A more recent incident reported in /var/log/messages:
>
> Jun 4 12:34:39 www kernel: swap_pager: out of swap space
> Jun 4 12:34:39 www kernel: swp_pager_getswapspace(12): failed
>
> but wasn't followed by a "...was killed..." message.

Interesting. I've not had that combination as far as I know.
Now I know it is possible. Thanks.

>
> Eventually there appeared what look like repeated disk errors, ending with:
>
> Jun 11 02:04:59 www kernel: (da0:umass-sim0:0:0:0): Info: 0
> Jun 11 02:04:59 www kernel: (da0:umass-sim0:0:0:0): Retrying command (per sense data)
> Jun 11 02:04:59 www kernel: (da0:umass-sim0:0:0:0): READ(10). CDB: 28 00 04 63 34 50 00 00 18 00
> Jun 11 02:04:59 www kernel: (da0:umass-sim0:0:0:0): CAM status: SCSI Status Error
> Jun 11 02:04:59 www kernel: (da0:umass-sim0:0:0:0): SCSI status: Check Condition
> Jun 11 02:04:59 www kernel: (da0:umass-sim0:0:0:0): SCSI sense: MEDIUM ERROR asc:10,0 (ID CRC or ECC error)

The above looks like reporting of a drive problem. Getting to
be time for a replacement?

> Jun 11 02:04:59 www kernel: (da0:umass-sim0:0:0:0): Info: 0
> Jun 11 02:04:59 www kernel: (da0:umass-sim0:0:0:0): Retrying command (per sense data)
>
> which ended in a debugger prompt on the console.
>
> There was considerable network
> activity around the same time
> which resembled an ssh attack.

I'd guess that such was not likely to contribute to a false
"MEDIUM ERROR" with "(ID CRC or ECC error)".

>
> The machine rebooted without incident, buildworld has been resumed with -j3.
>
> If it happens again I'll save a backtrace if it'll be of interest.
>

--
===
Mark Millard
marklmi at yahoo.com
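
As a rough illustration of the three message classes distinguished above (a limiting condition, a failed swap allocation, and a kill), a log such as /var/log/messages could be sorted along the following lines. This is only a minimal Python sketch, not anything that ships with FreeBSD: the function names and the labels "limit-reached", "allocation-failed" and "kill" are made up for illustration, and the patterns are just the message texts quoted in this thread.

```python
import re

# Patterns are the kernel message texts quoted in this thread.
OUT_OF_SWAP = re.compile(r"swap_pager: out of swap space")
GETSWAPSPACE_FAILED = re.compile(r"swp_pager_getswapspace\(\d+\): failed")
OOM_KILL = re.compile(r"was killed: failed to reclaim memory")


def classify(line):
    """Return an illustrative (hypothetical) label for one log line."""
    if OOM_KILL.search(line):
        return "kill"               # a process was actually killed
    if GETSWAPSPACE_FAILED.search(line):
        return "allocation-failed"  # a needed swap allocation failed
    if OUT_OF_SWAP.search(line):
        return "limit-reached"      # limiting condition, not a failure
    return "other"


def summarize(lines):
    """Count how many lines fall into each label."""
    counts = {"limit-reached": 0, "allocation-failed": 0, "kill": 0, "other": 0}
    for line in lines:
        counts[classify(line)] += 1
    return counts


if __name__ == "__main__":
    # The two lines reported from /var/log/messages in this thread.
    sample = [
        "Jun 4 12:34:39 www kernel: swap_pager: out of swap space",
        "Jun 4 12:34:39 www kernel: swp_pager_getswapspace(12): failed",
    ]
    print(summarize(sample))
```

For the Jun 4 incident quoted above, such a summary would show one limit-reached line, one allocation-failed line and no kill, which is the combination noted as unusual.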
Want to link to this message? Use this
URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?cedb9383-0b9f-4ad9-9953-e426d78f0483>
