Date:      Sat, 4 Jan 2020 16:05:59 -0800
From:      Mark Millard <marklmi@yahoo.com>
To:        Wojciech Puchar <wojtek@puchar.net>, freebsd-hackers@freebsd.org
Subject:   Re: processes are killed because of out of swap space
Message-ID:  <C1C8F724-88B0-49D9-A9DF-DB0AA8AF3164@yahoo.com>
References:  <C1C8F724-88B0-49D9-A9DF-DB0AA8AF3164.ref@yahoo.com>


Wojciech Puchar wojtek at puchar.net wrote on
Sat Jan 4 22:35:35 UTC 2020 :

> when i try to use more virtual memory (tested by putting files to tmpfs 
> /tmp).
> 
> 
> like that
> 
> pid 16977 (bhyve), jid 0, uid 0, was killed: out of swap space

Unfortunately, the wording of this type of message is misleading
about what typically drives the kills. The kills are actually driven
by FreeBSD being unable to gain more free memory while free memory is
below its threshold. FreeBSD will not swap out processes that stay
runnable (or are running), only ones that are waiting. Even a single
process that stays runnable and keeps lots of RAM in the active
category can lead to kills when swap is unused or little used. So the
kill behavior is very workload dependent.

Real "out of swap" conditions (tend to?) also have messages
similar to:

Aug  5 17:54:01 sentinel kernel: swap_pager_getswapspace(32): failed

If you are not seeing such swap_pager_getswapspace messages, then
it is likely that the amount of swap space still available is not the
actual thing driving the kills.
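
As a rough way to apply the check above, here is a minimal sketch that
counts the two kinds of messages in a set of log lines. The patterns
are taken only from the two example lines quoted in this message; the
function name is hypothetical, and reading /var/log/messages is an
assumption about where the kernel messages end up on a given system.

```python
import re

# Patterns based only on the two message forms quoted in this thread.
KILL_RE = re.compile(r"pid (\d+) \(([^)]*)\).*was killed: out of swap space")
SWAP_FAIL_RE = re.compile(r"swap_pager_getswapspace\(\d+\): failed")


def summarize_oom_messages(lines):
    """Return (kills, swap_failures) for an iterable of log lines.

    kills is a list of (pid, command) tuples from "was killed" notices.
    swap_failures counts swap_pager_getswapspace failure messages.
    """
    kills = []
    swap_failures = 0
    for line in lines:
        match = KILL_RE.search(line)
        if match:
            kills.append((int(match.group(1)), match.group(2)))
        elif SWAP_FAIL_RE.search(line):
            swap_failures += 1
    return kills, swap_failures


sample = [
    "pid 16977 (bhyve), jid 0, uid 0, was killed: out of swap space",
    "Aug  5 17:54:01 sentinel kernel: swap_pager_getswapspace(32): failed",
]
kills, swap_failures = summarize_oom_messages(sample)
print(kills)          # [(16977, 'bhyve')]
print(swap_failures)  # 1
```

With a real log one might pass open("/var/log/messages") instead of
the sample list (path assumed). Kills reported with a swap failure
count of zero would fit the description above, where the amount of
swap space is not what drives the kills.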

Another thing that can lead to kills is paging I/O that is
slow.

> the problem is that it's less than 10GB swap used while i have 120GB 
> available.

That fits with the above comments.

> before processed begin to be killed system stalls for a while.


The below notes may or may not prove useful
for your context.

To lengthen how long low free RAM is tolerated
before kills start, one can increase
vm.pageout_oom_seq from 12 to a larger figure.
I've less experience with managing slow paging,
but I do have some notes about it below.

Examples follow that I use in contexts with
sufficient RAM that I do not have to worry about
out of swap/page space. These I've set in
/etc/sysctl.conf . (Of course, I'm not trying to
deliberately run out of RAM.)

#
# Delay when persistent low free RAM leads to
# Out Of Memory killing of processes:
vm.pageout_oom_seq=120

(I'll note that figures like 1024 or 1200 or
even more are possible. The figure controls how
many tries at regaining sufficient free RAM are
made before low free RAM is no longer tolerated.
After that, Out Of Memory kills start in order
to get some free RAM.)

#
# For plenty of swap/paging space (will not
# run out), avoid pageout delays leading to
# Out Of Memory killing of processes:
vm.pfault_oom_attempts=-1

(Note: In my context "plenty" really means
sufficient RAM that paging is rare. But
others have reported on using the -1 in
contexts where paging was heavy at times and
OOM kills had been happening that were
eliminated by the assignment.)
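
To check which of these settings a given /etc/sysctl.conf actually
carries, a minimal sketch follows. It assumes the file uses plain
name=value lines with # comments, as in the fragments above; the
function name and the sample text are hypothetical.

```python
def parse_sysctl_conf(text):
    """Return {name: value} for name=value lines, ignoring # comments."""
    settings = {}
    for raw in text.splitlines():
        # Drop anything after a '#', then surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        name, value = line.split("=", 1)
        settings[name.strip()] = value.strip()
    return settings


# Sample text modeled on the fragments in this message.
SAMPLE = """\
# Delay when persistent low free RAM leads to
# Out Of Memory killing of processes:
vm.pageout_oom_seq=120
vm.pfault_oom_attempts=-1
#vm.pfault_oom_wait= ???
"""

settings = parse_sysctl_conf(SAMPLE)
for name in ("vm.pageout_oom_seq",
             "vm.pfault_oom_attempts",
             "vm.pfault_oom_wait"):
    print(name, "=", settings.get(name, "(not set in this file)"))
```

A commented-out line such as the vm.pfault_oom_wait one is reported as
not set, which matches how such a line has no effect in the file.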

I've no experience with the below alternative
to that -1 use:

#
# For possibly insufficient swap/paging space
# (might run out), increase the pageout delay
# that leads to Out Of Memory killing of
# processes:
#vm.pfault_oom_attempts= ???
#vm.pfault_oom_wait= ???
# (The product of the two is the total delay,
# but there are other potential tradeoffs
# between the two factors, even for nearly the
# same total.)
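
The remark about the multiplication can be made concrete with a small
sketch. The figures used are hypothetical, not recommendations, and
treating vm.pfault_oom_wait as a number of seconds is an assumption
here; -1 is treated as "no limit", matching its use earlier in this
message.

```python
def pfault_oom_total_wait(attempts, wait):
    """Return attempts * wait, or None when attempts is -1 (no limit).

    Assumption: wait is in seconds, so the product is a total in seconds.
    """
    if attempts == -1:
        return None
    if attempts < 0 or wait < 0:
        raise ValueError("attempts must be -1 or >= 0, and wait >= 0")
    return attempts * wait


# Two hypothetical pairs with the same total but different factors:
print(pfault_oom_total_wait(3, 40))    # 120
print(pfault_oom_total_wait(12, 10))   # 120
print(pfault_oom_total_wait(-1, 10))   # None
```

Both pairs give the same total, but they split it differently between
the number of attempts and the wait per attempt, which is the kind of
tradeoff the comment above refers to.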



I'm not claiming that these 3 vm.???_oom_???
figures are always sufficient. Nor am I
claiming that tunables are always available
that would be sufficient. Nor that it is easy
to find the ones that do exist that might
help for specific OOM kill issues.

I have seen reports of OOM kills for other
reasons when both vm.pageout_oom_seq and
vm.pfault_oom_attempts=-1 were in use.
As I understand, FreeBSD did not report
what kind of condition led to the
decision to do an OOM kill.

(I do not remember the vm.pageout_oom_seq
figures from those reports, but no figure is
designed to make the delay unbounded. There
may be figures large enough that the bound
is, in effect, beyond any reasonable time to
wait for an OOM kill.)



===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)



