Date:      Tue, 23 Feb 2021 13:49:49 -0700
From:      Alan Somers <asomers@freebsd.org>
To:        FreeBSD Hackers <freebsd-hackers@freebsd.org>
Subject:   The out-of-swap killer makes poor choices
Message-ID:  <CAOtMX2jYmrK7ftx62_NEfNCWS7O=giHKL1p9kXCqq1t5E1arxA@mail.gmail.com>

To me it's always seemed like the out-of-swap killer kills the wrong
process.  Oh, it does the right thing with a trivial while(1) {malloc()}
test program, but not with real workloads.  To summarize the logic in
vm_pageout_oom:

* Don't kill system, protected, or killed processes
* Don't kill processes with a thread that isn't running or suspended
* Kill whichever process is using the most swap or swap + ram, depending on
the shortage variable.  On ties, kill the newest one.
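As a rough illustration of that summary, here is a user-space sketch of the
selection logic. It is not the kernel code: struct model_proc, its fields,
the enum names, and pick_victim are all hypothetical, and it assumes the
process list is walked newest-first, so that keeping the first of several
equal candidates means ties go to the newest process.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a process; not the kernel's struct proc. */
struct model_proc {
	int pid;
	bool exempt;              /* system, protected, or already killed */
	bool threads_eligible;    /* every thread is running or suspended */
	unsigned long ram_pages;  /* resident pages */
	unsigned long swap_pages; /* pages currently in swap */
};

/* SHORT_SWAP models shortage == VM_OOM_SWAPZ; SHORT_MEM models the RAM case. */
enum model_shortage { SHORT_MEM, SHORT_SWAP };

/*
 * Return the process the summarized algorithm would kill, or NULL if no
 * process is eligible.  procs[] is assumed to be ordered newest-first.
 */
static const struct model_proc *
pick_victim(const struct model_proc *procs, size_t n, enum model_shortage why)
{
	const struct model_proc *victim = NULL;
	unsigned long biggest = 0;

	for (size_t i = 0; i < n; i++) {
		const struct model_proc *p = &procs[i];
		unsigned long size;

		if (p->exempt || !p->threads_eligible)
			continue;

		/* Swap always counts; RAM counts only for a RAM shortage. */
		size = p->swap_pages;
		if (why == SHORT_MEM)
			size += p->ram_pages;

		/* Strictly-greater comparison: on ties the earlier (newer) entry wins. */
		if (victim == NULL || size > biggest) {
			victim = p;
			biggest = size;
		}
	}
	return victim;
}
```

In this model, a busy process holding 900 pages of RAM and no swap loses to
an idle process holding 10 pages of RAM and 50 of swap whenever the shortage
is SHORT_SWAP, and if nobody has any swap the newest eligible process is
chosen regardless of its size.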

This algorithm probably made sense in the days when computers had much more
swap than RAM.  But now it leads to several problems:

* It's almost guaranteed to do the wrong thing when shortage ==
VM_OOM_SWAPZ and there is little or no swap configured.  If no swap is
configured, it will kill the newest running or suspended process.  If a
little bit is configured, it will probably kill some idle process, like
zfsd, that is swapped out because it doesn't run very often.

* Even if multiple GB of swap are configured, the OOM killer is still
biased towards killing idle processes when shortage == VM_OOM_SWAPZ.  Most
often, the process responsible for an out-of-memory condition is not idle,
and is consuming large amounts of RAM.

* It ignores RLIMIT_RSS, even though we already consider that rlimit when
deciding whether to move a process from RAM to swap.

* The "out of swap space" kernel message doesn't say whether the process
was killed for lack of swap or lack of RAM (that is, the value of the
shortage variable).

I propose the following changes:

* Incorporate shortage into the "out of swap space" message.
* When walking the process list, if any process exceeds its RLIMIT_RSS,
choose it immediately, without bothering to compare it to older processes.
* Always consider the sum of a process's RAM + swap, regardless of the
shortage variable.
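
A minimal user-space sketch of the last two proposed changes, under the same
kind of assumptions as any model of this code: struct model_proc, its fields,
and pick_victim_proposed are hypothetical names, rss_limit_pages == 0 is an
invented encoding for "no RLIMIT_RSS set", and the list is assumed to be
walked newest-first. The message change is not modeled, since it does not
affect which process is chosen.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a process; not the kernel's struct proc. */
struct model_proc {
	int pid;
	bool exempt;                   /* system, protected, or already killed */
	bool threads_eligible;         /* every thread is running or suspended */
	unsigned long ram_pages;       /* resident pages */
	unsigned long swap_pages;      /* pages currently in swap */
	unsigned long rss_limit_pages; /* RLIMIT_RSS in pages; 0 = unlimited (assumed encoding) */
};

/*
 * Proposed selection: a process over its RLIMIT_RSS is chosen on sight;
 * otherwise the largest RAM + swap footprint wins, whatever the shortage.
 * procs[] is assumed to be ordered newest-first.
 */
static const struct model_proc *
pick_victim_proposed(const struct model_proc *procs, size_t n)
{
	const struct model_proc *victim = NULL;
	unsigned long biggest = 0;

	for (size_t i = 0; i < n; i++) {
		const struct model_proc *p = &procs[i];
		unsigned long size;

		if (p->exempt || !p->threads_eligible)
			continue;

		/* Over its own limit: stop walking and pick it immediately. */
		if (p->rss_limit_pages != 0 && p->ram_pages > p->rss_limit_pages)
			return p;

		/* Always RAM + swap, independent of the shortage variable. */
		size = p->ram_pages + p->swap_pages;
		if (victim == NULL || size > biggest) {
			victim = p;
			biggest = size;
		}
	}
	return victim;
}
```

With this rule, an idle process that happens to be swapped out no longer
outranks a process holding far more RAM, and a process that has exceeded its
own RLIMIT_RSS is picked ahead of a larger process that has not.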

Does this make sense?  Am I missing something about shortage ==
VM_OOM_SWAPZ?  I don't understand why you would ever want to exclude
processes' RAM usage.  That logic was added in revision
2025d69ba7a68a5af173007a8072c45ad797ea23, but I don't understand the
rationale.

-Alan