Date: Tue, 23 Feb 2021 14:20:21 -0700
From: Alan Somers <asomers@freebsd.org>
To: Konstantin Belousov <kostikbel@gmail.com>
Cc: FreeBSD Hackers <freebsd-hackers@freebsd.org>
Subject: Re: The out-of-swap killer makes poor choices
Message-ID: <CAOtMX2jeyuM_cEygW=vEjhMSqO1jM2UDs29xnYYkCZN2CLKFxA@mail.gmail.com>
In-Reply-To: <YDVvenUpLMhGoLR4@kib.kiev.ua>
References: <CAOtMX2jYmrK7ftx62_NEfNCWS7O=giHKL1p9kXCqq1t5E1arxA@mail.gmail.com> <YDVvenUpLMhGoLR4@kib.kiev.ua>
On Tue, Feb 23, 2021 at 2:11 PM Konstantin Belousov <kostikbel@gmail.com> wrote:

> On Tue, Feb 23, 2021 at 01:49:49PM -0700, Alan Somers wrote:
> > To me it's always seemed like the out-of-swap killer kills the wrong
> > process.  Oh, it does the right thing with a trivial while(1) {malloc()}
> > test program, but not with real workloads.  To summarize the logic in
> > vm_pageout_oom:
> >
> > * Don't kill system, protected, or killed processes
> > * Don't kill processes with a thread that isn't running or suspended
> > * Kill whichever process is using the most swap or swap + ram, depending
> >   on the shortage variable.  On ties, kill the newest one.
> >
> > This algorithm probably made sense in the days when computers had much
> > more swap than RAM.  But now it leads to several problems:
> >
> > * It's almost guaranteed to do the wrong thing when shortage ==
> >   VM_OOM_SWAPZ and there is little or no swap configured.  If no swap is
> >   configured, it will kill the newest running or suspended process.  If a
> >   little bit is configured, it will probably kill some idle process, like
> >   zfsd, that is swapped out because it doesn't run very often.
> >
> > * Even if multiple GB of swap are configured, the OOM killer is still
> >   biased towards killing idle processes when shortage == VM_OOM_SWAPZ.
> >   Most often, the process responsible for an out-of-memory condition is
> >   not idle, and is consuming large amounts of RAM.
> >
> > * It ignores RLIMIT_RSS.  We consider that rlimit when deciding whether
> >   to move a process from RAM to swap.
> >
> > * The "out of swap space" kernel message doesn't specify whether the
> >   process was killed because of insufficient swap or RAM (the shortage
> >   variable)
> >
> > I propose the following changes:
> >
> > * Incorporate shortage into the "out of swap space" message.
> ok with me, not sure if users could make any action based on discretion
>
> > * When walking the process list, if any process exceeds its RLIMIT_RSS,
> >   choose it immediately, without bothering to compare it to older
> >   processes.
> RSS was never supposed to be a limit on how many pages are resident.
> It only provided some preference for more aggressive paging out process'
> pages.
>
> Or put it differently, RSS is not supposed to be the working set size
> in VMS/NT sense.
>

Sure, but given that we must kill _something_, preferentially killing a
process that was specifically limited sounds better than killing a process
that wasn't, won't you agree?

> > * Always consider the sum of a process's RAM + swap, regardless of the
> >   shortage variable.
> >
> > Does this make sense?  Am I missing something about shortage ==
> > VM_OOM_SWAPZ?  I don't understand why you would ever want to exclude
> > processes' RAM usage.  That logic was added in revision
> > 2025d69ba7a68a5af173007a8072c45ad797ea23, but I don't understand the
> > rationale.
>
> SWAPZ means that swap zone is exhausted.  In this case, killing a process
> that does not use swap, would not free any space in the zone.  Similarly,
> we should select a process with largest swap (== metadata kept in swap
> zone) use to free something in swap zone.
>

But killing a process that does not use swap could reduce the need for more
swap by other processes.  How many cases are there where a process needs
more SWAP and won't settle for RAM instead?

> In other words, such kill could be not enough and really require more and
> more rounds of OOM, esp. on machine with very small swap configured.
>
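
For illustration only, the two selection policies discussed in this thread can be sketched as a small user-space C model. This is not the code in vm_pageout_oom. The struct proc_info type, the OOM_MEM / OOM_SWAPZ enum, and the functions pick_current() and pick_proposed() are hypothetical names invented for the sketch, and the array is assumed to be ordered oldest to newest. The model covers only the rules summarized above: skip ineligible processes, rank by swap (or swap + RAM), prefer the newest on ties, and, in the proposed variant, choose a process over its RLIMIT_RSS immediately.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; the kernel's own constants and struct proc differ. */
enum oom_shortage { OOM_MEM, OOM_SWAPZ };

struct proc_info {
	int    pid;
	bool   exempt;      /* system, protected, or already killed */
	bool   threads_ok;  /* every thread is running or suspended */
	size_t rss_pages;   /* resident pages */
	size_t swap_pages;  /* swapped-out pages */
	size_t rss_limit;   /* RLIMIT_RSS in pages, 0 means unlimited */
};

static bool
eligible(const struct proc_info *p)
{
	return (!p->exempt && p->threads_ok);
}

/*
 * Model of the logic as summarized in the thread: rank by swap only when
 * the swap zone is exhausted, otherwise by swap + RAM.  The walk goes from
 * newest to oldest and uses a strict comparison, so the newest process
 * wins ties.
 */
const struct proc_info *
pick_current(const struct proc_info *procs, size_t n, enum oom_shortage shortage)
{
	const struct proc_info *victim = NULL;
	size_t best = 0;

	assert(procs != NULL || n == 0);
	for (size_t i = n; i-- > 0; ) {
		const struct proc_info *p = &procs[i];
		size_t size;

		if (!eligible(p))
			continue;
		size = p->swap_pages;
		if (shortage == OOM_MEM)
			size += p->rss_pages;
		if (victim == NULL || size > best) {
			victim = p;
			best = size;
		}
	}
	return (victim);
}

/*
 * Model of the proposed logic: a process over its RLIMIT_RSS is chosen
 * immediately; otherwise always rank by RAM + swap.
 */
const struct proc_info *
pick_proposed(const struct proc_info *procs, size_t n)
{
	const struct proc_info *victim = NULL;
	size_t best = 0;

	assert(procs != NULL || n == 0);
	for (size_t i = n; i-- > 0; ) {
		const struct proc_info *p = &procs[i];
		size_t size;

		if (!eligible(p))
			continue;
		if (p->rss_limit != 0 && p->rss_pages > p->rss_limit)
			return (p);
		size = p->rss_pages + p->swap_pages;
		if (victim == NULL || size > best) {
			victim = p;
			best = size;
		}
	}
	return (victim);
}
```

With a mostly swapped-out idle daemon and a RAM-heavy process in the list, pick_current() under OOM_SWAPZ selects the idle daemon, while pick_proposed() selects the RAM-heavy process. That is the bias described above. The sketch does not model the counter-argument that killing a process with no swap usage frees nothing in the swap zone.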