Date: Tue, 23 Feb 2021 23:11:22 +0200
From: Konstantin Belousov
To: Alan Somers
Cc: FreeBSD Hackers
Subject: Re: The out-of-swap killer makes poor choices

On Tue, Feb 23, 2021 at 01:49:49PM -0700, Alan Somers wrote:
> To me it's always seemed like the out-of-swap killer kills the wrong
> process.  Oh, it does the right thing with a trivial while(1) {malloc()}
> test program, but not with real workloads.  To summarize the logic in
> vm_pageout_oom:
>
> * Don't kill system, protected, or killed processes
> * Don't kill processes with a thread that isn't running or suspended
> * Kill whichever process is using the most swap or swap + ram, depending on
> the shortage variable.  On ties, kill the newest one.
>
> This algorithm probably made sense in the days when computers had much more
> swap than RAM.
> But now it leads to several problems:
>
> * It's almost guaranteed to do the wrong thing when shortage ==
> VM_OOM_SWAPZ and there is little or no swap configured.  If no swap is
> configured, it will kill the newest running or suspended process.  If a
> little bit is configured, it will probably kill some idle process, like
> zfsd, that is swapped out because it doesn't run very often.
>
> * Even if multiple GB of swap are configured, the OOM killer is still
> biased towards killing idle processes when shortage == VM_OOM_SWAPZ.  Most
> often, the process responsible for an out-of-memory condition is not idle,
> and is consuming large amounts of RAM.
>
> * It ignores RLIMIT_RSS.  We consider that rlimit when deciding whether to
> move a process from RAM to swap.
>
> * The "out of swap space" kernel message doesn't specify whether the
> process was killed because of insufficient swap or RAM (the shortage
> variable)
>
> I propose the following changes:
>
> * Incorporate shortage into the "out of swap space" message.

OK with me, though I am not sure users could take any action based on
that information.

> * When walking the process list, if any process exceeds its RLIMIT_RSS,
> choose it immediately, without bothering to compare it to older processes.

RSS was never supposed to be a limit on how many pages are resident.  It
only provides a preference for paging out the process's pages more
aggressively.  Or, to put it differently, RSS is not supposed to be the
working set size in the VMS/NT sense.

> * Always consider the sum of a process's RAM + swap, regardless of the
> shortage variable.
>
> Does this make sense?  Am I missing something about shortage ==
> VM_OOM_SWAPZ?  I don't understand why you would ever want to exclude
> processes' RAM usage.  That logic was added in revision
> 2025d69ba7a68a5af173007a8072c45ad797ea23, but I don't understand the
> rationale.

SWAPZ means that the swap zone is exhausted.  In this case, killing a
process that does not use swap would not free any space in the zone.
Similarly, we should select the process with the largest swap use (==
metadata kept in the swap zone) to free something in the swap zone.  In
other words, a kill chosen any other way might not be enough and could
require more and more rounds of OOM, esp. on a machine with very small
swap configured.
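
The selection rule summarized in the quoted message, together with the
reason SWAPZ counts only swap, can be modeled in user space roughly as
follows.  This is a minimal sketch and not the code in vm_pageout_oom:
struct proc_model, enum oom_shortage, pick_victim() and every field name
are hypothetical, the sizes are plain page counts, and the array is
assumed to be ordered from oldest to newest process.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the kernel's types. */
enum oom_shortage { OOM_PAGES, OOM_SWAPZ };

struct proc_model {
	int	pid;
	bool	exempt;		/* system, protected, or already killed */
	bool	run_or_susp;	/* every thread is running or suspended */
	size_t	swap_pages;	/* pages tracked in the swap zone */
	size_t	resident_pages;	/* pages resident in RAM */
};

/*
 * Return the index of the process to kill, or -1 if none qualifies.
 * Assumes procs[] is ordered oldest to newest, so that ">=" lets the
 * newest process win a tie.
 */
static int
pick_victim(const struct proc_model *procs, size_t n,
    enum oom_shortage shortage)
{
	size_t best_size = 0, i, size;
	int best = -1;

	assert(procs != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (procs[i].exempt || !procs[i].run_or_susp)
			continue;
		/* Swap use always counts. */
		size = procs[i].swap_pages;
		/* RAM counts only when the shortage is not in the swap zone. */
		if (shortage == OOM_PAGES)
			size += procs[i].resident_pages;
		if (best == -1 || size >= best_size) {
			best = (int)i;
			best_size = size;
		}
	}
	return (best);
}
```

With a mostly swapped-out idle process and a RAM-heavy process that uses
no swap, this model picks the idle one under OOM_SWAPZ and the RAM-heavy
one under OOM_PAGES, which is the behavior both messages describe: the
SWAPZ choice frees swap-zone metadata but leaves the large RAM consumer
alone.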