From: Mark Millard
Subject: Re: The out-of-swap killer makes poor choices
Date: Tue, 23 Feb 2021 13:23:56 -0800
To: asomers@freebsd.org, FreeBSD Hackers
Message-Id: <93DA798A-1109-48B0-AD5E-063B5A182BFB@yahoo.com>
List-Id: Technical discussions relating to FreeBSD

Alan Somers
asomers at freebsd.org wrote on Tue Feb 23 20:50:02 UTC 2021:

. . .
> * The "out of swap space" kernel message doesn't specify whether the
>   process was killed because of insufficient swap or RAM (the shortage
>   variable)
. . .

I'm only dealing with the "why" notifications part of the Email.

I'm not sure your notes give complete coverage, although I can not
claim to fully understand the implications of the below. So, just
some things to consider in that area . . .

When I looked at the code I found 4 things that lead to the same
"out of swap space" messages, none of which seemed to need a
"swap_pager_getswapspace(...): failed" to be involved:

Sustained low free RAM (via stays-runnable processes).
A sufficiently delayed pageout.
The swap blk uma zone was exhausted.
The swap pctrie uma zone was exhausted.

(I run a modified kernel that reports which of the 4 initiated
the OOM. I depend on the "swap_pager_getswapspace(...): failed"
notices to detect actual out of swap/paging space conditions.)

The first 2 of the 4 above have some tunables:

#
# Delay when persistent low free RAM leads to
# Out Of Memory killing of processes. The
# delay is a count of kernel-attempts to gain
# free RAM (so not time units).
vm.pageout_oom_seq=120

(The default is 12 as far as I know. I systematically use the
above value.)

NOTE: stable/12 -r351776 got the support for the following:
(I've not checked the match to releases.)

#
# For plenty of swap/paging space (will not
# run out), avoid pageout delays leading to
# Out Of Memory killing of processes:
vm.pfault_oom_attempts=-1

(I systematically use the above value, but only where I strongly
expect that I will not actually run out of swap space.)
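
To make the effect of vm.pfault_oom_attempts more concrete, here is a
minimal sketch of how it combines with vm.pfault_oom_wait (which
"sysctl -d" describes as the number of seconds to wait before retrying
the page fault handler). The function name pfault_oom_total and the
sample values are hypothetical, chosen only for illustration. The sketch
assumes a negative attempts value means "keep retrying", as the -1
setting above suggests. It does plain arithmetic and does not read or
change any sysctl.

```shell
#!/bin/sh
# Hypothetical helper (not part of FreeBSD): approximate how long, in
# seconds, page faults keep waiting for free pages before OOM handling
# is triggered, given the two tunables.
#   $1: value for vm.pfault_oom_attempts
#   $2: value for vm.pfault_oom_wait
pfault_oom_total() {
    attempts=$1
    wait_secs=$2
    if [ "$attempts" -lt 0 ]; then
        # Assumption: a negative count means the limit is never reached.
        echo "unlimited"
    else
        echo $((attempts * wait_secs))
    fi
}

# Illustrative values only: two factorings with the same total.
pfault_oom_total 3 10
pfault_oom_total 6 5
# The "will not run out of swap" style of setting.
pfault_oom_total -1 10
```

The first two calls give the same total from different factors: fewer,
longer waits versus more, shorter ones.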

vm.pfault_oom_attempts has an alternative form for when running out
of swap is a concern, as I understand what I've been told/read
(replace the ???'s with positive integers):

#
# For possibly insufficient swap/paging space
# (might run out), increase the pageout delay
# that leads to Out Of Memory killing of
# processes:
#vm.pfault_oom_attempts= ???
#vm.pfault_oom_wait= ???
# (The product of the two values is the
# total, but there are other potential tradeoffs
# in the factors multiplied for the same total.)

For reference:

# sysctl -d vm.pfault_oom_wait
vm.pfault_oom_wait: Number of seconds to wait for free pages before retrying the page fault handler
# sysctl -d vm.pfault_oom_attempts
vm.pfault_oom_attempts: Number of page allocation attempts in page fault handler before it triggers OOM handling
# sysctl -d vm.pageout_oom_seq
vm.pageout_oom_seq: back-to-back calls to oom detector to start OOM

I'm definitely not a fan of the misleading "out of swap space"
notices. They greatly misled me until Mark J. got involved and
did some basic fixing of my understanding of the context,
including pointing out vm.pageout_oom_seq at the time.
(vm.pfault_oom_attempts and vm.pfault_oom_wait are from a
later discovery.)

It seems that the detailed reason for the OOM helps drive the
appropriate future configuration choices for avoiding or managing
things. In my view the specific reason should be explicitly
reported. If nothing else, it provides context to include in a
question to the lists about what is then appropriate to do, given
the occurrence observed.

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away in
early 2018-Mar)