From: Mark Millard
Date: Mon, 27 Jan 2020 21:29:52 -0800
To: bob prohaska
Cc: freebsd-arm@freebsd.org, freebsd-current@freebsd.org
Subject: Re: OOMA kill with vm.pfault_oom_attempts="-1" on RPi3 at r357147
Message-Id: <18150258-6210-451E-A5B9-528129A05974@yahoo.com>
In-Reply-To: <20200128035317.GA12644@www.zefox.net>
References: <20200127190709.GA11328@www.zefox.net> <20200128035317.GA12644@www.zefox.net>

On 2020-Jan-27, at 19:53, bob prohaska wrote:

> On Mon, Jan 27, 2020 at 06:22:20PM -0800, Mark Millard wrote:
>>
>> So far as I know, in the past progress was only made when someone
>> already knowledgable got involved in isolating what was happening
>> and how to control it.
>>
> Indeed. One can only hope said knowledgeables are reading....

Maybe I can suggest something that might kick-start evidence
gathering a little bit: add 4 unconditional printf's to the
kernel code, each just before one of the vm_pageout_oom(. . .)
calls. Have each message uniquely identify which of the 4 calls
it precedes.

The details of what I found that suggested this follow.

I found:

#define VM_OOM_MEM      1
#define VM_OOM_MEM_PF   2
#define VM_OOM_SWAPZ    3

In vm_fault(. . .) :

. . .
        if (vm_pfault_oom_attempts < 0 ||
            oom < vm_pfault_oom_attempts) {
                oom++;
                vm_waitpfault(dset, vm_pfault_oom_wait * hz);
                goto RetryFault_oom;
        }
        if (bootverbose)
                printf(
    "proc %d (%s) failed to alloc page on fault, starting OOM\n",
                    curproc->p_pid, curproc->p_comm);
        vm_pageout_oom(VM_OOM_MEM_PF);
. . .

(I'd not have guessed that bootverbose would control messages
about OOM activity.)

The above call looks to be blocked by the
vm.pfault_oom_attempts="-1" setting that we have been using:
with a negative value the code always takes the retry path.

In vm_pageout_mightbe_oom(. . .) :

. . .
        if (starting_page_shortage <= 0 || starting_page_shortage !=
            page_shortage)
                vmd->vmd_oom_seq = 0;
        else
                vmd->vmd_oom_seq++;
        if (vmd->vmd_oom_seq < vm_pageout_oom_seq) {
                if (vmd->vmd_oom) {
                        vmd->vmd_oom = FALSE;
                        atomic_subtract_int(&vm_pageout_oom_vote, 1);
                }
                return;
        }

        /*
         * Do not follow the call sequence until OOM condition is
         * cleared.
         */
        vmd->vmd_oom_seq = 0;

        if (vmd->vmd_oom)
                return;

        vmd->vmd_oom = TRUE;
        old_vote = atomic_fetchadd_int(&vm_pageout_oom_vote, 1);
        if (old_vote != vm_ndomains - 1)
                return;

        /*
         * The current pagedaemon thread is the last in the quorum to
         * start OOM.  Initiate the selection and signaling of the
         * victim.
         */
        vm_pageout_oom(VM_OOM_MEM);

        /*
         * After one round of OOM terror, recall our vote.  On the
         * next pass, current pagedaemon would vote again if the low
         * memory condition is still there, due to vmd_oom being
         * false.
         */
        vmd->vmd_oom = FALSE;
        atomic_subtract_int(&vm_pageout_oom_vote, 1);
. . .

The above is where the other setting we have been using
(vm_pageout_oom_seq in the code) extends the number of tries
before doing the OOM kill. If the rate of attempts has increased,
would less time go by for the same figure? Might this case still
be happening, even for the > 4000 figure used on the 5 GiByte
amd64 system with the i386 jail that was reported?

As things are, there is no specific printf in the above.

In swp_pager_meta_build(. . .) :

. . .
        if (uma_zone_exhausted(swblk_zone)) {
                if (atomic_cmpset_int(&swblk_zone_exhausted,
                    0, 1))
                        printf("swap blk zone exhausted, "
                            "increase kern.maxswzone\n");
                vm_pageout_oom(VM_OOM_SWAPZ);
                pause("swzonxb", 10);
        } else
                uma_zwait(swblk_zone);
. . .
        if (uma_zone_exhausted(swpctrie_zone)) {
                if (atomic_cmpset_int(&swpctrie_zone_exhausted,
                    0, 1))
                        printf("swap pctrie zone exhausted, "
                            "increase kern.maxswzone\n");
                vm_pageout_oom(VM_OOM_SWAPZ);
                pause("swzonxp", 10);
        } else
                uma_zwait(swpctrie_zone);
. . .

The above is something we have not been controlling: uma zone
exhaustion for swblk_zone and swpctrie_zone. (Not that I'm
familiar with them or the rest of this material.) On a small
memory machine, there may be nothing that can be directly done
that does not have other, nasty tradeoffs.

Of course, there might be reasons that one or both of these
exhaust faster than they used to.

There are the 2 printf messages, but they are conditional.
Still, they give something else to look for in console or log
output.

One possibility is always having an unconditional printf just
before each of the 4 vm_pageout_oom calls, each of which
identifies which of the 4 contexts is making the call. That
would at least be a start at figuring things out.
(Both calls in swp_pager_meta_build pass VM_OOM_SWAPZ, so the
argument to vm_pageout_oom alone does not distinguish those two
contexts.)

The vm_pageout_oom(. . .) routine has:

. . .
        if (bigproc != NULL) {
                if (vm_panic_on_oom != 0)
                        panic("out of swap space");
                PROC_LOCK(bigproc);
                killproc(bigproc, "out of swap space");
                sched_nice(bigproc, PRIO_MIN);
                _PRELE(bigproc);
                PROC_UNLOCK(bigproc);
        }
. . .

That is where the can-be-a-misnomer "out of swap space" comes
from. It looks to be correct for some conditions, but not for
the conditions we have historically had in our contexts. It takes
looking at other messages to figure out if it is a misnomer:
another type of message carries the actual out-of-swap
information, and if that message is not present then the one
based on the above is a misnomer.

vm_pageout_oom could use its argument to be somewhat more
specific in the text it passes to killproc(. . .).

For reference:

# grep -r "VM_OOM_" /usr/src/sys/ | more
/usr/src/sys/vm/vm_fault.c:     vm_pageout_oom(VM_OOM_MEM_PF);
/usr/src/sys/vm/vm_pageout.c:   vm_pageout_oom(VM_OOM_MEM);
/usr/src/sys/vm/vm_pageout.c:   if (shortage == VM_OOM_MEM_PF &&
/usr/src/sys/vm/vm_pageout.c:   if (shortage == VM_OOM_MEM || shortage == VM_OOM_MEM_PF)
/usr/src/sys/vm/swap_pager.c:   vm_pageout_oom(VM_OOM_SWAPZ);
/usr/src/sys/vm/swap_pager.c:   vm_pageout_oom(VM_OOM_SWAPZ);
/usr/src/sys/vm/vm_pageout.h:#define VM_OOM_MEM         1
/usr/src/sys/vm/vm_pageout.h:#define VM_OOM_MEM_PF      2
/usr/src/sys/vm/vm_pageout.h:#define VM_OOM_SWAPZ       3

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away in early 2018-Mar)
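
P.S. A minimal sketch of the "identify the caller" idea, written
as ordinary userland C so it can be compiled and tried outside
the kernel. It is not a patch: the helper names oom_reason_text
and oom_format_callsite, the call-site label, and the message
wording are all hypothetical. Only the three VM_OOM_* values are
taken from vm_pageout.h as quoted earlier.

```c
#include <stdio.h>

/* Values as quoted from sys/vm/vm_pageout.h in this message. */
#define VM_OOM_MEM      1
#define VM_OOM_MEM_PF   2
#define VM_OOM_SWAPZ    3

/*
 * Hypothetical helper (not in FreeBSD): map the vm_pageout_oom()
 * argument to text that is more specific than "out of swap space".
 * The wording is illustrative only.
 */
const char *
oom_reason_text(int shortage)
{
        switch (shortage) {
        case VM_OOM_MEM:
                return ("page daemon could not free enough memory");
        case VM_OOM_MEM_PF:
                return ("page fault could not allocate a page");
        case VM_OOM_SWAPZ:
                return ("swap metadata zone exhausted");
        default:
                return ("unknown OOM reason");
        }
}

/*
 * Hypothetical helper: build the line an unconditional printf could
 * emit just before a vm_pageout_oom() call.  The caller supplies a
 * label, since two of the four call sites share VM_OOM_SWAPZ.
 */
int
oom_format_callsite(char *buf, size_t len, const char *site, int shortage)
{
        return (snprintf(buf, len, "vm_pageout_oom(%d) from %s: %s",
            shortage, site, oom_reason_text(shortage)));
}
```

In a kernel experiment the equivalent would presumably be a plain
printf with a distinct label at each of the 4 call sites. The
label is what separates the swblk_zone and swpctrie_zone cases.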