From: Mark Millard
Date: Tue, 28 Jan 2020 11:28:14 -0800
To: bob prohaska, Konstantin Belousov
Cc: freebsd-arm, FreeBSD Current
Subject: Re: OOMA kill with vm.pfault_oom_attempts="-1" on RPi3 at r357147 (a vm_pfault_oom_attempts < 0 handling bug as of head -r357026)
Message-Id: <94E68249-7751-4B27-AE95-E9C2776D730B@yahoo.com>
In-Reply-To: <20200128190210.GA14784@www.zefox.net>
List-Id: Discussions about the use of FreeBSD-current

On 2020-Jan-28, at 11:02, bob prohaska wrote:

> On Tue, Jan 28, 2020 at 09:42:17AM -0800, Mark Millard wrote:
>>
> The (partly) modified kernel compiled and booted without
> obvious trouble. It's trying to finish buildworld now.
>
>> If you are testing with vm.pfault_oom_attempts="-1" then
>> the vm_fault printf message should never happen anyway.
>>
> Would it not be interesting if the message appeared in that
> case?

Thanks for the question: looking at the new code, I found a bug
that causes OOM kills where they used to be avoided in head
-r357025 and before.

After vm_waitpfault(dset, vm_pfault_oom_wait * hz) the -r357026
code does a vm_pageout_oom(VM_OOM_MEM_PF) no matter what, even
when:

    vm_pfault_oom_attempts < 0 || fs->oom < vm_pfault_oom_attempts

New code in head -r357026 (nothing to avoid the
vm_pageout_oom(VM_OOM_MEM_PF) for
vm_pfault_oom_attempts < 0 || fs->oom < vm_pfault_oom_attempts):

    if (fs->m == NULL) {
        unlock_and_deallocate(fs);
        if (vm_pfault_oom_attempts < 0 ||
            fs->oom < vm_pfault_oom_attempts) {
            fs->oom++;
            vm_waitpfault(dset, vm_pfault_oom_wait * hz);
        }
        if (bootverbose)
            printf(
    "proc %d (%s) failed to alloc page on fault, starting OOM\n",
                curproc->p_pid, curproc->p_comm);
        vm_pageout_oom(VM_OOM_MEM_PF);
        return (KERN_RESOURCE_SHORTAGE);
    }

Old code in head -r357025 (has the goto RetryFault_oom after
vm_waitpfault(. . .), thereby avoiding the
vm_pageout_oom(VM_OOM_MEM_PF) for
vm_pfault_oom_attempts < 0 || oom < vm_pfault_oom_attempts):

    if (fs.m == NULL) {
        unlock_and_deallocate(&fs);
        if (vm_pfault_oom_attempts < 0 ||
            oom < vm_pfault_oom_attempts) {
            oom++;
            vm_waitpfault(dset, vm_pfault_oom_wait * hz);
            goto RetryFault_oom;
        }
        if (bootverbose)
            printf(
    "proc %d (%s) failed to alloc page on fault, starting OOM\n",
                curproc->p_pid, curproc->p_comm);
        vm_pageout_oom(VM_OOM_MEM_PF);
        goto RetryFault;
    }

I expect this is the source of the behavioral difference
folks have been seeing for OOM kills.

As for "gather evidence" messages . . .

>> You may be able to just look and manually delete or
>> comment out the bootverbose line in the more modern
>> source that currently looks like:
>>
>>         if (bootverbose)
>>             printf(
>>     "proc %d (%s) failed to alloc page on fault, starting OOM\n",
>>                 curproc->p_pid, curproc->p_comm);
>>         vm_pageout_oom(VM_OOM_MEM_PF);
>>         return (KERN_RESOURCE_SHORTAGE);
>>
>
> I can find those lines in /usr/src/sys/vm/vm_fault.c, but
> unclear on the motivation to comment the lines out. Perhaps
> to eliminate the return(...) ?
> Anyway, is it sufficient to insert /* before and */ after?

The only line to delete or comment out in that code block is:

    if (bootverbose)

Disabling that line makes the following printf always happen,
even when a verbose boot was not done.

Based on the code change reported above, having a message
before vm_pageout_oom(VM_OOM_MEM_PF) is important for getting
a report that the kill came via that code.

>> and is now in vm_fault_allocate(. . .). (That file has
>> had a reorganization since where I'm synchronized.)
>>
>> Having the message indicate vm_fault_allocate is
>> optional but would look like:
>>
>> "vm_fault_allocate: proc %d (%s) failed to alloc page on fault, starting OOM\n",
>>
>> Doing the delete/comment-out would avoid waiting for me.
>>
> I'll do it after the next stoppage.

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away in early 2018-Mar)
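
For anyone who wants to see the control-flow difference in isolation, here is a minimal userland sketch. It is not the kernel code: enum outcome, should_wait(), old_flow(), new_flow() and show() are hypothetical names made up for illustration, and the sleeps and the actual OOM call are reduced to comments. It only models which path each revision takes once the page allocation has failed.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Userland model only. The condition mirrors the quoted vm_fault.c
 * code, but every type and function here is a hypothetical stand-in.
 */
enum outcome { RETRY_FAULT, OOM_KILL_STARTED };

/* Nonzero when the fault path should wait rather than start OOM. */
int should_wait(int attempts, int oom)
{
    return (attempts < 0 || oom < attempts);
}

/* Models head -r357025: after waiting, go back and retry the fault. */
enum outcome old_flow(int attempts, int *oom)
{
    assert(oom != NULL);
    if (should_wait(attempts, *oom)) {
        (*oom)++;
        /* vm_waitpfault(...) would sleep here */
        return RETRY_FAULT;        /* goto RetryFault_oom */
    }
    /* vm_pageout_oom(VM_OOM_MEM_PF) would run here */
    return OOM_KILL_STARTED;
}

/* Models head -r357026 as quoted: nothing leaves after the wait. */
enum outcome new_flow(int attempts, int *oom)
{
    assert(oom != NULL);
    if (should_wait(attempts, *oom)) {
        (*oom)++;
        /* vm_waitpfault(...) would sleep here */
    }
    /* falls through: vm_pageout_oom(VM_OOM_MEM_PF) would run here */
    return OOM_KILL_STARTED;
}

/* Prints which path each model takes for one attempts setting. */
void show(int attempts)
{
    int oom_old = 0, oom_new = 0;

    printf("attempts=%d old=%s new=%s\n", attempts,
        old_flow(attempts, &oom_old) == RETRY_FAULT ? "retry" : "oom",
        new_flow(attempts, &oom_new) == RETRY_FAULT ? "retry" : "oom");
}
```

Calling show(-1) from a small main() prints "attempts=-1 old=retry new=oom": with a negative setting the old model always retries, while the new model reaches the OOM step on the first failed allocation.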