From: Mark Millard
Date: Sun, 12 Aug 2018 20:36:01 -0700
To: bob prohaska
Cc: Mark Johnston, John Kennedy, freebsd-arm
Subject: Re: RPI3 swap experiments ["was killed: out of swap space" with: "v_free_count: 5439, v_inactive_count: 1"]
Message-Id: <0D8B9A29-DD95-4FA3-8F7D-4B85A3BB54D7@yahoo.com>
In-Reply-To: <20180813021226.GA46750@www.zefox.net>

On 2018-Aug-12, at 7:12 PM, bob prohaska wrote:

> On Sun, Aug 12, 2018 at 04:23:31PM -0700, Mark Millard wrote:
>> On 2018-Aug-12, at 3:40 PM, bob prohaska wrote:
>>
>>> On Sun, Aug 12, 2018 at 10:32:48AM -0700, John Kennedy wrote:
>>>> . . .
>>> Setting vm.pageout_oom_seq to 120 made a decisive improvement, almost allowing
>>> buildworld to finish. By the time I tried CAM_IOSCHED_DYNAMIC buildworld was
>>> getting only about half as far, so it seems the patches were harmful to a degree.
>>> Changes were applied in the order
>>
>> You could experiment with figures bigger than 120 for
>> vm.pageout_oom_seq .
>>
> Could anybody hazard a guess as to how much?
> The leap from 12 to 120 rather startled me, I thought a factor of two
> a big adjustment. Maybe go to 240, or is that insignificant?

I'd keep multiplying by 10 until it works (or fails some
other way), then back off by smaller factors if you want
a narrower range to be known between failing and working
(or failing differently).

>> I'll note that the creation of this mechanism seems
>> to be shown for -r290920 at:
>>
>> https://lists.freebsd.org/pipermail/svn-src-head/2015-November/078968.html
>>
>> In part it says:
>>
>>   . . . only raise OOM when pagedaemon is unable to produce a free
>>   page in several back-to-back passes. Track the failed passes per
>>   pagedaemon thread.
>>
>>   The number of passes to trigger OOM was selected empirically and
>>   tested both on small (32M-64M i386 VM) and large (32G amd64)
>>   configurations. If the specifics of the load require tuning, sysctl
>>   vm.pageout_oom_seq sets the number of back-to-back passes which must
>>   fail before OOM is raised. Each pass takes 1/2 of seconds. Less the
>>   value, more sensible the pagedaemon is to the page shortage.
>>
>> The code shows:
>>
>> int vmd_oom_seq
>>
>> and it looks like fairly large values would be
>> tolerated. You may be able to scale beyond
>> the problem showing up in your context.
>
> Would 1024 be enough to turn OOMA off completely? That's what I originally wanted to
> try.

As far as I know, it scales until arithmetic fails for
the sizes involved. The factor of 10 rule makes the
number of tests logarithmic to find a sufficient upper
bound (if there is an upper bound). After that, with
high/low bounds, binary searching is a possibility.

(That ignores any effort at determining repeatability.)

>>
>>> pageout
>>> batchqueue
>>> slow_swap
>>> iosched
>>
>> For my new Pine64+ 2GB experiments I've only applied
>> the Mark J. reporting patches, not the #define one.
>> Nor have I involved CAM_IOSCHED_DYNAMIC.
>>
>> But with 2 GiBytes of RAM and the default 12 for
>> vm.pageout_oom_seq I got:
>>
>> v_free_count: 7773, v_inactive_count: 1
>> Aug 12 09:30:13 pine64 kernel: pid 80573 (c++), uid 0, was killed: out of swap space
>>
>> with no other reports from Mark Johnston's reporting
>> patches.
>>
>> It appears that long I/O latencies as seen by the
>> subsystem are not necessary to ending up with OOM
>> kills, even if they can contribute when they occur.
>>
>
> It has seemed to me in the past that OOMA kills aren't closely-tied to busy
> swap. They do seem closely-related to busy storage (swap and disk).

My Pine64+ 2GB experiment suggests to me that 4 cores
running 4 processes (threads) at basically 100% per core,
with the processes/threads allocating and actively using
ever more memory over time, without freeing until near
the end, will lead to OOM kills if they run long enough.

(I'm taking the rest of the processes as being relatively
idle, not freeing up very much memory explicitly very often.
This is much like the -j4 buildworld buildkernel in my
context.)

I'd not be surprised if programs (threads) that do no
explicit I/O would get the same result if the memory
use and the "compute/memory bound" property was similar.

>> (7773 * 4 KiBytes = 31,838,208 Bytes, by the way.)
>>
> The RPI3 seems to start adding to swap use when free memory drops below about 20 MB,
> Does that seem consistent with your observations?

I did not record anything that would show when for
the first Pine64+ 2GB experiment.

There were around 19 MiBytes of in-use swap left around
from before at the start of the 2nd test. Also not the
best for finding when things start.
But the first increment beyond 19M was (two lines from
top output for each time):

Sun Aug 12 16:58:19 PDT 2018
Mem: 1407M Active, 144M Inact, 18M Laundry, 352M Wired, 202M Buf, 43M Free
Swap: 3072M Total, 19M Used, 3053M Free

Sun Aug 12 16:58:20 PDT 2018
Mem: 1003M Active, 147M Inact, 15M Laundry, 350M Wired, 202M Buf, 453M Free
Swap: 3072M Total, 22M Used, 3050M Free

>>> My RPI3 is now updating to 337688 with no patches/config changes. I'll start the
>>> sequence over and would be grateful if anybody could suggest a better sequence.
>>
> It seems rather clear that turning up vm.pageout_oom_seq is the first thing to try.
> The question is how much: 240 (double Mark J.'s number), 1024 (small for an int on
> a 64 bit machine)?

I made a recommendation earlier above. I'm still
at the 120 test in my context.

> If in fact the reporting patches do increase the load on the machine, is the
> slow swap patch the next thing to try, or the iosched option? Maybe something else
> altogether?

The slow_swap.patch material is reporting material,
and so is one of the patches that I put in place so
that I might see messages about:

waited ?s for swap buffer [happens for 3 <= s]
waited ?s for async swap write [happens for 3 <= s]
thread ? waiting for memory

(None of which were produced in my test. As far
as I know no one has gotten the thread one.)

CAM_IOSCHED_DYNAMIC does not seem to apply to my
Pine64+ 2GB test that did not report any I/O latency
problems for the subsystem. I've no reason to go
that direction from the evidence available. And my
tests do not help with identifying how to survive
I/O latency problems (so far).

For now vm.pageout_oom_seq variation is all the
control that seems to fit my context. (Presumes
your negative result for VM_BATCHQUEUE_SIZE
making an improvement applies.)

Other goals/contexts get into doing other things.
I've no clue if there is anything interesting to
control for CAM_IOSCHED_DYNAMIC.
Nor for variations on the VM_BATCHQUEUE_SIZE figure
beyond the 1 and 7 that did not help your I/O latency
context.

It does appear to me that you have a bigger problem,
more difficult to control, because of the I/O latency
involvement. What might work for me might not be
sufficient for you, even if it is involved for you.

> There's no immediate expectation of fixing things; just to shed a little light.
>

For now, as far as I know, Mark Johnston's reporting patches
are the means of exposing useful information for whatever
range of contexts/configurations. For now I'm just
exploring vm.pageout_oom_seq value variations and what is
reported (or if it finished without an OOM kill).

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away in early 2018-Mar)
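
P.S. The search strategy suggested in this message (multiply
vm.pageout_oom_seq by 10 until a run works, then narrow the range
between the last failing and first working value) might be sketched
roughly as below. This is only an illustration under assumptions:
find_threshold and the works() callback are hypothetical names, not
part of any FreeBSD tool. works(value) stands in for "set the sysctl
to value, run the build, report whether it finished without an OOM
kill". The sketch also assumes that if one value works then every
larger value works too, and it ignores repeatability.

```python
def find_threshold(works, start=12, factor=10, limit=10**6):
    """Return (low, high): largest value seen failing, smallest seen working.

    works(value) is a hypothetical callback returning True when a test
    run with that setting completed without an OOM kill.
    """
    low, high = None, start

    # Phase 1: multiply by `factor` until a run works, so the number of
    # runs grows only logarithmically with the eventual upper bound.
    while not works(high):
        low = high
        high *= factor
        if high > limit:
            # No working value found below the limit.
            return low, None

    if low is None:
        # The starting value already works; no failing bound is known.
        return None, high

    # Phase 2: binary search between the failing and working bounds.
    while high - low > 1:
        mid = (low + high) // 2
        if works(mid):
            high = mid
        else:
            low = mid
    return low, high
```

With a made-up predicate such as works = lambda v: v >= 500, the
first phase tries 12, 120 and 1200, and the second phase narrows the
range to (499, 500). On real hardware each call to works() is a full
build, so stopping the second phase early at a coarser range may be
the practical choice.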