Date: Thu, 26 Mar 2020 16:24:57 -0700
From: Mark Millard <marklmi@yahoo.com>
To: bob prohaska <fbsd@www.zefox.net>
Cc: freebsd-arm@freebsd.org
Subject: Re: Belated out of swap kill on rpi3 at r359216
Message-ID: <0A8CF8D1-8D0F-40E2-A10D-EB44BEEAB557@yahoo.com>
In-Reply-To: <20200326220649.GA99824@www.zefox.net>
References: <20200324155753.GA91922@www.zefox.net>
 <83E41A13-6C24-4B56-A837-779044038FBC@yahoo.com>
 <20200324185518.GA92311@www.zefox.net>
 <75CE3C07-8A0A-4D32-84C6-24BEA967447E@yahoo.com>
 <20200324224658.GA92726@www.zefox.net>
 <764D5A86-6A42-44E0-A706-F1C49BB198DA@yahoo.com>
 <20200325015633.GA93057@www.zefox.net>
 <0FF6BC4C-296F-49F3-8FB8-AA87A49349E2@yahoo.com>
 <20200326220649.GA99824@www.zefox.net>

On 2020-Mar-26, at 15:06, bob prohaska <fbsd at www.zefox.net> wrote:

> Just to wrap up, I tend to agree that
> delays writing to the microSD filesystem were
> blocking swap traffic and causing OOM kills.
> Turning off OOM allowed the OS build/install
> to complete successfully.

There are still 2 types of "uma zone exhausted" issues that can lead
to OOM activity, as well as vm.pageout_oom_seq being more of a "needs
to be sufficiently large" than a direct-disable of the OOM handling of
sustained-low-free-RAM periods. (And, only trying examples establishes
what figures are sufficient for specific activities.)

> Why this behavior started recently is less clear.
> The card was placed in service in October of 2018 and
> has been in strenuous use since then. The first hints
> of trouble occurred in late 2019. Perhaps this phenomenon
> is a useful warning of impending wearout.

I'm not sure of the relative timing, but vm.pfault_oom_attempts and
vm.pfault_oom_wait are fairly new and were (are still?) specific to
head at 13. I expect they were in place for a while before we learned
of them and what to do with them (and so started using
vm.pfault_oom_attempts=-1). This is sort of like vm.pageout_oom_seq
being around for a long time before I learned of it and the background
to understand using it: Mark J.'s earlier material from an earlier
round of finding why OOM activity was happening on small arm boards.

> The gstat logs are at
> http://www.zefox.net/~fbsd/rpi3/swaptests/r359216/
> in case anybody's curious.
>

One of the unfortunate things is that having the logging of gstat and
such also go to the same media (or even over a common channel but
different media) also adds to the competing I/O load for
paging/swapping.

Also, if I remember right, all USB ports on the RPi3 share a common
channel in the path (internal USB hub), limiting the utility of using
multiple USB drives for helping manage these issues.
It might be that one USB drive and one microsd card are as far as one
can go for independent channels (ignoring WiFi and such). Microsd
cards have issues of their own when involved, however.

Side note:

I've got access to the old RPi3 because the Pine64+2GB is having
problems with I/O failures to longterm media (microsd cards, USB media
directly attached, USB media via a powered hub). It might go a week
without failure but when it does it tends to be a large sequence of
failures. Various separate media all got such a problem eventually.
I'm trying to see if the RPi3 also gets such issues with some of the
same media the Pine64+2GB did.

Anyway, I may, for a time, have one context that is more like yours
than is normal for me.

As stands, the RPi3 is doing a from-scratch buildworld buildkernel.
(Reconstructing the head -r358966 that it is already running.) It is
not splitting the I/O load but is using a USB SSD (via a powered hub),
not the microsd card. No extra logging. vm.pfault_oom_attempts=-1 and
vm.pageout_oom_seq=120 for this attempt. 3072 MiBytes of page/swap
space. It is a -j4 build attempt.

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away in early 2018-Mar)
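
For reference, the two tunables named for this build attempt could be
made persistent across reboots through /etc/sysctl.conf. The fragment
below is only a minimal sketch, not a recommendation: the values are
simply the ones quoted in this message, and, as said above, only
trying examples establishes what figures are sufficient for a given
workload. It assumes a FreeBSD version that actually provides
vm.pfault_oom_attempts (specific to head at 13 when this was written,
as far as the message knows).

```
# /etc/sysctl.conf fragment (sketch; values are the ones quoted in
# this message for one -j4 buildworld buildkernel on an RPi3, and may
# well need adjusting for other workloads)

# Tolerate a longer sustained low-free-RAM period before the pageout
# path starts OOM kills. This is a "needs to be sufficiently large"
# setting, not a switch that disables OOM handling.
vm.pageout_oom_seq=120

# Setting used in this thread so that slow page-ins (e.g. swap I/O
# blocked behind other writes) do not end in OOM kills from the
# page-fault path.
vm.pfault_oom_attempts=-1
```

The same values can be tried on a running system first with sysctl(8),
for example "sysctl vm.pageout_oom_seq=120", before committing them to
the file.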