Date: Sat, 28 Mar 2020 11:38:19 -0700
From: Mark Millard <marklmi@yahoo.com>
To: bob prohaska <fbsd@www.zefox.net>
Cc: freebsd-arm@freebsd.org
Subject: Re: Belated out of swap kill on rpi3 at r359216
Message-ID: <5CBFD168-D533-4BF4-AB9C-64B8B98F4B84@yahoo.com>
In-Reply-To: <20200328161742.GA7571@www.zefox.net>
References: <83E41A13-6C24-4B56-A837-779044038FBC@yahoo.com>
 <20200324185518.GA92311@www.zefox.net>
 <75CE3C07-8A0A-4D32-84C6-24BEA967447E@yahoo.com>
 <20200324224658.GA92726@www.zefox.net>
 <764D5A86-6A42-44E0-A706-F1C49BB198DA@yahoo.com>
 <20200325015633.GA93057@www.zefox.net>
 <0FF6BC4C-296F-49F3-8FB8-AA87A49349E2@yahoo.com>
 <20200326220649.GA99824@www.zefox.net>
 <0A8CF8D1-8D0F-40E2-A10D-EB44BEEAB557@yahoo.com>
 <5549E63B-0784-4B58-AD36-2A2EDC518308@yahoo.com>
 <20200328161742.GA7571@www.zefox.net>

On 2020-Mar-28, at 09:17, bob prohaska <fbsd@www.zefox.net> wrote:

> On Fri, Mar 27, 2020 at 07:25:45PM -0700, Mark Millard wrote:
>>
>>
>> On 2020-Mar-26, at 16:24, Mark Millard <marklmi at yahoo.com> wrote:
>>
>>>
>>> Anyway, I may, for a time, have one context that is
>>> more like yours than is normal for me. As stands, the
>>> RPi3 is doing a from-scratch buildworld buildkernel .
>>> (Reconstructing the head -r358966 that it is already
>>> running.) It is not splitting the I/O load but is
>>> using a USB SSD (via a powered hub), not the microsd
>>> card. No extra logging. vm.pfault_oom_attempts=-1
>>> and vm.pageout_oom_seq=120 for this attempt. 3072
>>> MiBytes of page/swap space. It is a -j4 build attempt.
>>>
>>
>> ("No extra logging" meant: beyond my normal typescript
>> recording of the build output. That file ended up at
>> 7741518 Bytes for size.)
>
> Does the process capture all the output from make buildworld?
> On my machines (pi2 and pi3) that's usually ~30 MB.

A likely explanation is that I use WITH_META_MODE and
you might not:

     WITH_META_MODE
     . . .
             The build hides commands that are executed unless NO_SILENT is
             defined.  Errors cause make(1) to show some of its environment
             for further debugging.
     . . .

(I do not use NO_SILENT, so I get the hiding.)

Over 1/2 of the lines recorded looked like sequences
similar to:

. . .
Building /usr/obj/cortexA53_clang/arm64.aarch64/usr/src/arm64.aarch64/tmp/obj-tools/tools/build/dummy.o
Building /usr/obj/cortexA53_clang/arm64.aarch64/usr/src/arm64.aarch64/tmp/obj-tools/tools/build/libegacy.a
. . .

In this case:

# grep "^Building " /root/sys_typescripts/typescript_make_cortexA53_nodebug_clang_bootstrap-aarch64-host-2020-03-26:12:02:47 | wc
   52487  104974 5767152

vs.
the file overall:

# wc /root/sys_typescripts/typescript_make_cortexA53_nodebug_clang_bootstrap-aarch64-host-2020-03-26:12:02:47
   94908  256377 7741518 /root/sys_typescripts/typescript_make_cortexA53_nodebug_clang_bootstrap-aarch64-host-2020-03-26:12:02:47

WITH_META_MODE does record details for each "Building"
line in a .meta file specific to that line. A .meta file
even includes a list of what files were involved (opened)
for that step. So there is still file I/O for such logging,
likely more in total than when not using WITH_META_MODE.
(Not that I'd thought about that before.)

>>
>> The build completed without any /var/log/message or
>> console output during the build. My modified version
>> of top reported (details copied from a ssh window) . . .
>>
>
> That seems to settle matters. My problems are with the old
> microSD card. New, it was marginally ok. Old, it's not. That
> crudely quantifies lifespan at around a year of active use,
> with trouble appearing roughly when the card was 75% full,
> at least a hint of required overprovisioning.

Since FreeBSD provides no means of having the SATA drive
in the USB enclosure trimmed(?), I do not know how long
before it would have issues from that. It is a small form
factor 240 GByte SSD [user space, not GiByte, likely from
internal over-provisioning of a 240 GiByte media]. I left
a 21 GiByte area at the end free as well. The 197 GiByte
ufs file system is only about 19% used.

smartctl reports for the USB SSD internals:

ATA Version is:   ATA8-ACS, ACS-2 T13/2015-D revision 3
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)

The Firmware version 609ABBF0 listed suggests a Seagate
SATA controller is involved, if I understand right.

The USB SSD drive is far from new. It now gets the report:

Device Statistics (GP Log 0x04)
Page  Offset Size        Value Flags Description
. . .
0x01  0x018  6      5499176259  ---  Logical Sectors Written
0x01  0x028  6      2406890437  ---  Logical Sectors Read
. . .

where earlier smartctl reported:

Sector Size:      512 bytes logical/physical

> Out of curiosity, have you tried leaving vm.pfault_oom_attempts at
> its default value? An OOM kill would be unexpected, but interesting
> if observed.

Nope.

I've thought of locally updating gstat to do something
similar to what I did with top: record and report the
maximum observed figures for ms/r, ms/w, ms/d, but for
each line of data in this case. I'd not be surprised if
the heavier paging times had some large figures compared
to what I saw when watching the display. (Rarely more
than 20ms.) But my observations are not much of a sample.

I'd be more likely to try picking vm.pfault_oom_wait
after seeing what is reported, then picking a positive
vm.pfault_oom_attempts value to go with it.

I'm not sure if I'll ever do this sort of experiment.
The resulting figures used would be rather
context-specific as well.

>> For Mem:  738512Ki MaxObsActive, 190608Ki MaxObsWired, 906372Ki MaxObs(Act+Wir)
>> For Swap: 1927Mi MaxObsUsed
>>
>
> . . .

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away in early 2018-Mar)
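
For scale, the "Logical Sectors Written" and "Logical Sectors Read" counters quoted from smartctl can be turned into byte totals using the reported 512-byte logical sector size. The following is only a rough sketch of that arithmetic, not something from the original exchange: the variable names are made up for illustration, and it assumes a shell with 64-bit integer arithmetic.

```shell
#!/bin/sh
# Figures copied from the smartctl output quoted in the message.
# Variable names are illustrative only.
sector_size=512              # bytes, from "Sector Size: 512 bytes logical/physical"
sectors_written=5499176259   # "Logical Sectors Written"
sectors_read=2406890437      # "Logical Sectors Read"

# Assumes the shell does 64-bit integer arithmetic.
written_bytes=$((sectors_written * sector_size))
read_bytes=$((sectors_read * sector_size))

# Whole GiBytes, truncated by integer division.
gib=$((1024 * 1024 * 1024))
written_gib=$((written_bytes / gib))
read_gib=$((read_bytes / gib))

echo "written: ${written_bytes} bytes (about ${written_gib} GiBytes)"
echo "read:    ${read_bytes} bytes (about ${read_gib} GiBytes)"
```

Read that way, the counters would correspond to somewhat over 2.5 TiBytes written and a bit over 1.1 TiBytes read over the life of the drive so far.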