Date:      Sat, 28 Mar 2020 11:38:19 -0700
From:      Mark Millard <marklmi@yahoo.com>
To:        bob prohaska <fbsd@www.zefox.net>
Cc:        freebsd-arm@freebsd.org
Subject:   Re: Belated out of swap kill on rpi3 at r359216
Message-ID:  <5CBFD168-D533-4BF4-AB9C-64B8B98F4B84@yahoo.com>
In-Reply-To: <20200328161742.GA7571@www.zefox.net>
References:  <83E41A13-6C24-4B56-A837-779044038FBC@yahoo.com> <20200324185518.GA92311@www.zefox.net> <75CE3C07-8A0A-4D32-84C6-24BEA967447E@yahoo.com> <20200324224658.GA92726@www.zefox.net> <764D5A86-6A42-44E0-A706-F1C49BB198DA@yahoo.com> <20200325015633.GA93057@www.zefox.net> <0FF6BC4C-296F-49F3-8FB8-AA87A49349E2@yahoo.com> <20200326220649.GA99824@www.zefox.net> <0A8CF8D1-8D0F-40E2-A10D-EB44BEEAB557@yahoo.com> <5549E63B-0784-4B58-AD36-2A2EDC518308@yahoo.com> <20200328161742.GA7571@www.zefox.net>

On 2020-Mar-28, at 09:17, bob prohaska <fbsd@www.zefox.net> wrote:

> On Fri, Mar 27, 2020 at 07:25:45PM -0700, Mark Millard wrote:
>>
>>
>> On 2020-Mar-26, at 16:24, Mark Millard <marklmi at yahoo.com> wrote:
>>
>>>
>>> Anyway, I may, for a time, have one context that is
>>> more like yours than is normal for me. As stands, the
>>> RPi3 is doing a from-scratch buildworld buildkernel .
>>> (Reconstructing the head -r358966 that it is already
>>> running.) It is not splitting the I/O load but is
>>> using a USB SSD (via a powered hub), not the microsd
>>> card. No extra logging. vm.pfault_oom_attempts=-1
>>> and vm.pageout_oom_seq=120 for this attempt. 3072
>>> MiBytes of page/swap space. It is a -j4 build attempt.
>>>
>>
>> ("No extra logging" meant: beyond my normal typescript
>> recording of the build output. That file ended up at
>> 7741518 Bytes for size.)
>
> Does the process capture all the output from make buildworld?
> On my machines (pi2 and pi3) that's usually ~30 MB.

A likely explanation is that I use WITH_META_MODE
and you might not:

    WITH_META_MODE
 . . .
             The build hides commands that are executed unless NO_SILENT is
             defined.  Errors cause make(1) to show some of its environment
             for further debugging.
. . .

(I do not use NO_SILENT, so I get the hiding.)

Over 1/2 of the lines recorded looked like
sequences similar to:

. . .
Building /usr/obj/cortexA53_clang/arm64.aarch64/usr/src/arm64.aarch64/tmp/obj-tools/tools/build/dummy.o
Building /usr/obj/cortexA53_clang/arm64.aarch64/usr/src/arm64.aarch64/tmp/obj-tools/tools/build/libegacy.a
. . .

In this case:

# grep "^Building " /root/sys_typescripts/typescript_make_cortexA53_nodebug_clang_bootstrap-aarch64-host-2020-03-26:12:02:47 | wc
   52487  104974 5767152

vs. the file overall:

# wc /root/sys_typescripts/typescript_make_cortexA53_nodebug_clang_bootstrap-aarch64-host-2020-03-26:12:02:47
   94908  256377 7741518 /root/sys_typescripts/typescript_make_cortexA53_nodebug_clang_bootstrap-aarch64-host-2020-03-26:12:02:47

WITH_META_MODE does record details for each "Building"
line in a .meta file specific to that line. A .meta
file even includes a list of what files were involved
(opened) for that step.

So there is still file I/O for such logging, likely
more in total than when not using WITH_META_MODE.
(Not that I'd thought about that before.)

>>
>> The build completed without any /var/log/message or
>> console output during the build. My modified version
>> of top reported (details copied from a ssh window) . . .
>>
>
> That seems to settle matters. My problems are with the old
> microSD card. New, it was marginally ok. Old, it's not. That
> crudely quantifies lifespan at around a year of active use,
> with trouble appearing roughly when the card was 75% full,
> at least a hint of required overprovisioning.

Since FreeBSD provides no means of having the
SATA drive in the USB enclosure trimmed(?), I do
not know how long before it would have issues
from that. It is a small form factor 240 GByte
SSD [user space, not GiByte, likely from internal
over-provisioning of a 240 GiByte media]. I left
a 21 GiByte area at the end free as well. The 197
GiByte ufs file system is only about 19% used.

smartctl reports for the USB SSD internals:

ATA Version is:   ATA8-ACS, ACS-2 T13/2015-D revision 3
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)

The Firmware version 609ABBF0 listed suggests a
Seagate SATA controller is involved, if I
understand right.

The USB SSD drive is far from new. It now gets the
report:

Device Statistics (GP Log 0x04)
Page  Offset Size        Value Flags Description
. . .
0x01  0x018  6      5499176259  ---  Logical Sectors Written
0x01  0x028  6      2406890437  ---  Logical Sectors Read
. . .

where earlier smartctl reported:

Sector Size:      512 bytes logical/physical
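
To put those sector counts into more familiar units, here is a rough
conversion. It is a sketch that assumes the counters are in units of
the 512-byte logical sector reported above, and it uses the 240 GByte
(decimal) user capacity mentioned earlier. The names are made up for
illustration. These are host-side counts, so they say nothing
about any write amplification inside the drive.

```python
SECTOR_BYTES = 512                    # logical sector size from smartctl
sectors_written = 5499176259          # "Logical Sectors Written"
sectors_read = 2406890437             # "Logical Sectors Read"
user_capacity_bytes = 240 * 10**9     # 240 GByte (decimal) user space

written_bytes = sectors_written * SECTOR_BYTES
read_bytes = sectors_read * SECTOR_BYTES

GIB = 2**30
written_gib = written_bytes / GIB
read_gib = read_bytes / GIB

# How many times the user capacity has been written, host side only.
capacity_writes = written_bytes / user_capacity_bytes

print(f"Written: {written_gib:.0f} GiByte, Read: {read_gib:.0f} GiByte")
print(f"Host writes are about {capacity_writes:.1f}x the user capacity")
```

That works out to somewhat over 2.5 TiByte written and a bit over
1.1 TiByte read, as seen by the host.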


> Out of curiosity, have you tried leaving vm.pfault_oom_attempts at
> its default value? An OOM kill would be unexpected, but interesting
> if observed.

Nope. I've thought of locally updating gstat to do
something similar to what I did with top: record and
report the maximum observed figures for ms/r, ms/w,
ms/d, but for each line of data in this case.

I'd not be surprised if the heavier paging times had
some large figures compared to what I saw when watching
the display. (Rarely more than 20ms.) But my observations
are not much of a sample.
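
Short of modifying gstat itself, the same idea could be approximated
by post-processing its batch-mode text output. The following is only
a sketch of that alternative, not what I did to top: it assumes each
sample has a header row ending in "Name" with one whitespace-separated
token per column, and the SAMPLE text is invented to exercise the
parser, not captured from the RPi3.

```python
from collections import defaultdict

LATENCY_COLUMNS = ("ms/r", "ms/w", "ms/d")

# Invented sample in the assumed batch-mode layout (not real measurements).
SAMPLE = """\
dT: 1.002s  w: 1.000s
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
    1     90     40    640    3.5     50    800   12.0    41.0  da0
dT: 1.001s  w: 1.000s
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
    4    120     20    320   27.8    100   1600    8.4    88.0  da0
"""


def max_latencies(lines):
    """Return {device: {column: maximum seen}} for the latency columns."""
    maxima = defaultdict(dict)
    header = None
    for line in lines:
        fields = line.split()
        if not fields:
            continue
        if fields[-1] == "Name":
            header = fields            # remember the column layout
            continue
        if header is None or len(fields) != len(header):
            continue                   # skips the "dT: ... w: ..." lines
        device = fields[-1]
        for index, column in enumerate(header):
            if column not in LATENCY_COLUMNS:
                continue
            try:
                value = float(fields[index])
            except ValueError:
                continue               # ignore anything non-numeric
            previous = maxima[device].get(column, 0.0)
            maxima[device][column] = max(previous, value)
    return dict(maxima)


result = max_latencies(SAMPLE.splitlines())
for device, columns in result.items():
    print(device, columns)
```

Feeding it sys.stdin instead of SAMPLE would let it read from a pipe.
The ms/d column is only tracked if the header actually contains it.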

I'd be more likely to try picking vm.pfault_oom_wait
after seeing what is reported, then picking a
positive vm.pfault_oom_attempts value to go with it.
I'm not sure if I'll ever do this sort of experiment.
The resulting figures used would be rather
context-specific as well.

>> For Mem: 738512Ki MaxObsActive, 190608Ki MaxObsWired, 906372Ki MaxObs(Act+Wir)
>> For Swap:  1927Mi MaxObsUsed
>>
>
> . . .

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)



