FreeBSD Mail Archives

Date:      Tue, 4 Jul 2023 14:51:24 -0700
From:      Mark Millard <marklmi@yahoo.com>
To:        bob prohaska <fbsd@www.zefox.net>
Cc:        FreeBSD Mailing List <freebsd-ports@freebsd.org>, freebsd-arm@freebsd.org
Subject:   Re: More swap trouble with armv7, was Re: -current on armv7 stuck with flashing disk light
Message-ID:  <286ABDA5-BB1A-47C1-A187-168FFD86A441@yahoo.com>
In-Reply-To: <9A15D619-3274-44AC-B7E1-A1D6C7D334F2@yahoo.com>
References:  <ZJpFqAnnKPq/XmxJ@www.zefox.net> <A91FF89C-2BAA-4E93-96FA-C75C6FA4A0A0@yahoo.com> <ZJsOTzp%2Bb7O2%2BbhQ@www.zefox.net> <E1670A16-2F8E-4E94-A44C-DF7886233F62@yahoo.com> <066FD282-1637-448C-99FF-BA62718386F0@yahoo.com> <ZJsZiQGs0QlHhzTV@www.zefox.net> <ZKRt4ryCGyv9n%2BQ/@www.zefox.net> <9A15D619-3274-44AC-B7E1-A1D6C7D334F2@yahoo.com>

[I continued to type MAX_JOBS_NUMBER where
MAKE_JOBS_NUMBER should have been what I
typed.]

On Jul 4, 2023, at 14:22, Mark Millard <marklmi@yahoo.com> wrote:

> On Jul 4, 2023, at 12:07, bob prohaska <fbsd@www.zefox.net> wrote:
>
>> On Tue, Jun 27, 2023 at 10:16:57AM -0700, bob prohaska wrote:
>>> On Tue, Jun 27, 2023 at 09:59:40AM -0700, Mark Millard wrote:
>>>>>
>>>>> If you want to identify system hangs, please
>>>>> put back:
>>>>>
>>>>> vm.swap_enabled=0
>>>>> vm.swap_idle_enabled=0
>>>>>
>>>
>>> They're reinstated now, but I don't want to disturb the system
>>> while it seems to be building world acceptably.
>>>
>> Reinstating
>> vm.swap_enabled=0
>> vm.swap_idle_enabled=0
>>
>> and limiting buildworld to -j3 allows buildworld to complete successfully in 1 GB of swap.
>>
>> Meanwhile, attempts to compile sysutils/usbtop using poudriere still cause swap exhaustion
>> while compiling /devel/llvm15 even with 2 GB of swap allocated.
>
> What parallelism settings did poudriere use for the
> devel/llvm15 build attempt? Have you tried allowing
> less parallelism (if anything lower than what you have
> tried is possible)?
>
> What options are enabled vs. disabled for devel/llvm15 ?
>
> BE_STANDARD vs. BE_FREEBSD vs. BE_NATIVE ?
>
> BE_NATIVE probably helps limit resource use the most if it
> happens to be sufficient. BE_FREEBSD would be in the
> middle of the 3 options for this issue.
>
> Is MLIR enabled? If having it disabled is sufficient,
> disabling it should help reduce resource use.
> Similarly for FLANG. (Building FLANG requires MLIR, so
> having MLIR disabled implies FLANG also needs to be
> disabled.)
>
>> The messages are
>> Jul  4 11:18:48 www kernel: pid 1074 (getty), jid 0, uid 0, was killed: out of swap space
>
> In my view "out of swap space" is still a misnomer
> in this context, but at least the following
> messages are more specific to the actual internal
> data-structure(s) problem(s). My understanding is that
> the data structures can have fragmentation issues.
>
> For fragmentation issues, prior history since booting
> might contribute, and building just after a reboot may
> end up with less fragmentation. (Unknown if sufficiently
> less.)
>
> Also, over-allocating the swap partition (by not having
> kern.maxswzone appropriately matching) probably makes
> "swap blk zone exhausted" more likely. It is one of the
> reasons I avoid using swap partitioning with a total
> size that generates the message about possible
> mistuning.
>
>> swap blk zone exhausted, increase kern.maxswzone
>
> Have you ever gotten the above line before? I was
> unaware of any examples of it showing up.
>
>> swblk zone ok
>
> I'll note that there is another potential message
> pair for "swap pctrie zone exhausted"/"swpctrie zone ok"
> that you have not reported getting.
>
> Have you ever seen the "swap pctrie zone exhausted"
> notice? (Just curiosity on my part.)
>
>> IIRC the "increase kern.maxswzone" is unhelpful, if not impossible. The
>> "swblk zone ok" seems new.
>
> Are you using the default kern.maxswzone for your context?
> What is its value?
>
> Did you get the notice about possible mistuning for your
> combination of swap partition sizing and kern.maxswzone
> value? Or did "swap blk zone" happen even without that
> notice happening?
>
>> From the gstat output near peak swap use the system wasn't I/O bound,
>
> The "swap blk zone" contains an in-kernel-RAM data
> structure that is involved in managing the swap space
> usage.
>
>> the disk was less than 25% busy at the time of the first OOMA kill.
>
> "swap blk zone" can end up with fragmentation issues, where
> the total available is only made up of a bunch of tiny chunks
> and nothing large can be handled as a unit any more. (A general
> description of "fragmented".)
>
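
[Illustration of the general "fragmented" description above. This
is only a toy model in Python: a row of equal-sized slots, with
made-up names (free_runs, can_allocate). It does not reflect the
kernel's actual swblk or pctrie data structures. It just shows
how the total free amount can be the same while the largest
usable piece differs.]

```python
# Toy model only: a row of equal-sized slots, True = in use, False = free.
# It is NOT the kernel's swblk or pctrie layout.

def free_runs(slots):
    """Return the length of each run of consecutive free slots."""
    runs = []
    current = 0
    for used in slots:
        if used:
            if current:
                runs.append(current)
            current = 0
        else:
            current += 1
    if current:
        runs.append(current)
    return runs


def can_allocate(slots, want):
    """True if some single free run is at least `want` slots long."""
    return any(run >= want for run in free_runs(slots))


# Every other slot is free: half the space is available in total...
fragmented = [True, False] * 8
# ...versus the same amount of free space in one piece.
compact = [True] * 8 + [False] * 8

for name, slots in (("fragmented", fragmented), ("compact", compact)):
    runs = free_runs(slots)
    print(name, "total free:", sum(runs), "largest run:", max(runs),
          "fits 4:", can_allocate(slots, 4))
```

Both layouts have 8 free slots, but only the compact one can
satisfy a request for 4 slots as a unit.
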
>> Eventually it was possible to log in on the serial console and run top:
>>
>> 33 processes:  1 running, 29 sleeping, 3 zombie
>> CPU:  0.0% user,  0.0% nice, 10.6% system,  0.2% interrupt, 89.2% idle
>> Mem: 139M Active, 8256K Inact, 252M Laundry, 221M Wired, 98M Buf, 292M Free
>> Swap: 2048M Total, 1291M Used, 756M Free, 63% Inuse
>>
>> PID   JID USERNAME    THR PRI NICE   SIZE    RES STATE    C   TIME    WCPU COMMAND
>> 40719     0 root          1  20  -20     0B  8192B swzonx   0   0:12   9.15% cron
>> 40717     0 root          1  20  -20     0B  8192B swzonx   0   0:34   9.08% sh
>> 40709     0 root          1  20  -20     0B  8192B swzonx   0   0:38   9.01% sshd
>> 40720     0 root          1  20  -20     0B  8192B swzonx   3   0:13   7.47% sh
>
> Unfortunately the swzonx text is truncated. There is
> actually:
>
> pause("swzonxb", 10); for swblk zone
> and:
> pause("swzonxp", 10); for swap pctrie zone
>
> top's display leaves it unclear which was involved.
>
>> 40721     0 bob           1  20    0  6608K  2600K CPU1     1   0:00   0.32% top
>> 25761     0 bob           1  20    0    14M  6136K select   0   0:02   0.03% sshd
>> 25852     0 root          1  20    0  4668K  1648K ttyin    1   0:01   0.03% tip
>> 1237     0 root          1  20    0  5820K  1540K wait     1   0:12   0.00% sh
>> 25381     0 root          1  23    0    14M  5868K select   1   0:01   0.00% sshd
>> 1030     0 root          1  24    0    13M  2416K vmbckw   1   0:00   0.00% sshd
>> 12715     0 root          1  68    0  5820K  1660K wait     0   0:00   0.00% sh
>> 12710     0 root          1  20    0  5820K  1556K piperd   1   0:00   0.00% sh
>> 929     0 root          1  20    0  5356K  1256K select   3   0:00   0.00% syslogd
>> 1014     0 root          1  20    0  5124K  1356K nanslp   2   0:00   0.00% cron
>> 25770     0 bob           1  36    0  6844K  3116K pause    1   0:00   0.00% tcsh
>> 25794     0 bob           1  24    0  5380K  2188K wait     2   0:00   0.00% su
>> 39626     0 root          1  20    0  5424K  2404K wait     2   0:00   0.00% login
>> 40635     0 bob           1  20    0  6824K  3272K pause    1   0:00   0.00% tcsh
>> 25820     0 root          1  21    0  5608K  2204K wait     0   0:00   0.00% sh
>> 25851     0 root          1  20    0  4668K  1656K ttyin    3   0:00   0.00% tip
>> 40454     0 root          1  24    0  4636K  1780K ttyin    3   0:00   0.00% getty
>>
>> I'll let it go for a while to see if poudriere notices it's failed and cleans up.
>>
>> At the moment /boot/loader.conf contains
>>
>> # Configure USB OTG; see usb_template(4).
>> hw.usb.template=3
>> umodem_load="YES"
>> # Disable the beastie menu and color
>> beastie_disable="YES"
>> loader_color="NO"
>> vm.pageout_oom_seq="4096"
>> vm.pfault_oom_attempts="3"
>> vm.pfault_oom_attempts="120"
>
> 2 assignments to the same thing in a row?
> The 2nd ends up controlling the value.
>
>> vm.pfault_oom_wait="20"
>
> So you are allowing it 120 * 20 sec == 2400 sec
> (in other words, 40 minutes of retrying every 20
> seconds) to handle a page fault.
>
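
[Illustration of the two points above, as a small Python sketch.
The parser is a simplified stand-in that I made up for the
name="value" lines, not the loader's real parsing code. It
assumes the later assignment replaces the earlier one and that
attempts * wait is the overall retry window, as described above.]

```python
# Simplified stand-in for how repeated name="value" lines resolve;
# this is not the actual loader's parser.
LOADER_CONF = '''
vm.pageout_oom_seq="4096"
vm.pfault_oom_attempts="3"
vm.pfault_oom_attempts="120"
vm.pfault_oom_wait="20"
'''


def parse_settings(text):
    """Map name -> value; a later assignment replaces an earlier one."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        settings[name.strip()] = value.strip().strip('"')
    return settings


settings = parse_settings(LOADER_CONF)
attempts = int(settings["vm.pfault_oom_attempts"])
wait_seconds = int(settings["vm.pfault_oom_wait"])
total_seconds = attempts * wait_seconds

print("attempts in effect:", attempts)
print("retry window:", total_seconds, "sec =", total_seconds // 60, "minutes")
```

With the quoted file, the value in effect is 120 (not 3) and the
window comes out as 2400 sec, i.e. 40 minutes.
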
> That time scale may have contributed to why it
> failed first for "swap blk zone exhausted"
> instead of the more usual types of OOM cause:
> How many page faults had active 40 minute
> intervals at the time?
>
> You may be just moving around where a problem
> shows up, rather than avoiding a failure
> overall.
>
>> kern.cam.boot_delay="20000"
>> vfs.ffs.dotrimcons="1"
>> vfs.root_mount_always_wait="1"
>> filemon_load="YES"
>>
>> /usr/local/etc/poudriere.conf contains
>> USE_TMPFS=no
>> NOHANG_TIME=28800
>> MAX_EXECUTION_TIME_EXTRACT=14400
>> MAX_EXECUTION_TIME_INSTALL=14400
>> MAX_EXECUTION_TIME_PACKAGE=432000
>> ALLOW_MAKE_JOBS=yes
>> MAX_JOBS_NUMBER=2
>
> I do not remember there being a MAX_JOBS_NUMBER in
> the infrastructure. So I will ignore that line. It
> probably should be deleted.
>
>> MAKE_JOBS_NUMBER=2
>>
>> Do these settings look reasonable?
>
> ALLOW_MAKE_JOBS/MAX_JOBS_NUMBER is not independent
> of what is being built. There is no global, single
> answer to "looks reasonable" for them.

Sorry: ALLOW_MAKE_JOBS/MAKE_JOBS_NUMBER

> However, MAX_JOBS_NUMBER is in the wrong file.

Sorry: MAKE_JOBS_NUMBER

> It is from/for make, not from/for poudriere
> directly. (But there is a way for poudriere
> to contribute such to make.)
>
> For example (from a grep):
>
> /usr/local/etc/poudriere.d/make.conf:MAKE_JOBS_NUMBER=2
>
> ( MAKE_JOBS_NUMBER_LIMIT is the same for where it
> goes. )
>
> You might need to use MAX_JOBS_NUMBER=1 or

Sorry yet again: MAKE_JOBS_NUMBER

> to not assign to ALLOW_MAKE_JOBS to have a
> chance of the devel/llvm15 build fitting,
> if you have already turned off the options
> for things you do not need (avoiding the
> resources used to build them).
>
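
[Illustration of where the quoted settings would go, as a small
Python sketch. The function name and the three categories are
made up for this note. The only rules encoded are the ones
stated in this message: MAKE_JOBS_NUMBER and
MAKE_JOBS_NUMBER_LIMIT go in poudriere.d/make.conf, and
MAX_JOBS_NUMBER is a name I do not remember existing. Every
other line is passed through without judgement.]

```python
# Names and destinations below come only from this message; anything
# else in the file is passed through without judgement.
MAKE_CONF_NAMES = {"MAKE_JOBS_NUMBER", "MAKE_JOBS_NUMBER_LIMIT"}
UNRECOGNIZED_NAMES = {"MAX_JOBS_NUMBER"}

POUDRIERE_CONF = '''
USE_TMPFS=no
NOHANG_TIME=28800
ALLOW_MAKE_JOBS=yes
MAX_JOBS_NUMBER=2
MAKE_JOBS_NUMBER=2
'''


def sort_settings(text):
    """Split lines into (stay, move_to_make_conf, delete) lists."""
    stay, move, delete = [], [], []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name = line.split("=", 1)[0].strip()
        if name in MAKE_CONF_NAMES:
            move.append(line)
        elif name in UNRECOGNIZED_NAMES:
            delete.append(line)
        else:
            stay.append(line)
    return stay, move, delete


stay, move, delete = sort_settings(POUDRIERE_CONF)
print("leave in /usr/local/etc/poudriere.conf:", stay)
print("move to /usr/local/etc/poudriere.d/make.conf:", move)
print("probably delete:", delete)
```
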


===
Mark Millard
marklmi at yahoo.com

Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?286ABDA5-BB1A-47C1-A187-168FFD86A441>
