Date:      Mon, 14 Apr 2025 00:48:17 -0700
From:      Mark Millard <marklmi@yahoo.com>
To:        FreeBSD ARM List <freebsd-arm@freebsd.org>
Subject:   Re: FYI: aarch64 FreeBSD under Parallels on macOS: how to avoid paging OOMs for "was killed: a thread waited too long to allocate a page" (somewhat)
Message-ID:  <31E0ED6B-59FE-4E01-95B4-22D2F03B490E@yahoo.com>
In-Reply-To: <114C309C-06F6-48E4-ABE1-C2CD4C617450@yahoo.com>
References:  <114C309C-06F6-48E4-ABE1-C2CD4C617450@yahoo.com>

On Apr 12, 2025, at 18:41, Mark Millard <marklmi@yahoo.com> wrote:

> I've been doing more poudriere-devel based "bulk -a" experiments
> (with pkg 2.1.0 involved) and was still getting "was killed: a
> thread waited too long to allocate a page".
>
> I made an adjustment to the Parallels configuration that I've
> been using and, so far, the paging OOM's have stopped --even
> when I pushed the paging harder than I had been before the
> change. The adjustment doing this may depend on the RAM
> allocation used.
>
> Context: M4 MAX with 16 cores: 12 Performance and 4 Efficiency.
> Also:    The M4 MAX has 128 GiBytes of RAM.
>
> I previously had 14 virtual/FreeBSD cpus. I changed that to 12,
> matching the Performance core count. This might have helped in
> one or both of a couple of ways of note:
>
> ) FreeBSD was no longer competing for Efficiency Core use.
>  (Parallels automatically uses just Performance cores
>   when the count is small enough.)
>
> ) Same amount of RAM allocated to FreeBSD as before --but
>  for fewer FreeBSD cpus: less memory pressure internal to
>  FreeBSD.
>
>
> I will note that I've observed as high as 15035 MiBytes
> of Swap Space in use. Not necessarily at that same time:
>
> 76729 MiByte for: Active+Wired+Laundry+SwapUsed
> and at that time:
> 76740 MiByte for: Active+Wired+Laundry+SwapUsed+InAct
> (so: not much InAct at the time)
>
> (Monitored with personal top patches.)
>
> The style of "bulk -a" use allows large load averages
> relative to the FreeBSD cpu count. For example, the
> observed maximums for the 3 load averages (each with
> its own time frame) were 57.68, 44.03, and 39.85 with
> 12 FreeBSD cpus.
>
>
> For reference for the RAM allocated to FreeBSD:
>
> I allocate 64 GiBytes of RAM to the FreeBSD VM. The intent
> is that, for my context of use, this leaves macOS with over
> 32 GiBytes of RAM free when Parallels has allocated the
> whole 64 GiBytes for FreeBSD: more like 36 GiBytes free
> as it turns out. That, in turn, avoids macOS doing things
> like compressing memory to get back Free Space. (That seems
> to start at around 28 GiBytes or less free on macOS, if I
> remember what I observed right. I have margin for handling
> variability.)

The above did seem to help for a long time.

But eventually the Laundry got to be large and was dumped
out to swap space, apparently in a rapid, sustained
sequence. That again led to some "was killed: a thread
waited too long to allocate a page" notices for processes
from one jid, all of them being "dot" processes:

Apr 13 22:58:15 aarch64-main-pbase kernel: pid 6721 (dot), jid 15, uid 0, was killed: a thread waited too long to allocate a page
Apr 13 23:01:41 aarch64-main-pbase kernel: pid 6756 (dot), jid 15, uid 0, was killed: a thread waited too long to allocate a page
Apr 13 23:01:41 aarch64-main-pbase kernel: pid 6739 (dot), jid 15, uid 0, was killed: a thread waited too long to allocate a page
Apr 13 23:01:41 aarch64-main-pbase kernel: pid 6732 (dot), jid 15, uid 0, was killed: a thread waited too long to allocate a page
Apr 13 23:01:41 aarch64-main-pbase kernel: pid 6747 (dot), jid 15, uid 0, was killed: a thread waited too long to allocate a page
Apr 13 23:01:41 aarch64-main-pbase kernel: pid 6764 (dot), jid 15, uid 0, was killed: a thread waited too long to allocate a page
Apr 13 23:02:13 aarch64-main-pbase kernel: pid 6715 (dot), jid 15, uid 0, was killed: a thread waited too long to allocate a page
Apr 13 23:09:03 aarch64-main-pbase kernel: pid 6728 (dot), jid 15, uid 0, was killed: a thread waited too long to allocate a page
Apr 13 23:11:35 aarch64-main-pbase kernel: pid 6766 (dot), jid 15, uid 0, was killed: a thread waited too long to allocate a page
Apr 13 23:16:39 aarch64-main-pbase kernel: pid 6748 (dot), jid 15, uid 0, was killed: a thread waited too long to allocate a page

The processes that showed page faults in top were not
limited to that jid. Once the swap space use stopped
growing, things got back to normal.

Maximum Observed Swap Used grew to 27802 MiBytes during
this. Also, where the VM has 64 GiBytes of RAM assigned
as available to it:

91056Mi MaxObs    (Active+Wired+Laundry+SwapUsed)
91056Mi Same Time (Active+Wired+Laundry+SwapUsed+InAct)
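
As a rough check on the figures above, here is a minimal sketch in
Python that compares the maximum observed sum against the 64 GiBytes
assigned to the VM. The constant and function names are illustrative
only and do not come from top or any other tool mentioned here. Note
that the maximum swap figure may be from a different moment than the
maximum sum.

```python
# Figures reported above, all in MiBytes.
VM_RAM_MIB = 64 * 1024       # 64 GiBytes assigned to the FreeBSD VM
MAX_OBS_SUM_MIB = 91056      # MaxObs (Active+Wired+Laundry+SwapUsed)
MAX_OBS_SWAP_MIB = 27802     # Maximum Observed Swap Used (possibly another moment)


def excess_over_ram(total_mib, ram_mib):
    """Return the MiBytes by which total_mib exceeds ram_mib (0 if it fits)."""
    return max(0, total_mib - ram_mib)


excess = excess_over_ram(MAX_OBS_SUM_MIB, VM_RAM_MIB)
ratio = MAX_OBS_SUM_MIB / VM_RAM_MIB

print(f"sum exceeds VM RAM by {excess} MiBytes")
print(f"sum / VM RAM = {ratio:.2f}")
print(f"max observed swap used = {MAX_OBS_SWAP_MIB} MiBytes")
```

The excess is on the same order as the maximum observed swap use,
which is what one would expect if most of what did not fit in RAM
ended up in swap space.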

The prior round of such a notice sequence for a jid:

Apr 12 08:43:14 aarch64-main-pbase kernel: pid 19731 (dot), jid 23, uid 0, was killed: a thread waited too long to allocate a page
Apr 12 08:44:38 aarch64-main-pbase kernel: pid 21592 (dot), jid 23, uid 0, was killed: a thread waited too long to allocate a page
Apr 12 08:45:56 aarch64-main-pbase kernel: pid 21716 (dot), jid 23, uid 0, was killed: a thread waited too long to allocate a page
Apr 12 08:46:44 aarch64-main-pbase kernel: pid 21554 (dot), jid 23, uid 0, was killed: a thread waited too long to allocate a page
Apr 12 08:47:33 aarch64-main-pbase kernel: pid 21814 (dot), jid 23, uid 0, was killed: a thread waited too long to allocate a page
Apr 12 08:48:43 aarch64-main-pbase kernel: pid 21765 (dot), jid 23, uid 0, was killed: a thread waited too long to allocate a page
Apr 12 08:48:54 aarch64-main-pbase kernel: pid 21830 (dot), jid 23, uid 0, was killed: a thread waited too long to allocate a page

And the one before that:

Apr 12 06:52:38 aarch64-main-pbase kernel: pid 12132 (dot), jid 5, uid 0, was killed: a thread waited too long to allocate a page
Apr 12 06:53:23 aarch64-main-pbase kernel: pid 13431 (dot), jid 5, uid 0, was killed: a thread waited too long to allocate a page
Apr 12 06:54:04 aarch64-main-pbase kernel: pid 13608 (dot), jid 5, uid 0, was killed: a thread waited too long to allocate a page
Apr 12 06:54:14 aarch64-main-pbase kernel: pid 13667 (dot), jid 5, uid 0, was killed: a thread waited too long to allocate a page

Hmm. "dot" seems to be rather common for some reason.
Looking at even older history of examples of this
type of message also shows such.
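
One way to check how common a given command is among such notices
is to tally them by jid and command name. The following is a minimal
sketch in Python, assuming each notice sits on a single line (any
e-mail line wrapping already undone) and has the same layout as the
examples above. The regular expression and the tally_kills name are
illustrative, not part of any FreeBSD tool.

```python
import re
from collections import Counter

# Matches the kernel notice layout seen in the examples above.
KILL_RE = re.compile(
    r"pid (?P<pid>\d+) \((?P<comm>[^)]+)\), jid (?P<jid>\d+), "
    r"uid (?P<uid>\d+), "
    r"was killed: a thread waited too long to allocate a page"
)


def tally_kills(lines):
    """Count matching notices per (jid, command name) pair."""
    counts = Counter()
    for line in lines:
        match = KILL_RE.search(line)
        if match:
            counts[(int(match.group("jid")), match.group("comm"))] += 1
    return counts


# Three of the notices quoted above, unwrapped onto single lines.
sample = [
    "Apr 12 06:52:38 aarch64-main-pbase kernel: pid 12132 (dot), jid 5, "
    "uid 0, was killed: a thread waited too long to allocate a page",
    "Apr 12 06:53:23 aarch64-main-pbase kernel: pid 13431 (dot), jid 5, "
    "uid 0, was killed: a thread waited too long to allocate a page",
    "Apr 12 08:43:14 aarch64-main-pbase kernel: pid 19731 (dot), jid 23, "
    "uid 0, was killed: a thread waited too long to allocate a page",
]

for (jid, comm), count in sorted(tally_kills(sample).items()):
    print(f"jid {jid} {comm}: {count}")
```

Feeding it the lines of a saved log file (for example, by iterating
over an opened copy of the messages file) would give the per-jid,
per-command counts for the whole history.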


===
Mark Millard
marklmi at yahoo.com