Date:      Thu, 27 Jan 2022 12:12:20 -0800
From:      Mark Millard <marklmi@yahoo.com>
To:        bob prohaska <fbsd@www.zefox.net>, Mark Johnston <markj@freebsd.org>
Cc:        freebsd-arm@freebsd.org
Subject:   Re: devel/llvm13 failed to reclaim memory on 8 GB Pi4 running -current
Message-ID:  <2C7E741F-4703-4E41-93FE-72E1F16B60E2@yahoo.com>
In-Reply-To: <C8BDF77F-5144-4234-A453-8DEC9EA9E227@yahoo.com>
References:  <20220127164512.GA51200@www.zefox.net> <C8BDF77F-5144-4234-A453-8DEC9EA9E227@yahoo.com>



On 2022-Jan-27, at 11:31, Mark Millard <marklmi@yahoo.com> wrote:

> On 2022-Jan-27, at 08:45, bob prohaska <fbsd@www.zefox.net> wrote:
>
>> Attempts to compile devel/llvm13 on a Pi4 running -current (updated
>> on 20220126) with 8 GB of RAM and 8 GB of swap has failed on two
>> occasions using
>> make -DBATCH > make.log &
>> in /usr/ports/devel/llvm13 using the system compiler. The system is
>> self-hosted.

Context question: ZFS? UFS?

(In things involving memory usage issues, knowing which is
always appropriate because of differences in memory use
patterns.)

>> The first failure reported clang error 139, but the second
>> was different, reporting only:
>> FAILED: tools/flang/lib/Evaluate/CMakeFiles/obj.FortranEvaluate.dir/check-expression.cpp.o
>> along with a console report of
>> +swap_pager: indefinite wait buffer: bufobj: 0, blkno: 1258432, size: 4096
>> +swap_pager: indefinite wait buffer: bufobj: 0, blkno: 627221, size: 8192
>> +swap_pager: indefinite wait buffer: bufobj: 0, blkno: 240419, size: 4096
>> +swap_pager: out of swap space
>
> In recent builds, such as yours, the above "out of swap" is a
> misnomer but is very interesting for what it is actually about.
>
> Mark Johnston later wrote on 2022-Jan-15 about his "git:
> 4a864f624a70 - main - vm_pageout: Print a more accurate message
> to the console before an OOM kill" that produced the above report
> of "out of swap space":
>
> QUOTE
> Hmm, those cases should likely be changed from "out of swap space" to
> "failed to allocate swap metadata" or something like that.
> END QUOTE
>
> Your context proves the metadata problem really happens, so
> the messaging should be fixed to not be misleading.
>
> In my builds I have code that is more explicit:
>
> diff --git a/sys/vm/swap_pager.c b/sys/vm/swap_pager.c
> index 01cf9233329f..280621ca51be 100644
> --- a/sys/vm/swap_pager.c
> +++ b/sys/vm/swap_pager.c
> @@ -2091,6 +2091,7 @@ swp_pager_meta_build(vm_object_t object, vm_pindex_t pindex, daddr_t swapblk)
>                                   0, 1))
>                                       printf("swap blk zone exhausted, "
>                                           "increase kern.maxswzone\n");
> +                               printf("swp_pager_meta_build: swap blk uma zone exhausted\n");
>                               vm_pageout_oom(VM_OOM_SWAPZ);
>                               pause("swzonxb", 10);
>                       } else
> @@ -2121,6 +2122,7 @@ swp_pager_meta_build(vm_object_t object, vm_pindex_t pindex, daddr_t swapblk)
>                                   0, 1))
>                                       printf("swap pctrie zone exhausted, "
>                                           "increase kern.maxswzone\n");
> +                               printf("swp_pager_meta_build: swap pctrie uma zone exhausted\n");
>                               vm_pageout_oom(VM_OOM_SWAPZ);
>                               pause("swzonxp", 10);
>                       } else
>
> The "metadata" is the "swap blk uma zone" and "swap pctrie
> uma zone". Unfortunately, which of the two got the failure is
> still not indicated in the standard builds.
>=20
>> +swp_pager_getswapspace(12): failed
>> +pid 61012 (c++), jid 0, uid 0, was killed: failed to reclaim memory
>
> Absent being able to swap, it tries to reclaim --and that
> too failed. That finally leads to the kills.
>
>> Swap use peaked a little over 50%.
>
> So at around 50% "swap blk uma zone" and/or "swap pctrie uma zone"
> had problems, probably fragmentation related problems.
>
>> After the first failure a restart
>> of make using MAKE_JOBS_UNSAFE=yes ran to completion with one thread.
>>
>> A copy of the build log, logging script and other notes is at
>> http://www.zefox.net/~fbsd/rpi4/20220127/
>>
>> Clang error 139 has been seen several times during make buildworld on
>> a Pi3 running stable/13 with 2 GB of swap as well. Perhaps the two
>> failures are related. The Pi3 failures didn't report out of swap, all
>> were clang error 139 with "failed to reclaim memory". Even with only
>> 1 thread (j1) the failure reproduced.
>>
>
> Note in your report above: obj.FortranEvaluate.dir
>
> If you use the options to disable building flang (a.k.a.,
> the Fortran compiler build), your builds on the RPi4B
> will likely work in the current configuration.
>
> But it looks like you have identified a test context
> for the "swap blk uma zone" and "swap pctrie uma zone"
> handling.
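
For sorting through console output like the above, a rough triage
sketch follows. It is not anything from the kernel or from the port;
the category labels and the function names (classify, summarize) are
made up for illustration. The "swap-metadata-alloc-failed" label
assumes a recent main build where, as described above, "out of swap
space" is printed for the swap metadata case. The console lines are
the ones quoted in the report.

```python
import re

# Console lines copied from the report quoted in this message.
CONSOLE = """\
+swap_pager: indefinite wait buffer: bufobj: 0, blkno: 1258432, size: 4096
+swap_pager: indefinite wait buffer: bufobj: 0, blkno: 627221, size: 8192
+swap_pager: indefinite wait buffer: bufobj: 0, blkno: 240419, size: 4096
+swap_pager: out of swap space
+swp_pager_getswapspace(12): failed
+pid 61012 (c++), jid 0, uid 0, was killed: failed to reclaim memory
"""

# (label, pattern) pairs. The labels are illustrative shorthand only,
# not kernel terminology.
RULES = [
    ("swap-io-wait", re.compile(r"^swap_pager: indefinite wait buffer")),
    # Assumption: on recent main this text is printed for the swap
    # metadata (swap blk / swap pctrie uma zone) case.
    ("swap-metadata-alloc-failed", re.compile(r"^swap_pager: out of swap space")),
    ("getswapspace-failed", re.compile(r"^swp_pager_getswapspace\(\d+\): failed")),
    ("oom-kill", re.compile(r"^pid \d+ \(.+?\), jid \d+, uid \d+, was killed: ")),
]


def classify(line):
    """Return the label of the first rule matching one console line."""
    # Drop the leading '+' that the logging script's diff output adds.
    line = line.strip().lstrip("+")
    for label, pattern in RULES:
        if pattern.search(line):
            return label
    return "other"


def summarize(text):
    """Count how many non-empty lines fall under each label."""
    counts = {}
    for line in text.splitlines():
        if line.strip():
            label = classify(line)
            counts[label] = counts.get(label, 0) + 1
    return counts


if __name__ == "__main__":
    for label, count in summarize(CONSOLE).items():
        print(f"{count:3d}  {label}")
```

Feeding it a saved console log instead of the embedded sample only
requires passing the file's text to summarize().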


===
Mark Millard
marklmi at yahoo.com



