Skip site navigation (1)Skip section navigation (2)
Date:      Sun, 27 Mar 2022 09:02:49 -0700
From:      Mark Millard <marklmi@yahoo.com>
To:        Dimitry Andric <dim@FreeBSD.org>
Cc:        "toolchain@freebsd.org" <toolchain@FreeBSD.org>
Subject:   Re: [package - 130arm64-default][lang/gcc12-devel] Failed for gcc12-devel-12.0.1.s20220306_2 in build/runaway
Message-ID:  <17DDA9F6-CFD0-479B-B3B5-51B570893863@yahoo.com>
In-Reply-To: <2D081409-B3E7-422D-98C4-AC7394915F72@yahoo.com>
References:  <202203261416.22QEGtRR065106@ampere3.nyi.freebsd.org> <A4CB59C1-229B-4F61-837D-5B557DFA8339@FreeBSD.org> <21D1C2BF-151E-4252-936C-B5B22C9C8071@yahoo.com> <75A61EB5-70D1-4E1F-89D2-524407854D6F@yahoo.com> <FE5F8CCE-BBC2-4A3F-B95D-22B51C6A9833@yahoo.com> <17CAD266-C7C0-4CD7-B255-3DC07F422EB5@yahoo.com> <2D081409-B3E7-422D-98C4-AC7394915F72@yahoo.com>

next in thread | previous in thread | raw e-mail | index | archive | help
On 2022-Mar-27, at 08:52, Mark Millard <marklmi@yahoo.com> wrote:

> On 2022-Mar-26, at 16:15, Mark Millard <marklmi@yahoo.com> wrote:
>=20
>> On 2022-Mar-26, at 14:37, Mark Millard <marklmi@yahoo.com> wrote:
>>=20
>>> On 2022-Mar-26, at 13:16, Mark Millard <marklmi@yahoo.com> wrote:
>>>=20
>>>> On 2022-Mar-26, at 12:35, Mark Millard <marklmi@yahoo.com> wrote:
>>>>=20
>>>>> On 2022-Mar-26, at 07:26, Dimitry Andric <dim@FreeBSD.org> wrote:
>>>>>=20
>>>>>> On 26 Mar 2022, at 15:16, pkg-fallout@freebsd.org =
<pkg-fallout@FreeBSD.org> wrote:
>>>>>>>=20
>>>>>>> You are receiving this mail as a port that you maintain
>>>>>>> is failing to build on the FreeBSD package build server.
>>>>>>> Please investigate the failure and submit a PR to fix
>>>>>>> build.
>>>>>>>=20
>>>>>>> Maintainer:     toolchain@FreeBSD.org
>>>>>>> Log URL:        =
http://ampere3.nyi.freebsd.org/data/130arm64-default/60ab72786154/logs/gcc=
12-devel-12.0.1.s20220306_2.log
>>>>>>> Build URL:      =
http://ampere3.nyi.freebsd.org/build.html?mastername=3D130arm64-default&bu=
ild=3D60ab72786154
>>>>>>=20
>>>>>> So there isn't any actual error message in this log, except at =
the end:
>>>>>>=20
>>>>>> ...
>>>>>> =3D>> Cleaning up wrkdir
>>>>>> =3D=3D=3D>  Cleaning for gcc12-devel-12.0.1.s20220306_2
>>>>>> Killed
>>>>>> build of lang/gcc12-devel | gcc12-devel-12.0.1.s20220306_2 ended =
at Sat Mar 26 14:16:58 UTC 2022
>>>>>> build time: 12:31:35
>>>>>> !!! build failure encountered !!!
>>>>>>=20
>>>>>> It looks like the last command being run before "Killed" is the =
cc1plus
>>>>>> executable being linked with LTO, so I am assuming the build is =
killed
>>>>>> due to an out-of-memory condition?
>>>>>>=20
>>>>>> But this is only visible to people that have access to the =
machine the
>>>>>> poudriere instance is running on. Can somebody with access please =
check?
>>>>>>=20
>>>>>=20
>>>>> I do not have access but I've started a poudriere build
>>>>> of my own on a HoneyComb. I've a patched top that monitors
>>>>> and reports various Maximum Observed (MaxObs????) figures,
>>>>> 64 GiBytes of RAM and slightly over 246 GiBytes of swap.
>>>>> So hopefully it will report on about how big the memory use
>>>>> gets. But it is allowed to use all 16 cores and there will
>>>>> be no competing bulk builds using resources. So not a match
>>>>> to the build server context.
>>>>>=20
>>>>> Note: It is a ZFS context, so MaxObsWired is normally large
>>>>> and shrinks over times where memory needs to be used for
>>>>> other things. So the primary memory figures would be:
>>>>>=20
>>>>> MaxObsSwapUsed (if any)
>>>>> MaxObsActive
>>>>> MaxObs(Act+Lndry+SwapUsed)
>>>>>=20
>>>>>=20
>>>>> Side Note:
>>>>>=20
>>>>> =
http://ampere3.nyi.freebsd.org/build.html?mastername=3D130arm64-default&bu=
ild=3D60ab72786154
>>>>>=20
>>>>> reports a Time of 11:48:41 but the log reports "build time: =
12:31:35".
>>>>> My guess is that processing the log file for extracting the type =
of
>>>>> error makes some (much?) of the difference. (Type being =
runaway_process
>>>>> in this case.)
>>>>>=20
>>>>>=20
>>>>=20
>>>> I did just observe a cc1plus take somewhat over 30min
>>>> of CPU time before completing and the lto1 related activity
>>>> starting. It was under 5 GiBytes MaxObs(Act+Lndry+SwapUsed)
>>>> [No swap use observed] before the lto1 related activity
>>>> started.
>>>>=20
>>>> For the lto1 related activity MaxObs(Act+Lndry+SwapUsed)
>>>> has, so far, gotten up to around 12 GiBytes, still
>>>> no swap use observed:
>>>>=20
>>>> 12079Mi MaxObsActive
>>>> 12278Mi MaxObs(Act+Lndry+SwapUsed)
>>>>=20
>>>> I'll note that:
>>>>=20
>>>> last pid: . . .;  load averages:  . . . MaxObs:  28.02,  16.88,  =
15.82
>>>> . . . threads:   . . . running, . . . sleeping, 77 MaxObsRunning
>>>>=20
>>>> So, on the timescale of the first load average, it does
>>>> not always stay limited to the hardware threads available.
>>>>=20
>>>> No process with sustained CPU activity sticks around across
>>>> the lto1 activity. So I'll not be able to observe much about
>>>> cpu time.
>>>>=20
>>>> The elasped time doing lto1 activity has been going for a
>>>> while but I'm unlikely to be able to observe its end happen.
>>>> So I'll likely not have a good clue about that.
>>>>=20
>>>=20
>>> Looks it spent about 1.5 or so hours on the particular block of
>>> lto1 related activity. For reference, somewhat after that:
>>>=20
>>> last pid: . . .;  load averages:  . . . MaxObs:  28.02,  17.04,  =
16.87
>>>=20
>>> The 16 core are Cortex-A72's.
>>>=20
>>> The following did not change (so far):
>>>=20
>>> 12079Mi MaxObsActive
>>> 12278Mi MaxObs(Act+Lndry+SwapUsed)
>>>=20
>>> Still no observed swap use reported.=20
>>>=20
>>>=20
>>> The build is continuing. The build phase has been a little
>>> over 2.5 hr so far.
>>>=20
>>=20
>> I'll note that:
>>=20
>> =
http://ampere1.nyi.freebsd.org/build.html?mastername=3D130arm64-default&bu=
ild=3Dd09582c49c48
>>=20
>> reports for lang/gcc12-devel a build time of 21:53:01 .
>>=20
>> (So lang/gcc12-devel has built after LTO was enabled.)
>>=20
>>=20
>> Back to my lang/gcc12-devel build . . .
>>=20
>> The earlier large block of lto1 activity was from 7 =
.usr/.local/bin/ld 's
>> going in parallel. (I had a ps -auxd output in a file to look at.) =
Another
>> block of 7 is going and taking significant time. (Seen via top.)
>>=20
>> [Waiting for it to get past that block of lto1 related activity. . . =
.]
>>=20
>> Hmm. I was not there to observe the end of the lto1 activity and what
>> immediately followed. It is now in package:
>>=20
>> [00:26:46] [01] [00:00:00] Building lang/gcc12-devel | =
gcc12-devel-12.0.1.s20220306_1
>> . . .
>> load: 1.36  cmd: sh 10124 [nanslp] 15151.01r 21.72u 70.93s 0% 4096k
>> mi_switch+0x104 sleepq_catch_signals+0x41c sleepq_timedwait_sig+0x14 =
_sleep+0x1c0 kern_clock_nanosleep+0x1ac sys_nanosleep+0x3c =
do_el0_sync+0x5cc handle_el0_sync+0x40=20
>> [13_0R-CA72-default] [2022-03-26_11h31m18s] [parallel_build:] Queued: =
19 Built: 18 Failed: 0  Skipped: 0  Ignored: 0  Fetched: 0  Tobuild: 1   =
Time: 04:12:31
>> ID  TOTAL              ORIGIN   PKGNAME                          =
PHASE PHASE    TMPFS     CPU% MEM%
>> [01] 03:45:51 lang/gcc12-devel | gcc12-devel-12.0.1.s20220306_1 =
package 00:08:12 32.00 KiB 100% 0.2%
>> [04:12:37] Logs: =
/usr/local/poudriere/data/logs/bulk/13_0R-CA72-default/2022-03-26_11h31m18=
s
>>=20
>> So somewhat over 3 hrs to reach the package stage.
>>=20
>> The following have not changed since the prior reported figures:
>>=20
>> last pid: . . .;  load averages:  . . . MaxObs:  28.02,  17.04,  =
16.87
>> . . . threads:   . . . running, . . . sleeping, 77 MaxObsRunning
>> and:
>> 12079Mi MaxObsActive
>> 12278Mi MaxObs(Act+Lndry+SwapUsed)
>> (No report of observed swap usage.)
>>=20
>> [Waiting for the "Finished" message from poudriere. . . .]
>>=20
>> [04:33:13] [01] [04:06:27] Finished lang/gcc12-devel | =
gcc12-devel-12.0.1.s20220306_1: Success
>>=20
>=20
> Just for reference: I tested the build time for reverting the
> code in my context:
>=20
> # git -C /usr/ports/ diff lang/gcc12-devel/Makefile
> diff --git a/lang/gcc12-devel/Makefile b/lang/gcc12-devel/Makefile
> index 644abf2cbb86..fab28c952f80 100644
> --- a/lang/gcc12-devel/Makefile
> +++ b/lang/gcc12-devel/Makefile
> @@ -83,7 +83,7 @@ CONFIGURE_OUTSOURCE=3D  yes
> .if empty(PORT_OPTIONS:MBOOTSTRAP)
> CONFIGURE_ARGS+=3D--disable-bootstrap
> .else
> -CONFIGURE_ARGS+=3D--with-build-config=3Dbootstrap-lto-noplugin
> +CONFIGURE_ARGS+=3D--with-build-config=3Dbootstrap-debug
> ALL_TARGET=3D    bootstrap-lean
> .endif
> INSTALL_TARGET=3D        install-strip
>=20
> The result was 01:38:45 for using bootstrap-debug:
>=20
> [01:39:00] [01] [01:38:45] Finished lang/gcc12-devel | =
gcc12-devel-12.0.1.s20220306_1: Success
>=20
> instead of the earlier 04:06:27 for using bootstrap-lto-noplugin :
>=20
> [04:33:13] [01] [04:06:27] Finished lang/gcc12-devel | =
gcc12-devel-12.0.1.s20220306_1: Success
>=20

Hmm. I forgot to quote about memory use and such as well:

last pid: . . .;  load averages:   . . . MaxObs:  17.13,  16.52,  13.87
. . . threads:   . . . running, . . . sleeping, 22 MaxObsRunning
5307Mi MaxObsActive
5317Mi MaxObs(Act+Lndry+SwapUsed)

vs., for example, the 12278Mi MaxObs(Act+Lndry+SwapUsed)
when bootstrap-lto-noplugin is used.

=3D=3D=3D
Mark Millard
marklmi at yahoo.com




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?17DDA9F6-CFD0-479B-B3B5-51B570893863>