Date: Wed, 2 Oct 2024 08:54:52 -0600
From: Warner Losh <imp@bsdimp.com>
To: Dan Mack <mack@macktronics.com>
Cc: dsdqmzk@hotmail.com, freebsd-current@freebsd.org
Subject: Re: FYI: make's "max_jobs" needs to be separated from -j (now?)
Message-ID: <CANCZdfpnwfsyuRH0p_xFfh5XdUZsLRtex_K%2Bs=DsJDux8t-1wg@mail.gmail.com>
In-Reply-To: <0db2d927-3299-2c0f-2310-d8e386fb31c6@macktronics.com>
References: <Zv0uSUVsyhRtMS27@albert.catwhisker.org>
    <SEYPR02MB5821CB9C62B92369962FCAABB1702@SEYPR02MB5821.apcprd02.prod.outlook.com>
    <cdfa00de-5640-8f09-d501-9fce87072d0d@macktronics.com>
    <SEYPR02MB58216CE43D7E3BC4841B5980B1702@SEYPR02MB5821.apcprd02.prod.outlook.com>
    <0db2d927-3299-2c0f-2310-d8e386fb31c6@macktronics.com>

On Wed, Oct 2, 2024 at 8:42 AM Dan Mack <mack@macktronics.com> wrote:
> On Wed, 2 Oct 2024, dsdqmzk@hotmail.com wrote:
>
> > Dan Mack wrote:
> >> On Wed, 2 Oct 2024, dsdqmzk@hotmail.com wrote:
> >>
> >>> David Wolfskill wrote:
> >>>> I have been tracking stable/ and head (daily, with a few exceptions) for
> >>>> many years, now. Over time, I set up a set of ([t]csh) aliases to
> >>>> simplify the exercise for me.
> >>>>
> >>>> Until yesterday, the "make -j${max_jobs} buildworld" construct had
> >>>> worked without issue, but (yesterday), the invocation failed quite
> >>>> quickly:
> >>>>
> >>>> | Tue Oct 1 11:54:18 UTC 2024
> >>>> | --- buildworld ---
> >>>> | make[1]: "/usr/src/Makefile.inc1" line 362: SYSTEM_COMPILER:
> >>>> Determined that CC=cc matches the source tree. Not bootstrapping a
> >>>> cross-compiler.
> >>>> | make[1]: "/usr/src/Makefile.inc1" line 367: SYSTEM_LINKER:
> >>>> Determined that LD=ld matches the source tree. Not bootstrapping a
> >>>> cross-linker.
> >>>> | --------------------------------------------------------------
> >>>> | >>> World build started on Tue Oct 1 11:54:18 UTC 2024
> >>>> | --------------------------------------------------------------
> >>>> | >>> Deleting stale files in build tree...
> >>>> | 0.14 real 0.23 user 0.10 sys
> >>>> | *** [_cleanworldtmp] Error code 6
> >>>> |
> >>>> | make[1]: stopped making "buildworld" in /usr/src
> >>>> | .ERROR_TARGET='_cleanworldtmp'
> >>>> | .ERROR_META_FILE=''
> >>>>
> >>>> On a bit of a whim, I tried adjusting the "max_jobs" values (downward),
> >>>> which didn't help, but removing the "-j14" entirely did not produce a
> >>>> failure.
> >>>>
> >>>> On the other hand, rebuilding clang/llvm with a single core on a laptop
> >>>> (when I actually want to be able to use the laptop later in the day
> >>>> while I'm at work) didn't seem productive.
> >>>>
> >>>> A bit more rather randomly "trying stuff" yielded the result that while
> >>>>
> >>>> make -j14 buildworld
> >>>>
> >>>> failed (as described above),
> >>>>
> >>>> make -j 14 buildworld
> >>>>
> >>>> carries on as before -- it's building lib/clang (and using multiple
> >>>> cores to do so).... :-}
> >>>
> >>> Just got the same error, but both invocations didn't work, and I noticed
> >>> that bootstrapped version of mtree failed to run because of (now)
> >>> missing libmd.so.6. I think it's not really related to whitespace
> >>> between -j and jobs number, rather you had to (re)build the bootstrap
> >>> tools.
> >>
> >> I have been building current twice daily for a while and didn't notice
> >> this regression but I do have the space after "-j"
> >>
> >> #!/bin/sh
> >> make -j 16 buildworld > /logs/bw.$$ 2>&1 && \
> >> make -j 8 kernel KERNCONF=GENERIC > /logs/bk.$$ 2>&1 && \
> >> sync && reboot
> >
> > Do you also do `make delete-old-libs`?
> >
> >> I grepped all my logs across 3 servers and did not see a single instance
> >> of [_cleanworldtmp] Error code ... in any of the logs. What was the
> >> hash of the build you were on there, I can try to reproduce it quickly
> >> (but it might only trigger with your builddir state I guess)
> >
> > If I understand the problem correctly, it should be as easy as:
> >
> > 1. build on pre-e7a629c851d system
> > 2. install/reboot
> > 3. make delete-old-libs
> > 4. try to build world/kernel, which fails as above; I think make
> > kernel-toolchain was the one failing because mtree failed to run
> > (because libmd.so.6 is now gone)
> >
> > In any case, wiping out /usr/obj solved it for me.
>
> Ack, okay. I can't trigger it with a fresh /usr/obj or with my existing one,
> but in any event the error number 6 probably refers to a path or directory
> missing while doing a parallel build given some input state :-)
>
> #define ENXIO 6 /* Device not configured */
>
ENXIO is usually reserved for hardware errors, such as a device disappearing
in a block I/O context, so I'm not sure this theory holds up.

But shell error exit statuses are largely independent of errnos.
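
A minimal sketch of that point, as an illustration only and not taken from the build itself. The path /nonexistent/file is a placeholder assumed not to exist, and the variable names are made up for the example:

```shell
#!/bin/sh
# A process's exit status is whatever value it passes to exit(3).
# It is not required to be an errno value.

# Exit with status 6 without ENXIO (errno 6) being involved at all.
status_plain=0
sh -c 'exit 6' || status_plain=$?
echo "plain exit status: $status_plain"

# cat fails here because open(2) sets errno to ENOENT (2), yet cat reports
# the failure with its own nonzero exit status, not with the errno value.
# /nonexistent/file is a placeholder path assumed not to exist.
status_cat=0
cat /nonexistent/file 2>/dev/null || status_cat=$?
echo "cat exit status: $status_cat"
```

So "Error code 6" from make most likely just means that a command run by the _cleanworldtmp target exited with status 6; what that status means depends on the command, not on errno.h.
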
Warner
