Date:      Mon, 17 Nov 2025 19:53:15 -0800
From:      Adrian Chadd <adrian@freebsd.org>
To:        bob prohaska <fbsd@www.zefox.net>
Cc:        Carl Shapiro <cshapiro@panix.com>, Ronald Klop <ronald@freebsd.org>, freebsd-arm@freebsd.org,  freebsd-current@freebsd.org
Subject:   Re: Still seeing Failed assertion: "p[i] == 0" on armv7 buildworld
Message-ID:  <CAJ-Vmo=TbT7nD7rBrNnq3cutwMp9f7WXtQ-k9mUBne5ht4zGWg@mail.gmail.com>
In-Reply-To: <aRtBYaaa0n3_lwar@www.zefox.net>
References:  <aOvTG-20QRJtJJwf@int21h> <CANCZdfrJ8rph_rkT3Mk-sNYKNspoV15SvHWLsahzS0HnULi4ww@mail.gmail.com> <aO068RrAehdiHOoZ@www.zefox.net> <aRUJPryA4Vmu8dDD@www.zefox.net> <4957be52-e57f-4f5f-9626-d0f706480fe1@FreeBSD.org> <87ldkai9lu.fsf@panix.com> <aRXuLTN4hkGykHIl@www.zefox.net> <877bvthymv.fsf@panix.com> <aRdJ5xYeKEmhuIgh@www.zefox.net> <ouy1pm0nued.fsf@panix3.panix.com> <aRtBYaaa0n3_lwar@www.zefox.net>


(random reply, sorry bob)

I think I saw someone say they can trigger it with a single very large
source file, is that right? No need for parallelism, just build that one
file?

If so, please pipe up. I'd like to see if you can get that file over to Mark
for his armv8 box, and then we can try a few things (like using cpuset to pin
the compilation to a single core so it doesn't migrate between cores).


-adrian


On Mon, 17 Nov 2025 at 07:37, bob prohaska <fbsd@www.zefox.net> wrote:

> On Fri, Nov 14, 2025 at 05:04:10PM -0500, Carl Shapiro wrote:
> > bob prohaska <fbsd@www.zefox.net> writes:
> >
> > > Those files have been overwritten by restarting the buildworld sessions.
> > > They tend to be large and difficult to synchronize with the .cpp and .sh
> > > files generated by the crash. It could be done if it's useful.
> >
> > At least from the perspective of debugging malloc(3), they'd be useful,
> > even if the files for reproducing the crash are not synchronized with
> > the std{err,out} output.  For example, there might be other log messages
> > generated by jemalloc.
> >
> > I need a moment to look at the code and step through what it is doing on
> > FreeBSD but my first guess is that there might just be an incorrect
> > assumption about committed memory always coming back zeroed.  That
> > should be true on 64-bit Linux when MADV_DONTNEED is used but not true
> > if another advice is used like MADV_FREE on either FreeBSD or Linux.  It
> > is always possible that the kernel is mishandling some memory but I would
> > like to rule out jemalloc itself before pointing a finger there.
>
> Here is an example of both the buildworld.log file and the generated
> diagnostic files, which for some reason didn't include .sh and .cpp files.
>
>
> http://www.zefox.net/~fbsd/assertion_failure/hostname_www.zefox.org/buildworld.log
>
> http://www.zefox.net/~fbsd/assertion_failure/hostname_www.zefox.org/symbolizer-input-bcaebf
>
> http://www.zefox.net/~fbsd/assertion_failure/hostname_www.zefox.org/symbolizer-output-1aa401
>
> This host's particular buildworld attempt has been going on for a long
> time, to the extent that world and kernel are mismatched:
> root@www:/usr/src # uname -KU
> 1600000 1500063
> The immediate goal is to get them back in sync.
>
> Thanks for reading,
>
> bob prohaska
>
>
>
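
Carl's guess in the quoted text above, that memory handed back to the
kernel may not always come back zeroed, can be probed from userland. The
following is a minimal sketch in C, not jemalloc's actual code: it maps one
anonymous page, dirties it, passes it to madvise(2) with a chosen advice,
and counts the bytes that read back nonzero. The names count_nonzero,
nonzero_after_advice and ADVICE_DEMO_MAIN are made up for this
illustration. With MADV_FREE the kernel is allowed to return either the old
contents or a zero-filled page, so a nonzero count there would not by itself
point at a kernel bug; it would only show that a "p[i] == 0" style check is
not safe after that advice.

```c
#define _DEFAULT_SOURCE /* expose MAP_ANON and madvise() under strict -std= on glibc */
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Count the bytes in p[0..n) that are not zero. */
size_t count_nonzero(const unsigned char *p, size_t n)
{
    size_t nonzero = 0;
    for (size_t i = 0; i < n; i++) {
        if (p[i] != 0)
            nonzero++;
    }
    return nonzero;
}

/*
 * Map one anonymous page, dirty it, apply the given madvise(2) advice,
 * and return how many bytes read back nonzero afterwards.
 * Returns (size_t)-1 if any system call fails.
 */
size_t nonzero_after_advice(int advice)
{
    long pagesize = sysconf(_SC_PAGESIZE);
    if (pagesize <= 0)
        return (size_t)-1;
    size_t len = (size_t)pagesize;

    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANON, -1, 0);
    if (p == MAP_FAILED)
        return (size_t)-1;

    /* A fresh anonymous mapping is zero-filled; this is the safe case. */
    assert(count_nonzero(p, len) == 0);

    memset(p, 0xa5, len); /* dirty the whole page */

    if (madvise(p, len, advice) != 0) {
        munmap(p, len);
        return (size_t)-1;
    }

    /* What the page holds now depends on the advice and on the OS. */
    size_t nonzero = count_nonzero(p, len);
    munmap(p, len);
    return nonzero;
}

#ifdef ADVICE_DEMO_MAIN
int main(void)
{
    /* A value of (size_t)-1 printed here means a system call failed. */
    printf("MADV_DONTNEED: %zu nonzero bytes\n",
           nonzero_after_advice(MADV_DONTNEED));
#ifdef MADV_FREE
    printf("MADV_FREE:     %zu nonzero bytes\n",
           nonzero_after_advice(MADV_FREE));
#endif
    return 0;
}
#endif
```

Built with -DADVICE_DEMO_MAIN, it prints one count per advice. The numbers
may differ between FreeBSD and Linux and from run to run, which is the
point of the experiment.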


Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CAJ-Vmo=TbT7nD7rBrNnq3cutwMp9f7WXtQ-k9mUBne5ht4zGWg>