Date: Sat, 3 Jul 2021 17:43:51 -0700
From: Mark Millard via freebsd-ports <freebsd-ports@freebsd.org>
To: bob prohaska <fbsd@www.zefox.net>
Cc: FreeBSD ports <freebsd-ports@freebsd.org>, freebsd-arm <freebsd-arm@freebsd.org>, FreeBSD Toolchain <freebsd-toolchain@freebsd.org>
Subject: Re: llvm10 build failure on Rpi3
Message-ID: <B836EE78-0534-4D8D-A0DD-486193FBF511@yahoo.com>
In-Reply-To: <20210703215445.GA18768@www.zefox.net>
References: <C64D1A3F-A42E-42E3-8491-4DE9F6A96CFB@yahoo.com>
 <43513842-6FC0-4A89-8F0C-9EB2B328A5ED@yahoo.com>
 <9CFE71E2-23C3-4072-A8AD-74EDB339A146@yahoo.com>
 <A4669E1F-6DA9-492C-B06C-12AABE60FCEB@yahoo.com>
 <F2A8E1C3-EAAD-448A-9A97-979CC9ED9BE7@yahoo.com>
 <60EEFD09-97DE-4B4F-BAFD-61B96EF60E27@yahoo.com>
 <F727FF9A-CDFB-4C9C-8333-0FEA6C54976A@yahoo.com>
 <77A35ACF-275F-44C8-AEEE-4EFE5B5CBEA4@yahoo.com>
 <20210703182546.GA17871@www.zefox.net>
 <380184FB-6BA1-4C2D-9C6B-E249C2CF1317@yahoo.com>
 <20210703215445.GA18768@www.zefox.net>
On 2021-Jul-3, at 14:54, bob prohaska <fbsd at www.zefox.net> wrote:

> On Sat, Jul 03, 2021 at 01:15:19PM -0700, Mark Millard wrote:
>>
>> So you still have not tried an artifacts or snapshot kernel+world?
>>
> Not yet.
>
>>> Eventually I resorted to running make in devel/llvm10, to my surprise it
>>> ran to completion.
>>
>> Interesting.
>>
>> Was this -j4? -j1? -j2? Any other interesting characteristics
>> for how it was run?
>>
> Nothing special was done. IIRC, it was make -DBATCH > make.log in
> the background. From top's screen it looked like -j4.
>
>> It would be interesting to see if building in a chroot
>> in that make style also worked (or a non-poudriere jail).
>>
>
> Can you point me to instructions for doing the experiment?

I'll deal with this in a separate reply.

>>> It also ran make package successfully. Again I tried to
>>> build just devel/llvm10 using poudriere, again getting "expected expression".
>>>
>>> At that point I resized the swap partitions to 1 GB each and tried poudriere
>>> on devel/llvm10. That got rid of the excessive swap warnings, but didn't help.
>>> Finally I placed
>>> MAKE_JOBS_NUMBER=2
>>> in /usr/local/etc/poudriere.d/make.conf and tried again. That still failed,
>>> still with "expected expression".
>>
>> I'll note that the running build shows Load Averages
>> of under 3. So the MAKE_JOBS_NUMBER=2 seems to be working.
>>
>>> Since devel/llvm10 had created a package successfully, I tried slipping a copy
>>> into poudriere's package directory, hoping it would find and use the package
>>> to make further progress.
>>> Unfortunately, poudriere seems to remember the failure
>>> and won't use the proffered package.
>>
> [large snip which convinced me to give up on tricking poudriere into
> using a package constructed by make]
>>
>> Going in a different direction, one way to force a build to
>> start over after a failure is to: rm -fr PATH/.building
>> before starting a new bulk build. This might be appropriate
> I'm missing something here: what does PATH represent? There's
> nothing called .building under /usr/local/poudriere, at least
> after the run finishes.

Part of how this works is that .building/ is initially populated
with a shadow copy of the already existing .latest/ , mostly via
hard links, with some top level files actually copied.

If the status of the bulk run reaches stopped:done: then
.building/ is mv'd (renamed) to be of the form .real_*/ with a
new match for the * . The links are then adjusted to point to
the new .real_*/ and the old .real_*/ is removed.

In your context, this happens inside:

/usr/local/poudriere/data/packages/main-default/

So, yes, your run that reached stopped:done: no longer has a
.building/ .

By contrast, say you ^C the bulk run or that it reaches the
stopped:crashed: state instead of stopped:done: . Then the
.building/ would still be present, as would the pre-existing
.real_*/ and the links that use it. This is the context for the
next bulk run reporting:

"Using packages from previously failed build: ${PACKAGES}/.building"

>> if one suspects a problem of a kind that did not stop a
>> build but produced something for a build that fails to operate
>> correctly.
>>
> Such as a corrupt llvm-tblgen?

Yep, possibly via it depending on something else that has
problems.

>> So lang/rust finished. That is interesting because it includes an
>> llvm build internally.
>>
>
> Does that build invoke the same llvm-tblgen?

Every devel/llvm* build builds its own llvm-tblgen .
lang/rust would build its own too.
And the system llvm support builds its own as well.

> [snip]
>> Again, poudriere does not control memory initialization in
>> the processes in the builders.
>>
>
> For some reason I got the idea that whatever asked for memory to use
> was responsible for initializing it.

Part of the point of memory management libraries having a way to
be told to fill in things like 0xA5u bytes is to get hints about
contexts that end up with memory not explicitly initialized by
the requesting program. Such is why I had you try the contrasting
junk:false case in /etc/malloc.conf . The results showed what the
memory allocation library initialized with instead of something
specific to the code requesting the allocation.

> Certainly not the kernel.....

The kernel fills in bytes in some user-space memory as part
of doing various requested operations. In such cases it is
possible for the kernel to not have filled in the memory like
it should have. It is also possible for the kernel to replace
bytes seen by user-space in memory that it should not touch.
There is an example of an on-going issue of this kind for the
32-bit powerpc kernels that cover using old PowerMacs.

>>> The fact that the stoppage reported looks like
>>> a syntax error specific to devel/llvm10 which is unaffected by swap pressure
>>> makes it seem unrelated to kernel or swap constraints.
>>
>> The files with the syntax errors are ones generated by llvm-tblgen
>> during the build and it is the output of llvm-tblgen that is corrupt,
>> showing evidence of having used memory not initialized like it should
>> have been.
>>
>
> Wouldn't that point suspicion at llvm-tblgen, of whatever version
> LLVM is actually doing the work?

It points at llvm-tblgen and/or something(s) that llvm-tblgen
depends on. Either way, the observed failure is from the
llvm-tblgen output being incorrect and later complained about.

devel/llvm10 builds its own llvm-tblgen for its own use.
Each devel/llvm* does.
(As does the system's llvm*.)

There is also the variability in which llvm-tblgen output is
messed up: it is always some example of:

lib/Target/*/*GenGlobalISel.inc

but which values match the *'s tends to vary from build attempt
to build attempt. It suggests that some sort of race condition
is involved.

>>> AIUI, the hardware of the Pi4 is considerably different from the Pi3 in terms
>>> of memory management, noted from an interview with Eben Upton on YouTube.
>>
>> Why would Eben Upton be talking about FreeBSD's memory management?
>>
> He was talking about the Pi4 hardware and how it differed from the Pi3

Which is not memory management as such.

>> I suspect that the talk is not about what you think it is about,
>> but some narrower aspects than the overall memory management.
>>
>
> I thought it had something to do with added DMA capability. The video is at
> https://www.youtube.com/watch?v=hyj-7mTnumI
> In light of the discussion about llvm-tblgen I'm doubtful it's relevant,
> but it's not the worst way to waste an hour.
>
>>
>>> Is there any sort of sanity test for the poudriere system? If I delete and
>>> re-create the existing jail can the existing package library be preserved
>>> and re-used? If not, that's OK, I'd just like to know beforehand.
>>>
>>
>> # poudriere jail -jNAME -d
>> # poudriere jail -c -jNAME -m null -M /WORLDPATH -S /SRCPATH -v 14.0-CURRENT
>>
>> should work fine. But really all that you are
>> doing (using an example from my environment)
>> is deleting and rewriting a few very small files
>> in a directory with the jail's name:
>>
> So, in my case /usr/local/poudriere/poudriere-system?

After the delete would be:

poudriere jail -c -jNAME -m null -M /usr/local/poudriere/poudriere-system -S /usr/src -v 14.0-CURRENT

Same as in your:

http://www.zefox.org/~bob/readme

> (using the nomenclature in your sample instructions).
> That would leave /usr/local/poudriere/data intact....

Yep.
The delete does have an option (-C ???) for causing more to be
deleted under /usr/local/poudriere/data/ . (Despite documentation
claims otherwise, it did not seem to delete packages when
requested.)

> I'm starting to understand why you think it unlikely
> to help.
>
>> The deletion/replacement of timestamp may have rebuild
>> consequences from appearing to have changed (or just
>> being missing).
>>
> If timestamps guide decisions on what to make and when,
> that might be significant. Not sure how I might've screwed
> them up, but in my hands anything is possible 8-)

I took a quick look and did not notice any timestamp
comparisons controlling anything.

>> Nothing about any of those is going to change how memory
>> initialization is working in llvm-tblgen's operation
>> for generating any *GenGlobalISel.inc files, other than
>> if the timestamp forces some sort of rebuild from scratch
>> of some build dependencies first.
>>
> Maybe this should be obvious, but which llvm-tblgen is in
> action? the one from the system, (12.0.1) or something
> else?
>

devel/llvm10 builds its own llvm-tblgen and uses it.
Every devel/llvm* build builds its own llvm-tblgen .

Looking in the .log file for a build, there are lines
containing commands that start out with (from my example
devel/llvm10 build context):

/wrkdirs/usr/ports/devel/llvm10/work/.build/bin/llvm-tblgen

Before any of those, there are commands associated with
building that bin/llvm-tblgen .

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away in early 2018-Mar)