Date: Mon, 6 Mar 2023 15:05:27 -0800
From: Mark Millard <marklmi@yahoo.com>
To: Lorenzo Salvadore <developer@lorenzosalvadore.it>
Cc: Brooks Davis <brooks@FreeBSD.org>, "salvadore@freebsd.org" <salvadore@FreeBSD.org>, FreeBSD Mailing List <freebsd-ports@freebsd.org>
Subject: Re: armv7 lang/gcc12 "no bootstrap" build via system clang 15.0.7 based poudriere build ends up stuck in a small loop
Message-ID: <EB917CA9-CC67-4F79-8EBD-6BE82B021D45@yahoo.com>
In-Reply-To: <480C8278-DC30-40D6-AED2-F52F59E78EBC@yahoo.com>
References: <F536BC00-49D3-41F8-A328-EA10FD21E1DC.ref@yahoo.com> <F536BC00-49D3-41F8-A328-EA10FD21E1DC@yahoo.com> <2HOLCFE6Z_cOyGycU4ZBU7Lf6kcqohVx7tiLiRLzdjMEc6a8DFeH1IaJqdPNJOqFVTh1MGE7_UUJLcg2gg0UbTZIHZl72NbaNEsqrJwJ3xA=@lorenzosalvadore.it> <93707ED2-F529-49DE-A018-794827F56247@yahoo.com> <7AA0AE73-87CC-4B26-92B2-A0EC4281F429@yahoo.com> <480C8278-DC30-40D6-AED2-F52F59E78EBC@yahoo.com>
[devel/llvm16 use added: still gets stuck in my context.]

On Mar 6, 2023, at 10:13, Mark Millard <marklmi@yahoo.com> wrote:

> [Backtrace added.]
>
>> On Mar 6, 2023, at 09:46, Mark Millard <marklmi@yahoo.com> wrote:
>>
>> [Some more context notes.]
>>
>> On Mar 6, 2023, at 09:12, Mark Millard <marklmi@yahoo.com> wrote:
>>
>>> On Mar 6, 2023, at 08:37, Lorenzo Salvadore <developer@lorenzosalvadore.it> wrote:
>>>
>>>> ------- Original Message -------
>>>> On Monday, March 6th, 2023 at 9:46 AM, Mark Millard <marklmi@yahoo.com> wrote:
>>>>
>>>>> Under main that has clang 15.0.7, I've had to locally
>>>>> switch to using the likes of:
>>>>>
>>>>> OPTIONS_DEFAULT_armv7=STANDARD_BOOTSTRAP
>>>>>
>>>>> (to express it in Makefile terms) for lang/gcc12 in order
>>>>> to avoid the following.
>>>>>
>>>>> The no bootstrap build ends up stuck in a small loop in partition_union
>>>>> (in cc1):
>>>>>
>>>>> (gdb) info threads
>>>>> Id Target Id Frame
>>>>> * 1 LWP 632886 of process 27787 0x016eb82c in partition_union ()
>>>>> (gdb) bt
>>>>> #0 0x016eb82c in partition_union ()
>>>>> #1 0x0133e6ec in var_union(_var_map*, tree_node*, tree_node*) ()
>>>>> #2 0x013218e4 in attempt_coalesce(_var_map*, ssa_conflicts*, int, int, __sFILE*) ()
>>>>> #3 0x013203d0 in coalesce_ssa_name(_var_map*) ()
>>>>> #4 0x012c66b4 in rewrite_out_of_ssa(ssaexpand*) ()
>>>>> #5 0x0082c094 in (anonymous namespace)::pass_expand::execute(function*) ()
>>>>> #6 0x00fd6ff0 in execute_one_pass(opt_pass*) ()
>>>>> #7 0x00fd8380 in execute_pass_list_1(opt_pass*) ()
>>>>> #8 0x00fc6df0 in execute_pass_list(function*, opt_pass*) ()
>>>>> #9 0x00880c20 in cgraph_node::expand() ()
>>>>> #10 0x00882d10 in symbol_table::compile() ()
>>>>> #11 0x00883454 in symbol_table::finalize_compilation_unit() ()
>>>>> #12 0x0120e204 in compile_file() ()
>>>>> #13 0x0120d9d4 in toplev::main(int, char**) ()
>>>>> #14 0x01646c28 in main ()
>>>>> (gdb) finish
>>>>> Run till exit from #0 0x016eb82c in partition_union ()
>>>>>
>>>>> It never exits. I've walked through the short loop, which ends
>>>>> up with data that leads to no progress: the bne is always taken
>>>>> and the loop reaches a state in which none of the values
>>>>> involved change.
>>>>>
>>>>> truss shows no output and no subroutines are called in the
>>>>> few-instruction-long loop.
>>>>>
>>>>> I ran multiple tests of "no bootstrap" and all failed the
>>>>> same way.
>>>>>
>>>>> Such would not be a good thing for the FreeBSD armv7 package
>>>>> build server.
>>>>>
>>>>> Also seen via lldb:
>>>>>
>>>>> (lldb) bt
>>>>> * thread #1, name = 'cc1', stop reason = signal SIGSTOP
>>>>> * frame #0: 0x016eb82c cc1`partition_union + 152
>>>>>   frame #1: 0x0133e6ec cc1`var_union(_var_map*, tree_node*, tree_node*) + 104
>>>>>   frame #2: 0x013218e4 cc1`attempt_coalesce(_var_map*, ssa_conflicts*, int, int, __sFILE*) + 508
>>>>>   frame #3: 0x013203d0 cc1`coalesce_ssa_name(_var_map*) + 7240
>>>>>   frame #4: 0x012c66b4 cc1`rewrite_out_of_ssa(ssaexpand*) + 2020
>>>>>   frame #5: 0x0082c094 cc1`(anonymous namespace)::pass_expand::execute(function*) + 68
>>>>>   frame #6: 0x00fd6ff0 cc1`execute_one_pass(opt_pass*) + 616
>>>>>   frame #7: 0x00fd8380 cc1`execute_pass_list_1(opt_pass*) + 44
>>>>>   frame #8: 0x00fc6df0 cc1`execute_pass_list(function*, opt_pass*) + 40
>>>>>   frame #9: 0x00880c20 cc1`cgraph_node::expand() + 324
>>>>>   frame #10: 0x00882d10 cc1`symbol_table::compile() + 3860
>>>>>   frame #11: 0x00883454 cc1`symbol_table::finalize_compilation_unit() + 300
>>>>>   frame #12: 0x0120e204 cc1`compile_file() + 236
>>>>>   frame #13: 0x0120d9d4 cc1`toplev::main(int, char**) + 7028
>>>>>   frame #14: 0x01646c28 cc1`main + 48
>>>>>   frame #15: 0x004ad3f0 cc1`__start(argc=31, argv=0xffffadec, env=0xffffae6c, ps_strings=<unavailable>, obj=0x4181e004, cleanup=0x417ed4d8) at crt1_c.c:92:7
>>>>>
>>>>> The armv7 STANDARD_BOOTSTRAP change led to it reaching completion.
>>>>>
>>>>> But the "no bootstrap" issue suggests that system-clang 15.0.7
>>>>> has a problem for armv7 targeting. (I've not seen problems for
>>>>> targeting aarch64 or amd64.)
>>>>>
>>>>> For reference:
>>>>>
>>>>> # uname -apKU
>>>>> FreeBSD CA72_16Gp_ZFS 14.0-CURRENT FreeBSD 14.0-CURRENT #88 main-n261230-e78dc78e517a-dirty: Wed Mar 1 16:17:45 PST 2023 root@CA72_16Gp_ZFS:/usr/obj/BUILDs/main-CA72-nodbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-NODBG-CA72 arm armv7 1400081 1400081
>>>>>
>>>>> via:
>>>>>
>>>>> # poudriere jail -l
>>>>> JAILNAME VERSION      ARCH      METHOD TIMESTAMP           PATH
>>>>> . . .
>>>>> main-CA7 14.0-CURRENT arm.armv7 null   2021-06-27 17:58:33 /usr/obj/DESTDIRs/main-CA7-poud
>>>>> . . .
>>>>>
>>>>> on an aarch64 system, no qemu involved (or even installed):
>>>>>
>>>>> # uname -apKU
>>>>> FreeBSD CA72_16Gp_ZFS 14.0-CURRENT FreeBSD 14.0-CURRENT #88 main-n261230-e78dc78e517a-dirty: Wed Mar 1 16:17:45 PST 2023 root@CA72_16Gp_ZFS:/usr/obj/BUILDs/main-CA72-nodbg-clang/usr/main-src/arm64.aarch64/sys/GENERIC-NODBG-CA72 arm64 aarch64 1400081 1400081
>>>>>
>>>>> (It is a 16 Cortex-A72 HoneyComb.)
>>>>
>>>> Thanks Mark.
>>>>
>>>> I guess cases like this are one of the reasons bootstrapping exists:
>>>> compilation with clang on armv7 probably is not the typical case, so it
>>>> does not work as easily as using GCC on amd64. Good that it works at least
>>>> with bootstrapping.
>>>>
>>>> Now, I would like to suggest a few more experiments:
>>>
>>> Some of the below have a partial answer from the fact that
>>> the FreeBSD package builder system for armv7 is still
>>> running system-clang 14 (main) or 13 (13.1-RELEASE) and
>>> does not yet see the problem. (The build server's actual
>>> kernel vintage should not be an issue to worry about.)
>>>
>>> Nor did it have problems in the past building lang/gcc12.
>>>
>>> This is a new issue.
>>>
>>>> - does the compilation work without bootstrapping with lang/gcc13-devel?
>>
>> I started a lang/gcc13-devel build attempt. It got stuck
>> as well while composing this message. I'll recreate and
>> look at the backtrace later.
>
> The rerun reproduced the problem. The backtrace was:
>
> (lldb) bt
> * thread #1, name = 'cc1', stop reason = signal SIGSTOP
> * frame #0: 0x01f560cc cc1`partition_union + 152
>   frame #1: 0x01a17e20 cc1`var_union(_var_map*, tree_node*, tree_node*) + 104
>   frame #2: 0x019ecaa4 cc1`attempt_coalesce(_var_map*, ssa_conflicts*, int, int, __sFILE*) + 624
>   frame #3: 0x019ea91c cc1`coalesce_ssa_name(_var_map*) + 8100
>   frame #4: 0x019609ac cc1`rewrite_out_of_ssa(ssaexpand*) + 2052
>   frame #5: 0x00a1f334 cc1`(anonymous namespace)::pass_expand::execute(function*) + 68
>   frame #6: 0x01583044 cc1`execute_one_pass(opt_pass*) + 664
>   frame #7: 0x015842bc cc1`execute_pass_list_1(opt_pass*) + 44
>   frame #8: 0x01572368 cc1`execute_pass_list(function*, opt_pass*) + 40
>   frame #9: 0x00a8efa8 cc1`cgraph_node::expand() + 364
>   frame #10: 0x00a91404 cc1`symbol_table::compile() + 3244
>   frame #11: 0x00a91e20 cc1`symbol_table::finalize_compilation_unit() + 300
>   frame #12: 0x0183530c cc1`compile_file() + 236
>   frame #13: 0x01834acc cc1`toplev::main(int, char**) + 6716
>   frame #14: 0x01e7c998 cc1`main + 48
>   frame #15: 0x005a35b0 cc1`__start(argc=31, argv=0xffffabfc, env=0xffffac7c, ps_strings=<unavailable>, obj=0x42094004, cleanup=0x420634d8) at crt1_c.c:92:7
>
> confirming the similar context to the hangup building gcc12.
>
>>>> - does the compilation work without bootstrapping with a higher version
>>>> of clang (we have devel/llvm16 in the ports tree, which tracks a pre-release)?
>
> I'll see about forcing lang/gcc13-devel to use devel/llvm16 instead
> of system-clang. (Not something I've done before, at least that I
> remember.) I do already have devel/llvm16 (rc3) built.

I used lang/gcc13-devel for this, adding (partial patch notation,
whitespace details possibly not preserved in the Email result):

+BUILD_DEPENDS+= ${LOCALBASE}/bin/clang${LLVM_DEFAULT}:devel/llvm${LLVM_DEFAULT}
+CPP= ${LOCALBASE}/bin/clang-cpp${LLVM_DEFAULT}
+CC= ${LOCALBASE}/bin/clang${LLVM_DEFAULT}
+CXX= ${LOCALBASE}/bin/clang++${LLVM_DEFAULT}

to the Makefile to control the clang toolchain used. (In my context,
LLVM_DEFAULT is 16. I watch commands in top as well.)

Still it got stuck looping in a small loop in the same routine in cc1:

(lldb) bt
* thread #1, name = 'cc1', stop reason = signal SIGSTOP
* frame #0: 0x01f2972c cc1`partition_union + 152
  frame #1: 0x019eddf0 cc1`var_union(_var_map*, tree_node*, tree_node*) + 104
  frame #2: 0x019c2ebc cc1`attempt_coalesce(_var_map*, ssa_conflicts*, int, int, __sFILE*) + 624
  frame #3: 0x019c0cdc cc1`coalesce_ssa_name(_var_map*) + 8108
  frame #4: 0x01938b34 cc1`rewrite_out_of_ssa(ssaexpand*) + 2052
  frame #5: 0x00a0f670 cc1`(anonymous namespace)::pass_expand::execute(function*) + 64
  frame #6: 0x01560cec cc1`execute_one_pass(opt_pass*) + 664
  frame #7: 0x01561f58 cc1`execute_pass_list_1(opt_pass*) + 44
  frame #8: 0x0154fff8 cc1`execute_pass_list(function*, opt_pass*) + 40
  frame #9: 0x00a7ebc4 cc1`cgraph_node::expand() + 364
  frame #10: 0x00a80fe0 cc1`symbol_table::compile() + 3236
  frame #11: 0x00a81a40 cc1`symbol_table::finalize_compilation_unit() + 296
  frame #12: 0x01810ef0 cc1`compile_file() + 236
  frame #13: 0x018106b8 cc1`toplev::main(int, char**) + 6728
  frame #14: 0x01e51064 cc1`main + 48
  frame #15: 0x005a36c0 cc1`__start(argc=31, argv=0xffffabbc, env=0xffffac3c, ps_strings=<unavailable>, obj=0x42067004, cleanup=0x420364d8) at crt1_c.c:92:7

This suggests that using devel/llvm15 instead of system-clang 15
would also get the problem: it is not just a system-clang oddity.

>>>> - does the compilation work without bootstrapping on a release version of
>>>> FreeBSD?
>>>
>>> That is an example where the 13.1 based package builds on the
>>> system used for armv7 builds did/does not have the problem.
>>>
>>> Nor do the main system-clang 14 based builds.
>>>
>>>> - does the compilation work without bootstrapping using Linux instead
>>>> of FreeBSD?
>>>
>>> I'm not well set up for that kind of experiment.
>>>
>>>> You might want to open a bug report, but you should try to understand
>>>> first what is the component that causes the issue and if replacing anything
>>>> with something newer (where the bug might be already fixed) or with
>>>> something supported (since FreeBSD CURRENT is under development, we
>>>> might have regressions) solves the problem.
>>>
>>> It is already known to be a regression compared to
>>> system-clang 14 and 13 based builds. [. . .]
>>>
>>>> If you find that the cause is in the FreeBSD GCC port(s), then please
>>>> open a bug report on bugzilla so that I can keep track of it and other
>>>> users with the same problem can find it there as well.

As it stands, the problem tracks the use of clang 15+ (vs. not) for
the non-bootstrap form of building lang/gcc12+ for armv7. Nothing
points to lang/gcc12+ themselves as the source of the problem: in my
context the evidence points to clang 15+ instead (system-clang15+ and
devel/llvm15+).

But lang/gcc12+ does provide what turns out to be a workaround:
STANDARD_BOOTSTRAP use for armv7 (with the unfortunate
time/resource-use consequences).

Of course, the evidence that I've got is far from being a small
test case showing the problem. Finding such looks to be non-trivial
and no small test case is likely to be produced anytime soon.

>> When I can, I reference FreeBSD package builder results
>> instead of build attempts from my personal environment.
>> (But I'd previously used main with system-clang 14 and
>> still use releng/13.1 with its system-clang 13 and had
>> no problems, just like the armv7 package builder.)
>>
>> I've not been building any lang/gcc* for * < 12 in
>> a long time. For all I know, gcc11 and before could
>> run into the problem. At this stage, the armv7
>> package builder never uses system-clang 15 and so
>> gives no evidence for any lang/gcc* .
>

===
Mark Millard
marklmi at yahoo.com
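
For context on the routine named in the backtraces: partition_union in GCC comes from libiberty and merges two equivalence classes, which var_union uses while coalescing SSA names. What follows is a minimal sketch in C of that style of structure, assuming classes kept as circular linked lists. It is not the GCC source, and every name in it (struct elem, part_init, part_union, ring_is_closed) is hypothetical. It only illustrates how a short relabeling loop with no calls in it can spin forever if the ring it walks never leads back to its starting element. The thread does not establish which loop or which data state is actually involved in the hang.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified union-find element. Each class is a
   circular singly linked list of its members. */
struct elem {
    int class_id;       /* index of the canonical element of the class */
    unsigned count;     /* class size, kept on the canonical element */
    struct elem *next;  /* next member of the same class (circular) */
};

/* Put each of the n elements into its own singleton class. */
void part_init(struct elem *e, int n)
{
    for (int i = 0; i < n; i++) {
        e[i].class_id = i;
        e[i].count = 1;
        e[i].next = &e[i];
    }
}

/* Merge the classes containing a and b; return the canonical index. */
int part_union(struct elem *e, int a, int b)
{
    assert(e != NULL);
    int cls = e[a].class_id;
    if (cls == e[b].class_id)
        return cls;                 /* already in the same class */

    /* Keep a in the larger class so the shorter ring is relabeled. */
    if (e[e[a].class_id].count < e[e[b].class_id].count) {
        int tmp = a;
        a = b;
        b = tmp;
        cls = e[a].class_id;
    }

    struct elem *e1 = &e[a];
    struct elem *e2 = &e[b];
    e[cls].count += e[e2->class_id].count;

    /* Relabel every member of b's ring. This loop ends only when
       following next pointers from e2 leads back to e2. */
    e2->class_id = cls;
    for (struct elem *p = e2->next; p != e2; p = p->next)
        p->class_id = cls;

    /* Splice the two rings together by swapping one link in each. */
    struct elem *old_next = e1->next;
    e1->next = e2->next;
    e2->next = old_next;
    return cls;
}

/* Bounded check of the ring invariant: returns 1 if following next
   from start reaches start again within limit steps, else 0. */
int ring_is_closed(const struct elem *start, int limit)
{
    const struct elem *p = start->next;
    for (int steps = 0; steps < limit; steps++) {
        if (p == start)
            return 1;
        p = p->next;
    }
    return 0;
}
```

In this sketch, ring_is_closed is a diagnostic aid only: on intact data it returns 1 for every element, whereas a link that points back into the middle of a ring makes it return 0, which is the kind of state in which the unbounded loop in part_union would never exit.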