Date: Wed, 2 Mar 2016 20:50:46 -0800 From: Mark Millard <markmi@dsl-only.net> To: Roman Divacky <rdivacky@vlakno.cz> Cc: FreeBSD PowerPC ML <freebsd-ppc@freebsd.org>, FreeBSD Toolchain <freebsd-toolchain@freebsd.org> Subject: Re: clang 3.8.0 can mess up __builtin_dwarf_cfa (), at least for TARGET_ARCH=armv6, powerpc, powerpc64: a bug 207325 update Message-ID: <27FEB264-0A3E-42DF-A549-1E54ED489CEC@dsl-only.net> In-Reply-To: <A395CFC8-A2F9-4239-8BA2-283B8CB99D59@dsl-only.net> References: <83B8741C-B4C9-4EFB-A3B4-473F8F165984@dsl-only.net> <80EA4460-E842-46F5-B006-2A83FBBEE845@dsl-only.net> <F23112FF-C417-4757-96FF-4E93C259DC9D@dsl-only.net> <366B67F9-6A14-4906-8545-1B57A3FF53B8@dsl-only.net> <20160228083934.GA60222@vlakno.cz> <22AD8E4F-B3F2-455E-8EBE-2C70E428D44A@dsl-only.net> <462637FA-6FD2-4421-8F0C-0DF565E94FA6@dsl-only.net> <8D92D76D-9FBC-45F6-A20D-0C2A8B38D291@dsl-only.net> <33D5358F-6783-44C1-8155-86FB93CABE6F@dsl-only.net> <9DE18EC5-3C16-4B17-A0D0-5B5386961627@dsl-only.net> <C2B0D00F-1DBC-4546-98B6-C95D759D12CC@dsl-only.net> <A395CFC8-A2F9-4239-8BA2-283B8CB99D59@dsl-only.net>
next in thread | previous in thread | raw e-mail | index | archive | help
I now have the explanation of the clang 3.8.0 buildworld's libcxxrt = __cxa_throw related code not detecting the catch in main: main is = skipped because of mishandling r31 references in the eh.frame = information. The problem is distinct from the other reported problem(s). libcxxrt ends up with (dwarfdump -v -v -F output for __cxa_throw as = compiled by clang 3.8.0): > < 0><0x00010620:0x00010794><__cxa_throw><fde offset 0x000006c0 = length: 0x00000028><eh aug data len 0x0> > 0x00010620: <off cfa=3D00(r1) >=20 > 0x00010634: <off cfa=3D48(r1) > <off r30=3D-8(cfa) > <off = r31=3D-4(cfa) > <off r65=3D04(cfa) >=20 > 0x00010638: <off cfa=3D48(r31) > <off r25=3D-28(cfa) > <off = r26=3D-24(cfa) > <off r27=3D-20(cfa) > <off r28=3D-16(cfa) > <off = r29=3D-12(cfa) > <off r30=3D-8(cfa) > <off r31=3D-4(cfa) > <off = r65=3D04(cfa) >=20 > fde section offset 1728 0x000006c0 cie offset for fde: 1732 = 0x000006c4 > 0 DW_CFA_advance_loc 20 (5 * 4) > 1 DW_CFA_def_cfa_offset 48 > 3 DW_CFA_offset r31 -4 (1 * -4) > 5 DW_CFA_offset r30 -8 (2 * -4) > 7 DW_CFA_offset_extended_sf r65 4 (-1 * -4) > 10 DW_CFA_advance_loc 4 (1 * 4) > 11 DW_CFA_def_cfa_register r31 > 13 DW_CFA_offset r25 -28 (7 * -4) > 15 DW_CFA_offset r26 -24 (6 * -4) > 17 DW_CFA_offset r27 -20 (5 * -4) > 19 DW_CFA_offset r28 -16 (4 * -4) > 21 DW_CFA_offset r29 -12 (3 * -4) > 23 DW_CFA_offset r30 -8 (2 * -4) > 25 DW_CFA_nop > 26 DW_CFA_nop Note the cfa and r31 references in: > 0x00010634: <off cfa=3D48(r1) > . . . <off r31=3D-4(cfa) > . . . > 0x00010638: <off cfa=3D48(r31) > . . . <off r31=3D-4(cfa) > . . . The use of r31 to define cfa is from (in part) the clang++ 3.8.0 code = generation using r31 as a frame pointer in additino to r1 as the stack = pointer. The matching actual sequence of operations listed above is: > 1 DW_CFA_def_cfa_offset 48 > 3 DW_CFA_offset r31 -4 (1 * -4) > . . . > 11 DW_CFA_def_cfa_register r31 The "1 DW_CFA_def_cfa_offset 48" just notes that r1 (the stack pointer) = was decremented by 48 by the prior instruction so 48 needs to be added = to the new r1 value to reference the same _Unwind_Context cfa value as = the prior "<off cfa=3D00(r1) >" status does. The "3 DW_CFA_offset r31 -4 (1 * -4)" was generated because (soon old) = r31 value was saved at address cfa-4 ("<off r31=3D-4(cfa) >"). This = address to access what will be the old/saved r31 value is recorded in = the _Unwind_Context reg[31]. The "11 DW_CFA_def_cfa_register r31" was generated because the prior = instruction r31 was updated to be a copy of r1 for use as a frame = pointer. Note that such does not change the _Unwind_Context cfa value. = At this stage r1=3Dr31 and 48(r1)=3D48(r31) and such will hold until = either r1 or r31 is changed in the routine (if either is). The repeat of "<off r31=3D-4(cfa) >" on the "0x00010638: <off = cfa=3D48(r31) >" line indicates that there is no change to where/how to = find the pointer to the old/saved r31 value: no new DW_CFA_offset r31 = "instruction" for interpretation. [Note the messy mix of different r31's. gcc 4.2.1 does not (normally?) = generate such TARGET_ARCH=3Dpowerpc code but clang++ 3.8.0 normally = does. Thus clang++ touches an error that g++ 4.2.1 and the like normally = do not.] Unfortunately the above is not the interpretation given by the = interpreter in libgcc_s: "11 DW_CFA_def_cfa_register r31" instead accesses the old/saved r31 = value via the pointer in _Unwind_Context reg[31] and then applies the = offset 48 to that value. The result is the wrong cfa value (which should not have changed at all) = and all else is messed up after that. Since the old r31 value is an = older Frame Pointer value, one frame has also been "skipped" in the = process. (But the offset 48's involvement from there can produce pure = junk for the cfa value that results.) Code that sticks to cfa=3DOFFSET(r1) for the cfa will not see this error = in the .eh_frame information's interpretation. As for the lack of save/restore of some registers by = _Unwind_RaiseException as generated by clang 3.8.0: I did not earlier show how the the code involved picks to try to store = to r3's (non-existent) save/restore place (as an example). I show that = below. The code is in #1 of: > (gdb) bt > #0 _Unwind_SetGR (context=3D<optimized out>, index=3D<optimized out>, = val=3D1105272880) at = /usr/src/gnu/lib/libgcc/../../../contrib/gcc/unwind-dw2.c:220 > #1 0x41b7139c in __cxxabiv1::__gxx_personality_v0 (version=3D<optimized= out>, actions=3D6, exception_class=3D<optimized out>, = ue_header=3D0x41e12030, context=3D0xffffd570) > at = /usr/src/gnu/lib/libsupc++/../../../contrib/libstdc++/libsupc++/eh_persona= lity.cc:681 > #2 0x419915f8 in _Unwind_RaiseException_Phase2 (exc=3D<optimized = out>, context=3D<optimized out>) at = /usr/src/gnu/lib/libgcc/../../../contrib/gcc/unwind.inc:66 > #3 0x419911c0 in _Unwind_RaiseException (exc=3D<optimized out>) at = /usr/src/gnu/lib/libgcc/../../../contrib/gcc/unwind.inc:135 > #4 0x41b7075c in __cxxabiv1::__cxa_throw (obj=3D<optimized out>, = tinfo=3D<optimized out>, dest=3D<optimized out>) at = /usr/src/gnu/lib/libsupc++/../../../contrib/libstdc++/libsupc++/eh_throw.c= c:71 > #5 0x01800920 in main () at exception_test.cpp:5 and looks like: > 678 /* For targets with pointers smaller than the word size, we = must extend the > 679 pointer, and this extension is target dependent. */ > 680 _Unwind_SetGR (context, __builtin_eh_return_data_regno (0), > 681 __builtin_extend_pointer (ue_header)); > 682 _Unwind_SetGR (context, __builtin_eh_return_data_regno (1), > 683 handler_switch_value); > 684 _Unwind_SetIP (context, landing_pad); clang 3.8.0 agrees with gcc/g++ that __builtin_eh_return_data_regno (0) = translates to the (int) value 3 (referencing r3) --despite clang 3.8.0 = not providing anything that dwarfdump -v -v -F shows as saving/restoring = r3 for _Unwind_RaiseException. (Similarly for some other such = registers.) =3D=3D=3D Mark Millard markmi at dsl-only.net On 2016-Feb-29, at 11:10 PM, Mark Millard <markmi at dsl-only.net> = wrote: [TARGET_ARCH=3Dpowerpc context for all the evidence.] I found another clang++ 3.8.0 vs. g++ 4.2.1 difference where the system = libgcc_s depends on how it works for g++ 4.2.1 when clang 3.8.0 does not = work the same way: _Unwind_RaiseException is special in a way that makes it save and = restore lots of registers it does not directly use. (I'm not sure what = triggers having so many registers saved/restored.) But gcc/g++ 4.2.1 saves and restores more registers than clang/clang++ = 3.8.0 does. That in turn leaves .eh_frame information for more registers = for _Unwind_RaiseException for the gcc/g++ 4.2.1 context. _Unwind_RaiseException from a clang++ 3.8.0 build of libgcc_s does not = have save/restore for r3, r4, r5, r6, "r70" (from mfcr, dwarfdump = notation). The C++ exception handling library code in libgcc_s depends = on r3 (as one example). The pointer for r3 ends up being 0x0 and that = causes a crash in examples that get that far using the system's = libgcc_s. _Unwind_RaiseException from a g++ 4.2.1 build of libgcc_s has = save/restore for r3, r4, r5, r6, "r70" (from mfcr). Later below I list one form of the specific evidence for the = differences. It may be that this and the __builtin_dwarf_cfa() "fix" covers all the = problems for when libstdc++/libsupc++ are involved with the system = libgcc_s instead of libc++/libcxxrt being involved. In my view the registers to save/restore in routines like = _Unwind_RaiseException should be considered as part of the overall ABI = criteria. Under the rule "the TARGET_ARCH=3Dpowerpc ABI is always such = that it is gcc/g++ 4.2.1 compatible", I take it that clang 3.8.0 is = wrong for FreeBSD TARGET_ARCH=3Dpowerpc here: Another ABI violation. TARGET_ARCH=3Dpowerpc64 and possibly others could have the same sort of = issue. I've never gotten a clang/clang++ based TARGET_ARCH=3Dpowerpc64 = as far as a complete buildworld. And for now I'm more interested in = finding new types of errors for TARGET_ARCH=3Dpowerpc rather then what = range of TARGET_ARCH's get a specific clang 3.8.0 problem. Separately from the above I've shown that copying the following 3 files = to a gcc 4.2.1 buildworld/buildkernel TARGET_ARCH=3Dpowerpc context = allows exception_test.clang++380.powerpc to run just fine: > exception_test.clang++380.powerpc > /usr/lib/libc++.so.1 > /lib/libcxxrt.so.1 (debug files for the libraries also copied) That leaves the following libraries listed by ldd as being from the gcc = 4.2.1 buildworld: > /lib/libm.so.5 > /lib/libc.so.7 > /lib/libgcc_s.so.1 > # ldd exception_test.clang++380.powerpc > exception_test.clang++380.powerpc: > libc++.so.1 =3D> /usr/lib/libc++.so.1 (0x4183e000) > libcxxrt.so.1 =3D> /lib/libcxxrt.so.1 (0x41917000) > libm.so.5 =3D> /lib/libm.so.5 (0x41942000) > libc.so.7 =3D> /lib/libc.so.7 (0x41979000) > libgcc_s.so.1 =3D> /lib/libgcc_s.so.1 (0x41b1d000) exception_test.clang++380.powerpc used with the clang 3.8.0 buildworld = and its libgcc_s shows different behavior not likely to be explained by = the _Unwind_RaiseException register save/restore differences. (The lack = of some saves/restores would still be a problem if I get = exception_test.clang++380.powerpc to get that far before doing something = odd.) I'm still trying to get evidence of the specific low-level problem for = exception_test.clang++380.powerpc. It may be some time before I figure = out anything useful. Using some dwarfdump -v -v -F output for the evidence of register = save/restore differences. . . _Unwind_RaiseException from a g++ 4.2.1 build of libgcc_s has r3, r4, = r5, r6, r70 (from mfcr). The library depends on r3 (as one example). > fde section offset 1104 0x00000450 cie offset for fde: 1108 0x00000454 > 0 DW_CFA_advance_loc 8 (8 * 1) > 1 DW_CFA_def_cfa_offset 3024 > 4 DW_CFA_advance_loc1 156 > 6 DW_CFA_offset r4 -232 (58 * -4) > 8 DW_CFA_offset r3 -236 (59 * -4) > 10 DW_CFA_offset r28 -160 (40 * -4) > 12 DW_CFA_offset r27 -164 (41 * -4) > 14 DW_CFA_offset r26 -168 (42 * -4) > 16 DW_CFA_offset r25 -172 (43 * -4) > 18 DW_CFA_offset r24 -176 (44 * -4) > 20 DW_CFA_offset r23 -180 (45 * -4) > 22 DW_CFA_offset r22 -184 (46 * -4) > 24 DW_CFA_offset r21 -188 (47 * -4) > 26 DW_CFA_offset r20 -192 (48 * -4) > 28 DW_CFA_offset r19 -196 (49 * -4) > 30 DW_CFA_offset r18 -200 (50 * -4) > 32 DW_CFA_offset r17 -204 (51 * -4) > 34 DW_CFA_offset r16 -208 (52 * -4) > 36 DW_CFA_offset r15 -212 (53 * -4) > 38 DW_CFA_offset r14 -216 (54 * -4) > 40 DW_CFA_offset r63 -8 (2 * -4) > 42 DW_CFA_offset r62 -16 (4 * -4) > 44 DW_CFA_offset r61 -24 (6 * -4) > 46 DW_CFA_offset r60 -32 (8 * -4) > 48 DW_CFA_offset r59 -40 (10 * -4) > 50 DW_CFA_offset r58 -48 (12 * -4) > 52 DW_CFA_offset r57 -56 (14 * -4) > 54 DW_CFA_offset r56 -64 (16 * -4) > 56 DW_CFA_offset r55 -72 (18 * -4) > 58 DW_CFA_offset r54 -80 (20 * -4) > 60 DW_CFA_offset r53 -88 (22 * -4) > 62 DW_CFA_offset r52 -96 (24 * -4) > 64 DW_CFA_offset r51 -104 (26 * -4) > 66 DW_CFA_offset r50 -112 (28 * -4) > 68 DW_CFA_offset r49 -120 (30 * -4) > 70 DW_CFA_offset r48 -128 (32 * -4) > 72 DW_CFA_offset r47 -136 (34 * -4) > 74 DW_CFA_offset r46 -144 (36 * -4) > 76 DW_CFA_register r70 =3D r12 > 79 DW_CFA_offset_extended_sf r65 4 (-1 * -4) > 82 DW_CFA_advance_loc 32 (32 * 1) > 83 DW_CFA_offset r5 -228 (57 * -4) > 85 DW_CFA_offset r31 -148 (37 * -4) > 87 DW_CFA_offset r30 -152 (38 * -4) > 89 DW_CFA_offset r29 -156 (39 * -4) > 91 DW_CFA_offset_extended r70 -220 (55 * -4) > 94 DW_CFA_offset r6 -224 (56 * -4) > 96 DW_CFA_nop > 97 DW_CFA_nop > 98 DW_CFA_nop _Unwind_RaiseException from clang++ 3.8.0 build of libgcc_s does not = have has r3, r4, r5, r6, r70 (from mfcr). The library depends on r3 (as = one example). > fde section offset 692 0x000002b4 cie offset for fde: 696 0x000002b8 > 0 DW_CFA_advance_loc 20 (5 * 4) > 1 DW_CFA_def_cfa_offset 2992 > 4 DW_CFA_offset r31 -148 (37 * -4) > 6 DW_CFA_offset r30 -152 (38 * -4) > 8 DW_CFA_offset_extended_sf r65 4 (-1 * -4) > 11 DW_CFA_advance_loc 4 (1 * 4) > 12 DW_CFA_def_cfa_register r31 > 14 DW_CFA_offset r14 -216 (54 * -4) > 16 DW_CFA_offset r15 -212 (53 * -4) > 18 DW_CFA_offset r16 -208 (52 * -4) > 20 DW_CFA_offset r17 -204 (51 * -4) > 22 DW_CFA_offset r18 -200 (50 * -4) > 24 DW_CFA_offset r19 -196 (49 * -4) > 26 DW_CFA_offset r20 -192 (48 * -4) > 28 DW_CFA_offset r21 -188 (47 * -4) > 30 DW_CFA_offset r22 -184 (46 * -4) > 32 DW_CFA_offset r23 -180 (45 * -4) > 34 DW_CFA_offset r24 -176 (44 * -4) > 36 DW_CFA_offset r25 -172 (43 * -4) > 38 DW_CFA_offset r26 -168 (42 * -4) > 40 DW_CFA_offset r27 -164 (41 * -4) > 42 DW_CFA_offset r28 -160 (40 * -4) > 44 DW_CFA_offset r29 -156 (39 * -4) > 46 DW_CFA_offset r30 -152 (38 * -4) > 48 DW_CFA_offset r31 -148 (37 * -4) > 50 DW_CFA_offset r46 -144 (36 * -4) > 52 DW_CFA_offset r47 -136 (34 * -4) > 54 DW_CFA_offset r48 -128 (32 * -4) > 56 DW_CFA_offset r49 -120 (30 * -4) > 58 DW_CFA_offset r50 -112 (28 * -4) > 60 DW_CFA_offset r51 -104 (26 * -4) > 62 DW_CFA_offset r52 -96 (24 * -4) > 64 DW_CFA_offset r53 -88 (22 * -4) > 66 DW_CFA_offset r54 -80 (20 * -4) > 68 DW_CFA_offset r55 -72 (18 * -4) > 70 DW_CFA_offset r56 -64 (16 * -4) > 72 DW_CFA_offset r57 -56 (14 * -4) > 74 DW_CFA_offset r58 -48 (12 * -4) > 76 DW_CFA_offset r59 -40 (10 * -4) > 78 DW_CFA_offset r60 -32 (8 * -4) > 80 DW_CFA_offset r61 -24 (6 * -4) > 82 DW_CFA_offset r62 -16 (4 * -4) > 84 DW_CFA_offset r63 -8 (2 * -4) > 86 DW_CFA_nop =3D=3D=3D Mark Millard markmi at dsl-only.net On 2016-Feb-29, at 3:20 AM, Mark Millard <markmi at dsl-only.net> wrote: TARGET_ARCH=3Dpowerpc: Using Frame Depth 1 in "case = Intrinsic::eh_dwarf_cfa" (and Offset 0 in "case = Builtin::BI__builtin_dwarf_cfa") for PPCTargetLowering::LowerFRAMEADDR = related use has allowed getting into _Unwind_RaiseException_Phase2 and = __cxxabiv1::__gxx_personality_v0. The example is the 8 line example = compiled under g++ 4.2.1 but then used under a buildworld that was built = with clang 3.8.0: # ldd exception_test.g++421.powerpc=20 exception_test.g++421.powerpc: libstdc++.so.6 =3D> /usr/local/lib/gcc49/libstdc++.so.6 = (0x41840000) libm.so.5 =3D> /lib/libm.so.5 (0x4196a000) libgcc_s.so.1 =3D> /lib/libgcc_s.so.1 (0x419a1000) libc.so.7 =3D> /lib/libc.so.7 (0x419c0000) _Unwind_RaiseException_Phase2 is well past the point of the failure and = crash from having Frame Depth 0 instead. It is getting a SEGV during the _Unwind_SetGR called via: 710 /* For targets with pointers smaller than the word size, we = must extend the 711 pointer, and this extension is target dependent. */ 712 _Unwind_SetGR (context, __builtin_eh_return_data_regno (0), 713 __builtin_extend_pointer (ue_header)); for: _Unwind_SetGR (context=3D0xffffd570, index=3D3, val=3D1105272896) at = /usr/src/gnu/lib/libgcc/../../../contrib/gcc/unwind-dw2.c:207 context->reg[3] is 0x0 and so its use in the following gets the SEGV. 217 ptr =3D context->reg[index]; 218=09 219 if (size =3D=3D sizeof(_Unwind_Ptr)) 220 * (_Unwind_Ptr *) ptr =3D val; I'm not going to try to analyze the source of this before getting some = sleep. For the 8 line program being compiled by clang++ 3.8.0 instead the = results are different than the above and than the original behavior: The = program does not crash abnormally but also does not find the catch = clause that it should. The std::terminate gets its normal SIGABRT = instead of an earlier SEGV. Again I'm not going to try to analyze the details before getting some = sleep. But I will mention that I've also already submitted a report that = libgcc_s does not completely implement DW_CFA_remember_state and = DW_CFA_restore_state and that the code generated on powerpc64 touches = the defect and so ends up with misbehavior. These might be similar. =3D=3D=3D Mark Millard markmi at dsl-only.net On 2016-Feb-28, at 10:13 PM, Mark Millard <markmi at dsl-only.net> = wrote: Back to the "case Builtin::BI__builtin_dwarf_cfa" and = "PPCTargetLowering::LowerFRAMEADDR" context: I made the wrong change and need to retry. The detail. . . Passing a 1 through instead of zero did not do what I expected to the = code generated. Instead it added one instruction: addi r3,r3,1 resulting in (objdump -d --prefix-addresses on the .o): > Disassembly of section .text: > 00000000 <_Z1fv> mflr r0 > 00000004 <_Z1fv+0x4> stw r31,-4(r1) > 00000008 <_Z1fv+0x8> stw r0,4(r1) > 0000000c <_Z1fv+0xc> stwu r1,-16(r1) > 00000010 <_Z1fv+0x10> mr r31,r1 > 00000014 <_Z1fv+0x14> mr r3,r31 > 00000018 <_Z1fv+0x18> addi r3,r3,1 > 0000001c <_Z1fv+0x1c> bl 0000001c <_Z1fv+0x1c> > 00000020 <_Z1fv+0x20> addi r1,r1,16 > 00000024 <_Z1fv+0x24> lwz r0,4(r1) > 00000028 <_Z1fv+0x28> lwz r31,-4(r1) > 0000002c <_Z1fv+0x2c> mtlr r0 > 00000030 <_Z1fv+0x30> blr In other words: it added the 1 as a byte offset like the comments that I = thought were wrong said. Since it does not appear that PPCTargetLowering::LowerFRAMEADDR would do = that with a 1 I conclude that PPCTargetLowering::LowerFRAMEADDR is not = involved with that figure. So looking around. . . /usr/src/contrib/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp = has: > case Intrinsic::eh_dwarf_cfa: { > SDValue CfaArg =3D DAG.getSExtOrTrunc(getValue(I.getArgOperand(0)), = sdl, > = TLI.getPointerTy(DAG.getDataLayout())); > SDValue Offset =3D DAG.getNode(ISD::ADD, sdl, > CfaArg.getValueType(), > DAG.getNode(ISD::FRAME_TO_ARGS_OFFSET, = sdl, > CfaArg.getValueType()), > CfaArg); > SDValue FA =3D DAG.getNode( > ISD::FRAMEADDR, sdl, TLI.getPointerTy(DAG.getDataLayout()), > DAG.getConstant(0, sdl, TLI.getPointerTy(DAG.getDataLayout()))); > setValue(&I, DAG.getNode(ISD::ADD, sdl, FA.getValueType(), > FA, Offset)); > return nullptr; And so sure enough the argument is an offset as used by this code. And what I call the frame depth is plugged in as 0 here via = "DAG.getConstant(0, sdl, TLI.getPointerTy(DAG.getDataLayout()))". The = offset is applied after getting the frame address. So I get to revert my change and try again changing the above call to = use a 1 instead. It does not look like this changes the time frames in my history notes: = it has been using frame depth zero since V2.7 when "case = Builtin::BI__builtin_dwarf_cfa" appeared. In general my overall questions about the target triple controlling = which value to use (in DAG.getConstant hrere) still apply: It is not = obvious that something that has been using frame depth 0 since V2.7 can = be immediately changed to frame depth 1 for all contexts. =3D=3D=3D Mark Millard markmi at dsl-only.net On 2016-Feb-28, at 8:49 PM, Mark Millard <markmi at dsl-only.net> wrote: Here is what the "ABI for the ARM 32 32-bit Architecture" "DWARF for the = ARM Architecture" document says about the CFA: > 3.4 Canonical Frame Address >=20 > The term Canonical Frame Address (CFA) is defined in [GDWARF], =C2=A76.4= , Call Frame Information. This ABI adopts the typical definition of CFA = given there. > =EF=81=AF The CFA is the value of the stack pointer (r13) at the call = site in the previous frame. This, with the armv6 code I've shown via "objdump -d", indicates that = for armv6 clang++'s __builtin_dwarf_cfa() return value is not the same = value as the official ARM ABI indicates. It also indicates that what g++ = returns does match the official ARM ABI. =3D=3D=3D Mark Millard markmi at dsl-only.net On 2016-Feb-28, at 5:40 PM, Mark Millard <markmi at dsl-only.net> wrote: Looking some at clang/llvm history shows releases/branches: V2.6 did not have "case Builtin::BI__builtin_dwarf_cfa". V2.7 did have "case Builtin::BI__builtin_dwarf_cfa" but = PPCTargetLowering::LowerFRAMEADDR ignored the argument. V2.8 had PPCTargetLowering::LowerFRAMEADDR using its argument (as a = frame depth, not a byte offset). The apparently incorrect (not matching g++ frame depth returned) = comments, naming, and value (when viewed as a frame depth) for "case = Builtin::BI__builtin_dwarf_cfa" started in V2.7 and continues to this = day. That is a lot of time for various dependencies on the clang = (mis)definition to accumulate across everything that uses clang. It may be that limiting any change to specific TARGET_ARCH's for FreeBSD = is appropriate. FreeBSD would likely need to list the appropriate = TARGET_ARCH's, likely including powerpc and powerpc64 since clang before = 3.8.0 was not in use for buildworld for powerpc/powerpc64. Still this may have consequences for ports that use clang and might = reference clang-compiled __builtin_dwarf_cfa() use, possibly from a = lang/clang* instead of the system clang. My guess is that the = interoperability with being able to use g++ vintages as well may lead to = (modern?) lang/clang*'s tracking the fix for FreeBSD TARGET_ARCH's that = are fixed. I can ignore all this and build a system based on using 1 as the frame = depth just to test, just as a matter of proof of concept for powerpc. = (Powerpc64 hits a system libgcc_s defect and so needs more before C++ = exceptions can be tested overall.) =3D=3D=3D Mark Millard markmi at dsl-only.net On 2016-Feb-28, at 2:20 PM, Mark Millard <markmi at dsl-only.net> wrote: In /usr/src/contrib/llvm/tools/clang/lib/CodeGen/CGBuiltin.cpp there is: > case Builtin::BI__builtin_dwarf_cfa: { > // The offset in bytes from the first argument to the CFA. > // > // Why on earth is this in the frontend? Is there any reason at > // all that the backend can't reasonably determine this while > // lowering llvm.eh.dwarf.cfa()? > // > // TODO: If there's a satisfactory reason, add a target hook for > // this instead of hard-coding 0, which is correct for most targets. > int32_t Offset =3D 0; >=20 > Value *F =3D CGM.getIntrinsic(Intrinsic::eh_dwarf_cfa); > return RValue::get(Builder.CreateCall(F, > llvm::ConstantInt::get(Int32Ty, = Offset))); > } I would have guessed that the internal argument was how many frames away = on the stack to go from what 0 produces (high er address direction). = g++'s __builtin_dwarf_cfa() returns the address for the next frame = compared to clang 3.8.0 (higher address direction). I'd call that more of a frame depth than an offset. .eh_frame and its = cfa material use offset terminology as byte offsets. And the comments = above talk of an offset in bytes --but "next frame" distances in bytes = would not be constant. Looking at a use of LowerFRAMEADDR in a LowerRETURNADDR, for example, > SDValue ARMTargetLowering::LowerRETURNADDR(SDValue Op, SelectionDAG = &DAG) const{ > . . . > EVT VT =3D Op.getValueType(); > SDLoc dl(Op); > unsigned Depth =3D = cast<ConstantSDNode>(Op.getOperand(0))->getZExtValue(); > if (Depth) { > SDValue FrameAddr =3D LowerFRAMEADDR(Op, DAG); > SDValue Offset =3D DAG.getConstant(4, dl, MVT::i32); > return DAG.getLoad(VT, dl, DAG.getEntryNode(), > DAG.getNode(ISD::ADD, dl, VT, FrameAddr, Offset), > MachinePointerInfo(), false, false, false, 0); > } > . . . > } (PPCTargetLowering::LowerRETURNADDR is similar.)=20 This has a mix of Depth and Offset overall, with the depth going to = LowerFRAMEADDR via Op but Offset used later in GAG.getLoad via adding to = the FrameAddr. This would lead me to guess that the terminology and comments in "case = Builtin::BI__builtin_dwarf_cfa" are wrong and that the = Builder.CreateCall has been given a frame depth, not an offset. > SDValue PPCTargetLowering::LowerFRAMEADDR(SDValue Op, > SelectionDAG &DAG) const { > SDLoc dl(Op); > unsigned Depth =3D = cast<ConstantSDNode>(Op.getOperand(0))->getZExtValue(); >=20 > MachineFunction &MF =3D DAG.getMachineFunction(); > MachineFrameInfo *MFI =3D MF.getFrameInfo(); > MFI->setFrameAddressIsTaken(true); >=20 > EVT PtrVT =3D = DAG.getTargetLoweringInfo().getPointerTy(MF.getDataLayout()); > bool isPPC64 =3D PtrVT =3D=3D MVT::i64; >=20 > // Naked functions never have a frame pointer, and so we use r1. For = all > // other functions, this decision must be delayed until during PEI. > unsigned FrameReg; > if (MF.getFunction()->hasFnAttribute(Attribute::Naked)) > FrameReg =3D isPPC64 ? PPC::X1 : PPC::R1; > else > FrameReg =3D isPPC64 ? PPC::FP8 : PPC::FP; >=20 > SDValue FrameAddr =3D DAG.getCopyFromReg(DAG.getEntryNode(), dl, = FrameReg, > PtrVT); > while (Depth--) > FrameAddr =3D DAG.getLoad(Op.getValueType(), dl, DAG.getEntryNode(), > FrameAddr, MachinePointerInfo(), false, false, > false, 0); > return FrameAddr; > }=20 Again Op is called Depth --and is used to get from one frame pointer = value to the next: a frame depth. To match g++ 4.2.1 the value to use is 1 for depth. Overall, at least applied to powerpc/powerpc64: > . . . > // TODO: If there's a satisfactory reason, add a target hook for > // this instead of hard-coding 0, which is correct for most targets. > int32_t Offset =3D 0; I think the comments in this area are actually talking about byte = offsets, not depths and are just wrong. A byte offset of 0 would make = sense relative to hardcoding but the value is actually a frame depth --a = very different context. I think that the naming of the variable is just wrong: it should be = called Depth. And I think that the comments should have originally talked about using = a hard coded Depth 1 to match g++ and preexisting library usage of = __builtin_dwarf_cfa() for C++ and other exception handling (.eh_frame = usage). ANd the code should avhe matched. As far as I can tell this error in the "case = Builtin::BI__builtin_dwarf_cfa:" design was not caught until now. But since the mess has been around a longtime just switching everything = to match the g++ context now likely has its own problems. (Not just a = FreeBSD issue.) For FreeBSD I expect that Depth 1 could be used for powerpc and = powerpc64: if it has been wrong for a long time (not just 3.8.0) then = powerpc/powerpc64 FreeBSD has likely been broken for C++ exception = handling when buildworld was via clang for just as long. (But only = recently has clang gotten this near working for buildworld for at least = one of powerpc/powerpc64. Currently powerpc is closer, given that = powerpc64 does not support softfloat last I knew.) For other TARGET_ARCH's: For FreeBSD armv6 it is less clear to me: it is based on clang as it is = and I do not know what C++ exception ABI it uses. If a modern gcc/g++ = buildworld had problems with C++ exception handling, does anything need = to be done about it? For FreeBSD armv6 and the like: is xtoolchain like = support important? FreeBSD may have similar questions for other TARGET_ARCH's. =3D=3D=3D Mark Millard markmi at dsl-only.net On 2016-Feb-28, at 2:46 AM, Mark Millard <markmi at dsl-only.net> wrote: On 2016-Feb-28, at 12:39 AM, Roman Divacky <rdivacky at vlakno.cz> = wrote: >=20 > Mark, >=20 > __builtin_dwarf_cfa() is lowered in clang to llvm intrinsic = eh_dwarf_cfa. > There's a depth argument (which defaults to 0, saying it's correct for = most > targets).=20 >=20 > Then the intrinsic gets lowered in SelectionDAG using > PPCTargetLowering::LowerFRAMEADDR() >=20 >=20 > Can you check that 1) the depth should be 0 for ppc64/ppc32 2) that > LowerFRAMEADDR() does something sensible? >=20 > There's a loop depth-times, so I wonder if that makes a difference. >=20 > Thanks, Roman "Lowered"? I'm not familiar with the clang code base or its terminology. = Handed off to a lower level interface, may be? As near as I can tell libgcc_s could be made to deal with clang 3.8.0's = way of __builtin_dwarf_cfa() working for powerpc/powerpc64. But then use = with g++ would be broken on powerpc/powerpc64 unless there were some = sort of live "which compiler's type of code" test also involved. Having only one libgcc_s and multiple compilers using it for a given = TARGET_ARCH=3D (for example, devel/powerpc64-xtoolchain-gcc like uses) = suggests sticking to one convention per TARGET_ARCH=3D for = __builtin_dwarf_cfa(). I would guess that g++ conventions win in this type of context for = FreeBSD, under the guideline of trying to be gcc 4.2.1 "ABI" compatible. = libgcc_s from FreeBSD works for C++ exceptions with its gcc 4.2.1 for = powerpc and powerpc64 as things are as far as I know. But for clang++ FreeBSD is one context among many and other libraries = may be based on clang 3.8.0's existing interpretation, without gcc/g++ = compatibility constraints. (I've no experience with earlier clang = vintages for the issue.) It may be that any change needs to be FreeBSD = target-triple specific for all I know. In essence: making the convention = part of the ABI chosen. I'll probably get some sleep before looking much at the code that you = reference. A quick look at part of it suggests a fair amount of = research/study for me to interpret things reliably. The loop may be somewhat analogous to _Unwind_RaiseException's loop, but = for a specific depth. I would currently guess that depth 1 would match = gcc 4.2.1's result for __builtin_dwarf_cfa(). But there was also some other "address"(?) builtin support routine = nearby that seemed to call into LowerFRAMEADDR() and I've no clue if g++ = 4.2.1 uses the same depth-figure standard for both sorts of things or = not. For all I know both types of builtins(?) might have mismatches with = gcc/g++ 4.2.1 and both might need fixes. I do vaguely remember seeing a builtin in FreeBSD code for something = that had an explicit number in the argument list, possibly = __builtin_frame_address(n)(?). But I only saw __builtin_dwarf_cfa() with = no argument in the argument list as far as I remember. If clang 3.8.0 and gcc 4.2.1 disagreed about what the numbering standard = referred to for __builtin_frame_address(n) (or whatever it was), that = would not be good and compatibility would imply conversions between the = conventions for the 2 FreeBSD contexts. I have not checked for armv6 related clang++ vs. g++ compatibility for = C++ exception-handling. If anything is not operating for that context I = expect that it would be g++ that generated buildworld code that did not = work based on the FreeBSD source code: clang/clang++ is the normal = compiler and kyua seemed to operate, unlike on the powerpc/powerpc64. I've never tried to build armv6 via an equivalent of = devel/powerpc64-gcc. I do not know if armv6 even uses the same sort of = C++ exception-handling ABI or not. But I do know that = __builtin_dwarf_cfa() is not compatible between clang++ and g++ from the = 2-line source and comparing objdump -d results. So more than powerpc/powerpc64 might be involved overall. =3D=3D=3D Mark Millard markmi at dsl-only.net
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?27FEB264-0A3E-42DF-A549-1E54ED489CEC>