FreeBSD Mail Archives

Date:      Wed, 2 Mar 2016 20:50:46 -0800
From:      Mark Millard <markmi@dsl-only.net>
To:        Roman Divacky <rdivacky@vlakno.cz>
Cc:        FreeBSD PowerPC ML <freebsd-ppc@freebsd.org>, FreeBSD Toolchain <freebsd-toolchain@freebsd.org>
Subject:   Re: clang 3.8.0 can mess up __builtin_dwarf_cfa (), at least for TARGET_ARCH=armv6, powerpc, powerpc64: a bug 207325 update
Message-ID:  <27FEB264-0A3E-42DF-A549-1E54ED489CEC@dsl-only.net>
In-Reply-To: <A395CFC8-A2F9-4239-8BA2-283B8CB99D59@dsl-only.net>
References:  <83B8741C-B4C9-4EFB-A3B4-473F8F165984@dsl-only.net> <80EA4460-E842-46F5-B006-2A83FBBEE845@dsl-only.net> <F23112FF-C417-4757-96FF-4E93C259DC9D@dsl-only.net> <366B67F9-6A14-4906-8545-1B57A3FF53B8@dsl-only.net> <20160228083934.GA60222@vlakno.cz> <22AD8E4F-B3F2-455E-8EBE-2C70E428D44A@dsl-only.net> <462637FA-6FD2-4421-8F0C-0DF565E94FA6@dsl-only.net> <8D92D76D-9FBC-45F6-A20D-0C2A8B38D291@dsl-only.net> <33D5358F-6783-44C1-8155-86FB93CABE6F@dsl-only.net> <9DE18EC5-3C16-4B17-A0D0-5B5386961627@dsl-only.net> <C2B0D00F-1DBC-4546-98B6-C95D759D12CC@dsl-only.net> <A395CFC8-A2F9-4239-8BA2-283B8CB99D59@dsl-only.net>

I now have the explanation for the clang 3.8.0 buildworld's libcxxrt
__cxa_throw related code not detecting the catch in main: main is
skipped because of mishandling of r31 references in the .eh_frame
information. The problem is distinct from the other reported problem(s).

libcxxrt ends up with (dwarfdump -v -v -F output for __cxa_throw as
compiled by clang 3.8.0):

> <    0><0x00010620:0x00010794><__cxa_throw><fde offset 0x000006c0 length: 0x00000028><eh aug data len 0x0>
>         0x00010620: <off cfa=00(r1) >
>         0x00010634: <off cfa=48(r1) > <off r30=-8(cfa) > <off r31=-4(cfa) > <off r65=04(cfa) >
>         0x00010638: <off cfa=48(r31) > <off r25=-28(cfa) > <off r26=-24(cfa) > <off r27=-20(cfa) > <off r28=-16(cfa) > <off r29=-12(cfa) > <off r30=-8(cfa) > <off r31=-4(cfa) > <off r65=04(cfa) >
>  fde section offset 1728 0x000006c0 cie offset for fde: 1732 0x000006c4
>          0 DW_CFA_advance_loc 20  (5 * 4)
>          1 DW_CFA_def_cfa_offset 48
>          3 DW_CFA_offset r31 -4  (1 * -4)
>          5 DW_CFA_offset r30 -8  (2 * -4)
>          7 DW_CFA_offset_extended_sf r65 4  (-1 * -4)
>         10 DW_CFA_advance_loc 4  (1 * 4)
>         11 DW_CFA_def_cfa_register r31
>         13 DW_CFA_offset r25 -28  (7 * -4)
>         15 DW_CFA_offset r26 -24  (6 * -4)
>         17 DW_CFA_offset r27 -20  (5 * -4)
>         19 DW_CFA_offset r28 -16  (4 * -4)
>         21 DW_CFA_offset r29 -12  (3 * -4)
>         23 DW_CFA_offset r30 -8  (2 * -4)
>         25 DW_CFA_nop
>         26 DW_CFA_nop
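
The parenthesized products in the listing above (for example "(5 * 4)" and "(1 * -4)") reflect that DWARF stores these operands in factored form: the consumer multiplies them by the CIE's code and data alignment factors. A minimal sketch of that arithmetic, assuming the factors 4 and -4 that the listing implies for TARGET_ARCH=powerpc; the struct and function names are made up for illustration and are not from libgcc_s or dwarfdump:

```cpp
#include <cassert>
#include <cstdint>

// Alignment factors as implied by the dwarfdump listing:
// "(5 * 4)" for DW_CFA_advance_loc and "(1 * -4)" for DW_CFA_offset.
// Illustrative names only; not taken from any real unwinder.
struct CieFactors {
    std::int64_t code_align;  // multiplies DW_CFA_advance_loc deltas
    std::int64_t data_align;  // multiplies DW_CFA_offset operands
};

// Bytes of code covered by a DW_CFA_advance_loc with this factored delta.
std::int64_t advance_bytes(const CieFactors& cie, std::int64_t factored_delta) {
    assert(cie.code_align > 0);  // the code alignment factor is unsigned in DWARF
    return factored_delta * cie.code_align;
}

// Byte offset from the CFA at which a register's old value was saved.
// The factored operand is signed here so that the "_sf" form (r65 above) fits too.
std::int64_t saved_offset(const CieFactors& cie, std::int64_t factored_offset) {
    return factored_offset * cie.data_align;
}
```

With CieFactors{4, -4}, a factored delta of 5 covers 20 bytes of code, r31's factored offset 1 becomes -4, and r65's signed factored offset -1 becomes +4, matching the listing.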

Note the cfa and r31 references in:

> 0x00010634: <off cfa=48(r1) >  . . . <off r31=-4(cfa) > . . .
> 0x00010638: <off cfa=48(r31) > . . . <off r31=-4(cfa) > . . .

The use of r31 to define cfa is from (in part) the clang++ 3.8.0 code
generation using r31 as a frame pointer in addition to r1 as the stack
pointer. The matching actual sequence of operations listed above is:

>          1 DW_CFA_def_cfa_offset 48
>          3 DW_CFA_offset r31 -4  (1 * -4)
> . . .
>         11 DW_CFA_def_cfa_register r31

The "1 DW_CFA_def_cfa_offset 48" just notes that r1 (the stack pointer)
was decremented by 48 by the prior instruction, so 48 needs to be added
to the new r1 value to reference the same _Unwind_Context cfa value as
the prior "<off cfa=00(r1) >" status does.

The "3 DW_CFA_offset r31 -4  (1 * -4)" was generated because the (soon
to be old) r31 value was saved at address cfa-4 ("<off r31=-4(cfa) >").
This address, used to access what will be the old/saved r31 value, is
recorded in the _Unwind_Context reg[31].

The "11 DW_CFA_def_cfa_register r31" was generated because the prior
instruction updated r31 to be a copy of r1 for use as a frame pointer.
Note that this does not change the _Unwind_Context cfa value. At this
stage r1=r31 and 48(r1)=48(r31), and that will hold until either r1 or
r31 is changed in the routine (if either is).

The repeat of "<off r31=-4(cfa) >" on the "0x00010638: <off
cfa=48(r31) >" line indicates that there is no change to where/how to
find the pointer to the old/saved r31 value: there is no new
DW_CFA_offset r31 "instruction" to interpret.

[Note the messy mix of different r31's. gcc 4.2.1 does not (normally?)
generate such TARGET_ARCH=powerpc code but clang++ 3.8.0 normally
does. Thus clang++ touches an error that g++ 4.2.1 and the like normally
do not.]

Unfortunately the above is not the interpretation given by the
interpreter in libgcc_s:

"11 DW_CFA_def_cfa_register r31" instead accesses the old/saved r31
value via the pointer in _Unwind_Context reg[31] and then applies the
offset 48 to that value.

The result is the wrong cfa value (which should not have changed at all)
and all else is messed up after that. Since the old r31 value is an
older Frame Pointer value, one frame has also been "skipped" in the
process. (But the offset 48's involvement from there can produce pure
junk for the cfa value that results.)

Code that sticks to cfa=OFFSET(r1) for the cfa will not see this error
in the .eh_frame information's interpretation.
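
The two readings of "cfa=48(r31)" described above can be contrasted in a toy model. This is only a sketch under my own assumptions: ToyContext, CfaRule and both functions are hypothetical names, the layout is not the real _Unwind_Context, and the second function models the behavior as I describe it above rather than quoting libgcc_s code.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Toy stand-in for the unwinder state (NOT the real _Unwind_Context).
struct ToyContext {
    // Register values as they are in the frame being unwound.
    std::array<std::uint32_t, 32> live{};
    // Where the caller's value of each register was stored, or nullptr
    // when no DW_CFA_offset rule has been recorded for it.
    std::array<const std::uint32_t*, 32> saved{};
};

// A rule of the form cfa = offset(reg), e.g. 48(r31).
struct CfaRule {
    unsigned reg;
    std::int32_t offset;
};

// Expected reading: take the register's current value and add the offset.
std::uint32_t cfa_from_live(const ToyContext& ctx, CfaRule rule) {
    assert(rule.reg < ctx.live.size());
    return ctx.live[rule.reg] + static_cast<std::uint32_t>(rule.offset);
}

// Reading described above: when a save slot is recorded for the register,
// load the OLD value through it and add the offset to that instead.
std::uint32_t cfa_from_saved_slot(const ToyContext& ctx, CfaRule rule) {
    assert(rule.reg < ctx.saved.size());
    const std::uint32_t* slot = ctx.saved[rule.reg];
    const std::uint32_t base = slot ? *slot : ctx.live[rule.reg];
    return base + static_cast<std::uint32_t>(rule.offset);
}
```

With made-up values r1 = r31 = 0x1000 and an old r31 of 0x2000 recorded at cfa-4, the first function gives 0x1030 for both 48(r1) and 48(r31), while the second gives 0x2030 for 48(r31). A rule based on r1, which has no recorded save slot here, is unaffected.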

As for the lack of save/restore of some registers by
_Unwind_RaiseException as generated by clang 3.8.0:

I did not earlier show how the code involved comes to try to store to
r3's (non-existent) save/restore place (as an example). I show that
below.

The code is in #1 of:

> (gdb) bt
> #0  _Unwind_SetGR (context=<optimized out>, index=<optimized out>, val=1105272880) at /usr/src/gnu/lib/libgcc/../../../contrib/gcc/unwind-dw2.c:220
> #1  0x41b7139c in __cxxabiv1::__gxx_personality_v0 (version=<optimized out>, actions=6, exception_class=<optimized out>, ue_header=0x41e12030, context=0xffffd570)
>     at /usr/src/gnu/lib/libsupc++/../../../contrib/libstdc++/libsupc++/eh_personality.cc:681
> #2  0x419915f8 in _Unwind_RaiseException_Phase2 (exc=<optimized out>, context=<optimized out>) at /usr/src/gnu/lib/libgcc/../../../contrib/gcc/unwind.inc:66
> #3  0x419911c0 in _Unwind_RaiseException (exc=<optimized out>) at /usr/src/gnu/lib/libgcc/../../../contrib/gcc/unwind.inc:135
> #4  0x41b7075c in __cxxabiv1::__cxa_throw (obj=<optimized out>, tinfo=<optimized out>, dest=<optimized out>) at /usr/src/gnu/lib/libsupc++/../../../contrib/libstdc++/libsupc++/eh_throw.cc:71
> #5  0x01800920 in main () at exception_test.cpp:5

and looks like:

> 678	  /* For targets with pointers smaller than the word size, we must extend the
> 679	     pointer, and this extension is target dependent.  */
> 680	  _Unwind_SetGR (context, __builtin_eh_return_data_regno (0),
> 681			 __builtin_extend_pointer (ue_header));
> 682	  _Unwind_SetGR (context, __builtin_eh_return_data_regno (1),
> 683			 handler_switch_value);
> 684	  _Unwind_SetIP (context, landing_pad);

clang 3.8.0 agrees with gcc/g++ that __builtin_eh_return_data_regno (0)
translates to the (int) value 3 (referencing r3) --despite clang 3.8.0
not providing anything that dwarfdump -v -v -F shows as saving/restoring
r3 for _Unwind_RaiseException. (Similarly for some other such
registers.)

===
Mark Millard
markmi at dsl-only.net

On 2016-Feb-29, at 11:10 PM, Mark Millard <markmi at dsl-only.net> wrote:

[TARGET_ARCH=powerpc context for all the evidence.]

I found another clang++ 3.8.0 vs. g++ 4.2.1 difference where the system
libgcc_s depends on how things work for g++ 4.2.1 and clang 3.8.0 does
not work the same way:

_Unwind_RaiseException is special in a way that makes it save and
restore lots of registers it does not directly use. (I'm not sure what
triggers having so many registers saved/restored.)

But gcc/g++ 4.2.1 saves and restores more registers than clang/clang++
3.8.0 does. That in turn leaves .eh_frame information for more registers
for _Unwind_RaiseException for the gcc/g++ 4.2.1 context.

_Unwind_RaiseException from a clang++ 3.8.0 build of libgcc_s does not
have save/restore for r3, r4, r5, r6, "r70" (from mfcr, dwarfdump
notation). The C++ exception handling library code in libgcc_s depends
on r3 (as one example). The pointer for r3 ends up being 0x0 and that
causes a crash in examples that get that far using the system's
libgcc_s.

_Unwind_RaiseException from a g++ 4.2.1 build of libgcc_s has
save/restore for r3, r4, r5, r6, "r70" (from mfcr).

Later below I list one form of the specific evidence for the
differences.

It may be that this and the __builtin_dwarf_cfa() "fix" cover all the
problems for when libstdc++/libsupc++ are involved with the system
libgcc_s instead of libc++/libcxxrt being involved.

In my view the registers to save/restore in routines like
_Unwind_RaiseException should be considered as part of the overall ABI
criteria. Under the rule "the TARGET_ARCH=powerpc ABI is always such
that it is gcc/g++ 4.2.1 compatible", I take it that clang 3.8.0 is
wrong for FreeBSD TARGET_ARCH=powerpc here: another ABI violation.

TARGET_ARCH=powerpc64 and possibly others could have the same sort of
issue. I've never gotten a clang/clang++ based TARGET_ARCH=powerpc64
as far as a complete buildworld. And for now I'm more interested in
finding new types of errors for TARGET_ARCH=powerpc rather than what
range of TARGET_ARCH's get a specific clang 3.8.0 problem.

Separately from the above I've shown that copying the following 3 files
to a gcc 4.2.1 buildworld/buildkernel TARGET_ARCH=powerpc context
allows exception_test.clang++380.powerpc to run just fine:

> exception_test.clang++380.powerpc
> /usr/lib/libc++.so.1
> /lib/libcxxrt.so.1

(debug files for the libraries also copied)

That leaves the following libraries listed by ldd as being from the gcc
4.2.1 buildworld:

> /lib/libm.so.5
> /lib/libc.so.7
> /lib/libgcc_s.so.1

> # ldd exception_test.clang++380.powerpc
> exception_test.clang++380.powerpc:
> 	libc++.so.1 => /usr/lib/libc++.so.1 (0x4183e000)
> 	libcxxrt.so.1 => /lib/libcxxrt.so.1 (0x41917000)
> 	libm.so.5 => /lib/libm.so.5 (0x41942000)
> 	libc.so.7 => /lib/libc.so.7 (0x41979000)
> 	libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x41b1d000)

exception_test.clang++380.powerpc used with the clang 3.8.0 buildworld
and its libgcc_s shows different behavior not likely to be explained by
the _Unwind_RaiseException register save/restore differences. (The lack
of some saves/restores would still be a problem if I get
exception_test.clang++380.powerpc to get that far before doing something
odd.)

I'm still trying to get evidence of the specific low-level problem for
exception_test.clang++380.powerpc. It may be some time before I figure
out anything useful.

Using some dwarfdump -v -v -F output for the evidence of register
save/restore differences. . .

_Unwind_RaiseException from a g++ 4.2.1 build of libgcc_s has r3, r4,
r5, r6, r70 (from mfcr). The library depends on r3 (as one example).

> fde section offset 1104 0x00000450 cie offset for fde: 1108 0x00000454
>         0 DW_CFA_advance_loc 8  (8 * 1)
>         1 DW_CFA_def_cfa_offset 3024
>         4 DW_CFA_advance_loc1 156
>         6 DW_CFA_offset r4 -232  (58 * -4)
>         8 DW_CFA_offset r3 -236  (59 * -4)
>        10 DW_CFA_offset r28 -160  (40 * -4)
>        12 DW_CFA_offset r27 -164  (41 * -4)
>        14 DW_CFA_offset r26 -168  (42 * -4)
>        16 DW_CFA_offset r25 -172  (43 * -4)
>        18 DW_CFA_offset r24 -176  (44 * -4)
>        20 DW_CFA_offset r23 -180  (45 * -4)
>        22 DW_CFA_offset r22 -184  (46 * -4)
>        24 DW_CFA_offset r21 -188  (47 * -4)
>        26 DW_CFA_offset r20 -192  (48 * -4)
>        28 DW_CFA_offset r19 -196  (49 * -4)
>        30 DW_CFA_offset r18 -200  (50 * -4)
>        32 DW_CFA_offset r17 -204  (51 * -4)
>        34 DW_CFA_offset r16 -208  (52 * -4)
>        36 DW_CFA_offset r15 -212  (53 * -4)
>        38 DW_CFA_offset r14 -216  (54 * -4)
>        40 DW_CFA_offset r63 -8  (2 * -4)
>        42 DW_CFA_offset r62 -16  (4 * -4)
>        44 DW_CFA_offset r61 -24  (6 * -4)
>        46 DW_CFA_offset r60 -32  (8 * -4)
>        48 DW_CFA_offset r59 -40  (10 * -4)
>        50 DW_CFA_offset r58 -48  (12 * -4)
>        52 DW_CFA_offset r57 -56  (14 * -4)
>        54 DW_CFA_offset r56 -64  (16 * -4)
>        56 DW_CFA_offset r55 -72  (18 * -4)
>        58 DW_CFA_offset r54 -80  (20 * -4)
>        60 DW_CFA_offset r53 -88  (22 * -4)
>        62 DW_CFA_offset r52 -96  (24 * -4)
>        64 DW_CFA_offset r51 -104  (26 * -4)
>        66 DW_CFA_offset r50 -112  (28 * -4)
>        68 DW_CFA_offset r49 -120  (30 * -4)
>        70 DW_CFA_offset r48 -128  (32 * -4)
>        72 DW_CFA_offset r47 -136  (34 * -4)
>        74 DW_CFA_offset r46 -144  (36 * -4)
>        76 DW_CFA_register r70 = r12
>        79 DW_CFA_offset_extended_sf r65 4  (-1 * -4)
>        82 DW_CFA_advance_loc 32  (32 * 1)
>        83 DW_CFA_offset r5 -228  (57 * -4)
>        85 DW_CFA_offset r31 -148  (37 * -4)
>        87 DW_CFA_offset r30 -152  (38 * -4)
>        89 DW_CFA_offset r29 -156  (39 * -4)
>        91 DW_CFA_offset_extended r70 -220  (55 * -4)
>        94 DW_CFA_offset r6 -224  (56 * -4)
>        96 DW_CFA_nop
>        97 DW_CFA_nop
>        98 DW_CFA_nop

_Unwind_RaiseException from a clang++ 3.8.0 build of libgcc_s does not
have r3, r4, r5, r6, r70 (from mfcr). The library depends on r3 (as
one example).

> fde section offset 692 0x000002b4 cie offset for fde: 696 0x000002b8
>         0 DW_CFA_advance_loc 20  (5 * 4)
>         1 DW_CFA_def_cfa_offset 2992
>         4 DW_CFA_offset r31 -148  (37 * -4)
>         6 DW_CFA_offset r30 -152  (38 * -4)
>         8 DW_CFA_offset_extended_sf r65 4  (-1 * -4)
>        11 DW_CFA_advance_loc 4  (1 * 4)
>        12 DW_CFA_def_cfa_register r31
>        14 DW_CFA_offset r14 -216  (54 * -4)
>        16 DW_CFA_offset r15 -212  (53 * -4)
>        18 DW_CFA_offset r16 -208  (52 * -4)
>        20 DW_CFA_offset r17 -204  (51 * -4)
>        22 DW_CFA_offset r18 -200  (50 * -4)
>        24 DW_CFA_offset r19 -196  (49 * -4)
>        26 DW_CFA_offset r20 -192  (48 * -4)
>        28 DW_CFA_offset r21 -188  (47 * -4)
>        30 DW_CFA_offset r22 -184  (46 * -4)
>        32 DW_CFA_offset r23 -180  (45 * -4)
>        34 DW_CFA_offset r24 -176  (44 * -4)
>        36 DW_CFA_offset r25 -172  (43 * -4)
>        38 DW_CFA_offset r26 -168  (42 * -4)
>        40 DW_CFA_offset r27 -164  (41 * -4)
>        42 DW_CFA_offset r28 -160  (40 * -4)
>        44 DW_CFA_offset r29 -156  (39 * -4)
>        46 DW_CFA_offset r30 -152  (38 * -4)
>        48 DW_CFA_offset r31 -148  (37 * -4)
>        50 DW_CFA_offset r46 -144  (36 * -4)
>        52 DW_CFA_offset r47 -136  (34 * -4)
>        54 DW_CFA_offset r48 -128  (32 * -4)
>        56 DW_CFA_offset r49 -120  (30 * -4)
>        58 DW_CFA_offset r50 -112  (28 * -4)
>        60 DW_CFA_offset r51 -104  (26 * -4)
>        62 DW_CFA_offset r52 -96  (24 * -4)
>        64 DW_CFA_offset r53 -88  (22 * -4)
>        66 DW_CFA_offset r54 -80  (20 * -4)
>        68 DW_CFA_offset r55 -72  (18 * -4)
>        70 DW_CFA_offset r56 -64  (16 * -4)
>        72 DW_CFA_offset r57 -56  (14 * -4)
>        74 DW_CFA_offset r58 -48  (12 * -4)
>        76 DW_CFA_offset r59 -40  (10 * -4)
>        78 DW_CFA_offset r60 -32  (8 * -4)
>        80 DW_CFA_offset r61 -24  (6 * -4)
>        82 DW_CFA_offset r62 -16  (4 * -4)
>        84 DW_CFA_offset r63 -8  (2 * -4)
>        86 DW_CFA_nop
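
The difference between the two listings can be checked mechanically. The sketch below transcribes the register numbers that have a save rule in each FDE above (dwarfdump numbering) and takes the set difference. The function names are made up for illustration, and the register lists are my hand transcription of the listings rather than something produced by a tool.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <set>

using RegSet = std::set<int>;

// Insert every register number from first to last, inclusive.
void add_range(RegSet& regs, int first, int last) {
    assert(first <= last);
    for (int r = first; r <= last; ++r) regs.insert(r);
}

// Registers with a save rule in the g++ 4.2.1 FDE listing.
RegSet gcc421_saved() {
    RegSet regs{3, 4, 5, 6, 65, 70};
    add_range(regs, 14, 31);
    add_range(regs, 46, 63);
    return regs;
}

// Registers with a save rule in the clang++ 3.8.0 FDE listing.
RegSet clang380_saved() {
    RegSet regs{65};
    add_range(regs, 14, 31);
    add_range(regs, 46, 63);
    return regs;
}

// Elements of a that are not in b.
RegSet only_in_first(const RegSet& a, const RegSet& b) {
    RegSet out;
    std::set_difference(a.begin(), a.end(), b.begin(), b.end(),
                        std::inserter(out, out.end()));
    return out;
}
```

only_in_first(gcc421_saved(), clang380_saved()) comes out as r3, r4, r5, r6 and r70, and the reverse difference is empty, which is the difference stated above.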

===
Mark Millard
markmi at dsl-only.net

On 2016-Feb-29, at 3:20 AM, Mark Millard <markmi at dsl-only.net> wrote:

TARGET_ARCH=powerpc: Using Frame Depth 1 in "case
Intrinsic::eh_dwarf_cfa" (and Offset 0 in "case
Builtin::BI__builtin_dwarf_cfa") for PPCTargetLowering::LowerFRAMEADDR
related use has allowed getting into _Unwind_RaiseException_Phase2 and
__cxxabiv1::__gxx_personality_v0. The example is the 8 line example
compiled under g++ 4.2.1 but then used under a buildworld that was built
with clang 3.8.0:

# ldd exception_test.g++421.powerpc
exception_test.g++421.powerpc:
	libstdc++.so.6 => /usr/local/lib/gcc49/libstdc++.so.6 (0x41840000)
	libm.so.5 => /lib/libm.so.5 (0x4196a000)
	libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x419a1000)
	libc.so.7 => /lib/libc.so.7 (0x419c0000)

_Unwind_RaiseException_Phase2 is well past the point of the failure and
crash from having Frame Depth 0 instead.

It is getting a SEGV during the _Unwind_SetGR called via:

710	  /* For targets with pointers smaller than the word size, we must extend the
711	     pointer, and this extension is target dependent.  */
712	  _Unwind_SetGR (context, __builtin_eh_return_data_regno (0),
713			 __builtin_extend_pointer (ue_header));

for:

_Unwind_SetGR (context=0xffffd570, index=3, val=1105272896) at /usr/src/gnu/lib/libgcc/../../../contrib/gcc/unwind-dw2.c:207

context->reg[3] is 0x0 and so its use in the following gets the SEGV.

217	  ptr = context->reg[index];
218	
219	  if (size == sizeof(_Unwind_Ptr))
220	    * (_Unwind_Ptr *) ptr = val;
I'm not going to try to analyze the source of this before getting some
sleep.

When the 8 line program is compiled by clang++ 3.8.0 instead, the
results differ from both the above and the original behavior: the
program does not crash abnormally but also does not find the catch
clause that it should. The std::terminate gets its normal SIGABRT
instead of an earlier SEGV.

Again I'm not going to try to analyze the details before getting some
sleep.

But I will mention that I've also already submitted a report that
libgcc_s does not completely implement DW_CFA_remember_state and
DW_CFA_restore_state and that the code generated on powerpc64 touches
the defect and so ends up with misbehavior. These might be similar.

===
Mark Millard
markmi at dsl-only.net

On 2016-Feb-28, at 10:13 PM, Mark Millard <markmi at dsl-only.net> wrote:

Back to the "case Builtin::BI__builtin_dwarf_cfa" and
"PPCTargetLowering::LowerFRAMEADDR" context:

I made the wrong change and need to retry.

The detail. . .

Passing a 1 through instead of zero did not do what I expected to the
code generated. Instead it added one instruction:

addi    r3,r3,1

resulting in (objdump -d --prefix-addresses on the .o):

> Disassembly of section .text:
> 00000000 <_Z1fv> mflr    r0
> 00000004 <_Z1fv+0x4> stw     r31,-4(r1)
> 00000008 <_Z1fv+0x8> stw     r0,4(r1)
> 0000000c <_Z1fv+0xc> stwu    r1,-16(r1)
> 00000010 <_Z1fv+0x10> mr      r31,r1
> 00000014 <_Z1fv+0x14> mr      r3,r31
> 00000018 <_Z1fv+0x18> addi    r3,r3,1
> 0000001c <_Z1fv+0x1c> bl      0000001c <_Z1fv+0x1c>
> 00000020 <_Z1fv+0x20> addi    r1,r1,16
> 00000024 <_Z1fv+0x24> lwz     r0,4(r1)
> 00000028 <_Z1fv+0x28> lwz     r31,-4(r1)
> 0000002c <_Z1fv+0x2c> mtlr    r0
> 00000030 <_Z1fv+0x30> blr

In other words: it added the 1 as a byte offset, like the comments that
I thought were wrong said.

Since it does not appear that PPCTargetLowering::LowerFRAMEADDR would do
that with a 1, I conclude that PPCTargetLowering::LowerFRAMEADDR is not
involved with that figure.

So looking around. . .

/usr/src/contrib/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp has:

> case Intrinsic::eh_dwarf_cfa: {
>  SDValue CfaArg = DAG.getSExtOrTrunc(getValue(I.getArgOperand(0)), sdl,
>                                      TLI.getPointerTy(DAG.getDataLayout()));
>  SDValue Offset = DAG.getNode(ISD::ADD, sdl,
>                               CfaArg.getValueType(),
>                               DAG.getNode(ISD::FRAME_TO_ARGS_OFFSET, sdl,
>                                           CfaArg.getValueType()),
>                               CfaArg);
>  SDValue FA = DAG.getNode(
>      ISD::FRAMEADDR, sdl, TLI.getPointerTy(DAG.getDataLayout()),
>      DAG.getConstant(0, sdl, TLI.getPointerTy(DAG.getDataLayout())));
>  setValue(&I, DAG.getNode(ISD::ADD, sdl, FA.getValueType(),
>                           FA, Offset));
>  return nullptr;

And so sure enough the argument is an offset as used by this code.

And what I call the frame depth is plugged in as 0 here via
"DAG.getConstant(0, sdl, TLI.getPointerTy(DAG.getDataLayout()))". The
offset is applied after getting the frame address.
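
The arithmetic that this lowering sets up can be written out as a small model. It is a sketch only: ToyFrame and toy_dwarf_cfa are hypothetical names, and the two fields simply stand for whatever ISD::FRAMEADDR (with the constant 0) and ISD::FRAME_TO_ARGS_OFFSET end up producing on a given target.

```cpp
#include <cassert>
#include <cstdint>

// Stand-ins for the two target-provided values used by the lowering.
struct ToyFrame {
    std::uint32_t frame_addr_depth0;    // result of FRAMEADDR with constant 0
    std::int32_t frame_to_args_offset;  // result of FRAME_TO_ARGS_OFFSET
};

// Models: Offset = FRAME_TO_ARGS_OFFSET + CfaArg; result = FA + Offset.
// The builtin's argument therefore acts as a byte offset, not a depth.
std::uint32_t toy_dwarf_cfa(const ToyFrame& frame, std::int32_t cfa_arg) {
    assert(frame.frame_addr_depth0 != 0);  // a live frame has a non-null address
    const std::int32_t offset = frame.frame_to_args_offset + cfa_arg;
    return frame.frame_addr_depth0 + static_cast<std::uint32_t>(offset);
}
```

With a frame address of 0x1000 and a FRAME_TO_ARGS_OFFSET of 0 (which is what the "mr r3,r31" followed by "addi r3,r3,1" sequence above suggests for powerpc), an argument of 0 yields 0x1000 and an argument of 1 yields 0x1001: one byte further, not one frame further.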

So I get to revert my change and try again, changing the above call to
use a 1 instead.

It does not look like this changes the time frames in my history notes:
it has been using frame depth zero since V2.7 when "case
Builtin::BI__builtin_dwarf_cfa" appeared.

In general my overall questions about the target triple controlling
which value to use (in DAG.getConstant here) still apply: It is not
obvious that something that has been using frame depth 0 since V2.7 can
be immediately changed to frame depth 1 for all contexts.

===
Mark Millard
markmi at dsl-only.net

On 2016-Feb-28, at 8:49 PM, Mark Millard <markmi at dsl-only.net> wrote:

Here is what the "ABI for the ARM 32-bit Architecture" "DWARF for the
ARM Architecture" document says about the CFA:

> 3.4 Canonical Frame Address
>
> The term Canonical Frame Address (CFA) is defined in [GDWARF], §6.4,
> Call Frame Information. This ABI adopts the typical definition of CFA
> given there.
> - The CFA is the value of the stack pointer (r13) at the call site in
>   the previous frame.

This, with the armv6 code I've shown via "objdump -d", indicates that
for armv6 clang++'s __builtin_dwarf_cfa() return value is not the same
value as the official ARM ABI indicates. It also indicates that what g++
returns does match the official ARM ABI.

===
Mark Millard
markmi at dsl-only.net

On 2016-Feb-28, at 5:40 PM, Mark Millard <markmi at dsl-only.net> wrote:

Looking some at clang/llvm history shows releases/branches:

V2.6 did not have "case Builtin::BI__builtin_dwarf_cfa".
V2.7 did have "case Builtin::BI__builtin_dwarf_cfa" but
PPCTargetLowering::LowerFRAMEADDR ignored the argument.
V2.8 had PPCTargetLowering::LowerFRAMEADDR using its argument (as a
frame depth, not a byte offset).

The apparently incorrect (not matching g++ frame depth returned)
comments, naming, and value (when viewed as a frame depth) for "case
Builtin::BI__builtin_dwarf_cfa" started in V2.7 and continue to this
day.

That is a lot of time for various dependencies on the clang
(mis)definition to accumulate across everything that uses clang.

It may be that limiting any change to specific TARGET_ARCH's for FreeBSD
is appropriate. FreeBSD would likely need to list the appropriate
TARGET_ARCH's, likely including powerpc and powerpc64 since clang before
3.8.0 was not in use for buildworld for powerpc/powerpc64.

Still this may have consequences for ports that use clang and might
reference clang-compiled __builtin_dwarf_cfa() use, possibly from a
lang/clang* instead of the system clang. My guess is that the
interoperability with being able to use g++ vintages as well may lead to
(modern?) lang/clang*'s tracking the fix for FreeBSD TARGET_ARCH's that
are fixed.

I can ignore all this and build a system based on using 1 as the frame
depth just to test, just as a matter of proof of concept for powerpc.
(Powerpc64 hits a system libgcc_s defect and so needs more before C++
exceptions can be tested overall.)

===
Mark Millard
markmi at dsl-only.net

On 2016-Feb-28, at 2:20 PM, Mark Millard <markmi at dsl-only.net> wrote:

In /usr/src/contrib/llvm/tools/clang/lib/CodeGen/CGBuiltin.cpp there is:

> case Builtin::BI__builtin_dwarf_cfa: {
> // The offset in bytes from the first argument to the CFA.
> //
> // Why on earth is this in the frontend?  Is there any reason at
> // all that the backend can't reasonably determine this while
> // lowering llvm.eh.dwarf.cfa()?
> //
> // TODO: If there's a satisfactory reason, add a target hook for
> // this instead of hard-coding 0, which is correct for most targets.
> int32_t Offset = 0;
>
> Value *F = CGM.getIntrinsic(Intrinsic::eh_dwarf_cfa);
> return RValue::get(Builder.CreateCall(F,
>                                 llvm::ConstantInt::get(Int32Ty, Offset)));
> }

I would have guessed that the internal argument was how many frames away
on the stack to go from what 0 produces (higher address direction).
g++'s __builtin_dwarf_cfa() returns the address for the next frame
compared to clang 3.8.0 (higher address direction).

I'd call that more of a frame depth than an offset. .eh_frame and its
cfa material use offset terminology as byte offsets. And the comments
above talk of an offset in bytes --but "next frame" distances in bytes
would not be constant.

Looking at a use of LowerFRAMEADDR in a LowerRETURNADDR, for example,

> SDValue ARMTargetLowering::LowerRETURNADDR(SDValue Op, SelectionDAG &DAG) const{
> . . .
> EVT VT = Op.getValueType();
> SDLoc dl(Op);
> unsigned Depth = cast<ConstantSDNode>(Op.getOperand(0))->getZExtValue();
> if (Depth) {
> SDValue FrameAddr = LowerFRAMEADDR(Op, DAG);
> SDValue Offset = DAG.getConstant(4, dl, MVT::i32);
> return DAG.getLoad(VT, dl, DAG.getEntryNode(),
>                  DAG.getNode(ISD::ADD, dl, VT, FrameAddr, Offset),
>                  MachinePointerInfo(), false, false, false, 0);
> }
> . . .
> }

(PPCTargetLowering::LowerRETURNADDR is similar.)

This has a mix of Depth and Offset overall, with the depth going to
LowerFRAMEADDR via Op but Offset used later in DAG.getLoad via adding to
the FrameAddr.

This would lead me to guess that the terminology and comments in "case
Builtin::BI__builtin_dwarf_cfa" are wrong and that the
Builder.CreateCall has been given a frame depth, not an offset.

> SDValue PPCTargetLowering::LowerFRAMEADDR(SDValue Op,
>                                     SelectionDAG &DAG) const {
> SDLoc dl(Op);
> unsigned Depth = cast<ConstantSDNode>(Op.getOperand(0))->getZExtValue();
>
> MachineFunction &MF = DAG.getMachineFunction();
> MachineFrameInfo *MFI = MF.getFrameInfo();
> MFI->setFrameAddressIsTaken(true);
>
> EVT PtrVT = DAG.getTargetLoweringInfo().getPointerTy(MF.getDataLayout());
> bool isPPC64 = PtrVT == MVT::i64;
>
> // Naked functions never have a frame pointer, and so we use r1. For all
> // other functions, this decision must be delayed until during PEI.
> unsigned FrameReg;
> if (MF.getFunction()->hasFnAttribute(Attribute::Naked))
> FrameReg = isPPC64 ? PPC::X1 : PPC::R1;
> else
> FrameReg = isPPC64 ? PPC::FP8 : PPC::FP;
>
> SDValue FrameAddr = DAG.getCopyFromReg(DAG.getEntryNode(), dl, FrameReg,
>                                    PtrVT);
> while (Depth--)
> FrameAddr = DAG.getLoad(Op.getValueType(), dl, DAG.getEntryNode(),
>                       FrameAddr, MachinePointerInfo(), false, false,
>                       false, 0);
> return FrameAddr;
> }

Again the operand from Op is called Depth --and is used to get from one
frame pointer value to the next: a frame depth.

To match g++ 4.2.1 the value to use is 1 for depth.
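
What the "while (Depth--)" loop amounts to at run time can be modelled with a toy memory in which the word at each frame's base is the back chain to the caller's frame (as stored by "stwu r1,-N(r1)" on 32-bit powerpc). A minimal sketch; ToyMemory and toy_frame_address are hypothetical names and the addresses used with them are invented for illustration:

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Toy memory: address -> 32-bit word stored there.
using ToyMemory = std::map<std::uint32_t, std::uint32_t>;

// Mirrors the shape of the lowering loop: start from the frame register's
// value and load through it once per level of depth.
std::uint32_t toy_frame_address(const ToyMemory& mem, std::uint32_t frame_reg,
                                unsigned depth) {
    std::uint32_t addr = frame_reg;
    while (depth--) {
        auto it = mem.find(addr);
        assert(it != mem.end());  // the toy memory must hold the back-chain word
        addr = it->second;        // follow the back chain to the caller's frame
    }
    return addr;
}
```

For a frame at 0x1000 created by decrementing r1 by 48, the back chain word at 0x1000 holds 0x1030. Depth 0 then returns 0x1000 and depth 1 returns 0x1030, which in this toy layout is the same value as cfa=48(r1).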

Overall, at least applied to powerpc/powerpc64:

> . . .

> // TODO: If there's a satisfactory reason, add a target hook for
> // this instead of hard-coding 0, which is correct for most targets.
> int32_t Offset = 0;

I think the comments in this area are actually talking about byte
offsets, not depths, and are just wrong. A byte offset of 0 would make
sense relative to hardcoding but the value is actually a frame depth --a
very different context.

I think that the naming of the variable is just wrong: it should be
called Depth.

And I think that the comments should have originally talked about using
a hard coded Depth 1 to match g++ and preexisting library usage of
__builtin_dwarf_cfa() for C++ and other exception handling (.eh_frame
usage). And the code should have matched.

As far as I can tell this error in the "case
Builtin::BI__builtin_dwarf_cfa:" design was not caught until now.

But since the mess has been around a long time, just switching
everything to match the g++ context now likely has its own problems.
(Not just a FreeBSD issue.)

For FreeBSD I expect that Depth 1 could be used for powerpc and
powerpc64: if it has been wrong for a long time (not just 3.8.0) then
powerpc/powerpc64 FreeBSD has likely been broken for C++ exception
handling when buildworld was via clang for just as long. (But only
recently has clang gotten this near working for buildworld for at least
one of powerpc/powerpc64. Currently powerpc is closer, given that
powerpc64 does not support softfloat last I knew.)

For other TARGET_ARCH's:

For FreeBSD armv6 it is less clear to me: it is based on clang as it is
and I do not know what C++ exception ABI it uses. If a modern gcc/g++
buildworld had problems with C++ exception handling, does anything need
to be done about it? For FreeBSD armv6 and the like: is xtoolchain like
support important?

FreeBSD may have similar questions for other TARGET_ARCH's.

===
Mark Millard
markmi at dsl-only.net

On 2016-Feb-28, at 2:46 AM, Mark Millard <markmi at dsl-only.net> wrote:

On 2016-Feb-28, at 12:39 AM, Roman Divacky <rdivacky at vlakno.cz> wrote:
>
> Mark,
>
> __builtin_dwarf_cfa() is lowered in clang to llvm intrinsic eh_dwarf_cfa.
> There's a depth argument (which defaults to 0, saying it's correct for most
> targets).
>
> Then the intrinsic gets lowered in SelectionDAG using
> PPCTargetLowering::LowerFRAMEADDR()
>
>
> Can you check that 1) the depth should be 0 for ppc64/ppc32 2) that
> LowerFRAMEADDR() does something sensible?
>
> There's a loop depth-times, so I wonder if that makes a difference.
>
> Thanks, Roman

"Lowered"? I'm not familiar with the clang code base or its terminology.
Handed off to a lower level interface, maybe?

As near as I can tell libgcc_s could be made to deal with clang 3.8.0's
way of __builtin_dwarf_cfa() working for powerpc/powerpc64. But then use
with g++ would be broken on powerpc/powerpc64 unless there were some
sort of live "which compiler's type of code" test also involved.

Having only one libgcc_s and multiple compilers using it for a given
TARGET_ARCH= (for example, devel/powerpc64-xtoolchain-gcc like uses)
suggests sticking to one convention per TARGET_ARCH= for
__builtin_dwarf_cfa().

I would guess that g++ conventions win in this type of context for
FreeBSD, under the guideline of trying to be gcc 4.2.1 "ABI" compatible.
libgcc_s from FreeBSD works for C++ exceptions with its gcc 4.2.1 for
powerpc and powerpc64 as things are as far as I know.

But for clang++ FreeBSD is one context among many and other libraries
may be based on clang 3.8.0's existing interpretation, without gcc/g++
compatibility constraints. (I've no experience with earlier clang
vintages for the issue.) It may be that any change needs to be FreeBSD
target-triple specific for all I know. In essence: making the convention
part of the ABI chosen.

I'll probably get some sleep before looking much at the code that you
reference. A quick look at part of it suggests a fair amount of
research/study for me to interpret things reliably.

The loop may be somewhat analogous to _Unwind_RaiseException's loop, but
for a specific depth. I would currently guess that depth 1 would match
gcc 4.2.1's result for __builtin_dwarf_cfa().

But there was also some other "address"(?) builtin support routine
nearby that seemed to call into LowerFRAMEADDR() and I've no clue if g++
4.2.1 uses the same depth-figure standard for both sorts of things or
not. For all I know both types of builtins(?) might have mismatches with
gcc/g++ 4.2.1 and both might need fixes.

I do vaguely remember seeing a builtin in FreeBSD code for something
that had an explicit number in the argument list, possibly
__builtin_frame_address(n)(?). But I only saw __builtin_dwarf_cfa() with
no argument in the argument list as far as I remember.

If clang 3.8.0 and gcc 4.2.1 disagreed about what the numbering standard
referred to for __builtin_frame_address(n) (or whatever it was), that
would not be good and compatibility would imply conversions between the
conventions for the 2 FreeBSD contexts.

I have not checked for armv6 related clang++ vs. g++ compatibility for
C++ exception-handling. If anything is not operating for that context I
expect that it would be g++ that generated buildworld code that did not
work based on the FreeBSD source code: clang/clang++ is the normal
compiler and kyua seemed to operate, unlike on the powerpc/powerpc64.

I've never tried to build armv6 via an equivalent of
devel/powerpc64-gcc. I do not know if armv6 even uses the same sort of
C++ exception-handling ABI or not. But I do know that
__builtin_dwarf_cfa() is not compatible between clang++ and g++ from the
2-line source and comparing objdump -d results.

So more than powerpc/powerpc64 might be involved overall.

===
Mark Millard
markmi at dsl-only.net

Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?27FEB264-0A3E-42DF-A549-1E54ED489CEC>
