Date:      Sun, 28 Feb 2016 09:39:34 +0100
From:      Roman Divacky <rdivacky@vlakno.cz>
To:        Mark Millard <markmi@dsl-only.net>
Cc:        freebsd-arm <freebsd-arm@freebsd.org>, FreeBSD PowerPC ML <freebsd-ppc@freebsd.org>, FreeBSD Toolchain <freebsd-toolchain@freebsd.org>, Dimitry Andric <dim@FreeBSD.org>
Subject:   Re: clang 3.8.0 can mess up __builtin_dwarf_cfa (), at least for TARGET_ARCH=armv6, powerpc, powerpc64: a bug 207325 update
Message-ID:  <20160228083934.GA60222@vlakno.cz>
In-Reply-To: <366B67F9-6A14-4906-8545-1B57A3FF53B8@dsl-only.net>
References:  <83B8741C-B4C9-4EFB-A3B4-473F8F165984@dsl-only.net> <80EA4460-E842-46F5-B006-2A83FBBEE845@dsl-only.net> <F23112FF-C417-4757-96FF-4E93C259DC9D@dsl-only.net> <366B67F9-6A14-4906-8545-1B57A3FF53B8@dsl-only.net>

Mark,

__builtin_dwarf_cfa() is lowered in clang to the LLVM intrinsic eh_dwarf_cfa.
It takes a depth argument (which defaults to 0, with a comment saying that
this is correct for most targets).

The intrinsic is then lowered in SelectionDAG using
PPCTargetLowering::LowerFRAMEADDR().


Can you check 1) that the depth should be 0 for ppc64/ppc32, and 2) that
LowerFRAMEADDR() does something sensible?
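
One coarse way to test what "sensible" means here, as a minimal sketch: the
value returned in the callee should lie above the callee's own locals and at or
below the caller's locals. This assumes a stack that grows towards lower
addresses and GCC/clang attribute syntax. The names CfaProbe, callee, caller,
cfa_is_between_frames and check_cfa are made up for illustration. An error of
only a few bytes that stays between the two locals would still pass, so this is
a necessary condition, not a full check.

```cpp
#include <cassert>
#include <cstdint>

// One probe: the CFA seen inside callee() plus the address of a stack slot
// in callee() and of a stack slot in caller().
struct CfaProbe {
    std::uintptr_t cfa;           // __builtin_dwarf_cfa() evaluated in callee()
    std::uintptr_t callee_local;  // address of a local owned by callee()
    std::uintptr_t caller_local;  // address of a local owned by caller()
};

// Hypothetical helper: record the CFA and the address of one of our locals.
// noinline keeps callee() in its own frame.
__attribute__((noinline)) void callee(CfaProbe *out) {
    volatile char local = 0;  // volatile forces a real stack slot
    out->cfa = reinterpret_cast<std::uintptr_t>(__builtin_dwarf_cfa());
    out->callee_local = reinterpret_cast<std::uintptr_t>(&local);
}

// Hypothetical helper: call callee() and add the address of a caller local.
__attribute__((noinline)) CfaProbe caller() {
    volatile char local = 0;
    CfaProbe p{};
    callee(&p);
    p.caller_local = reinterpret_cast<std::uintptr_t>(&local);
    return p;
}

// Assumption: the stack grows down, so the callee's frame sits at lower
// addresses than the caller's frame and the CFA should separate the two.
bool cfa_is_between_frames(const CfaProbe &p) {
    return p.callee_local < p.cfa && p.cfa <= p.caller_local;
}

// Aborts if the CFA does not separate the two frames.
void check_cfa() {
    const CfaProbe p = caller();
    assert(cfa_is_between_frames(p));
}
```

Calling check_cfa() from a small test program on each TARGET_ARCH would be
enough to flag a CFA that points inside the callee's own frame.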

There's a loop that runs depth times, so I wonder if that makes a difference.
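
To make the depth question concrete, here is a toy model of such a loop. It is
not the LLVM code: Frame, frame_address and check_depth_walk are hypothetical
names, and the only assumption is that each frame stores a pointer to its
caller's frame (a back chain). In this model, depth 0 means the loop body never
runs and the current frame pointer comes back unchanged. Whether the real
lowering behaves that way is the thing to check.

```cpp
#include <cassert>

// Toy stack frame: each frame stores a pointer to its caller's frame
// (the "back chain"); the outermost frame stores nullptr.
struct Frame {
    const Frame *back_chain;
};

// Follow the back chain `depth` times, starting from `current`.
// With depth == 0 the loop body never runs and `current` is returned as-is.
const Frame *frame_address(const Frame *current, unsigned depth) {
    const Frame *f = current;
    while (depth-- > 0 && f != nullptr) {
        f = f->back_chain;
    }
    return f;
}

// Small self-check on a three-frame chain: inner -> middle -> outer.
void check_depth_walk() {
    const Frame outer{nullptr};
    const Frame middle{&outer};
    const Frame inner{&middle};
    assert(frame_address(&inner, 0) == &inner);   // no step taken
    assert(frame_address(&inner, 1) == &middle);  // one step to the caller
    assert(frame_address(&inner, 2) == &outer);   // two steps
    assert(frame_address(&inner, 3) == nullptr);  // walked off the chain
}
```

If the model matches, a depth of 0 hands back the current frame's own address
and not the boundary with the caller's frame.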

Thanks, Roman


On Sat, Feb 27, 2016 at 05:55:02PM -0800, Mark Millard wrote:
> I discovered on powerpc that clang 3.8.0 and g++ do not agree on the value of __builtin_dwarf_cfa(). For powerpc this breaks C++ exception handling (via the use in libgcc_s's unwind handling), resulting in uncaught exceptions and SEGVs. objdump -d for the two-line source file below shows the low-level differences.
> 
> > extern void g(void*);
> > void f() { g(__builtin_dwarf_cfa()); }
> 
> I've also shown the same issue for powerpc64.
> 
> The issue is where g's argument value points relative to f's frame and f's caller's frame (since __builtin_dwarf_cfa() is called by f, not g).
> 
> And now for armv6 . . .
> 
> > # clang++ -c -g -std=c++11 -Wall -pedantic builtin_dwarf_cfa.cpp
> > # /usr/local/bin/objdump -d --prefix-addresses builtin_dwarf_cfa.o
> > 
> > builtin_dwarf_cfa.o:     file format elf32-littlearm
> > 
> > 
> > Disassembly of section .text:
> > 00000000 <_Z1fv> push	{fp, lr}
> > 00000004 <_Z1fv+0x4> mov	fp, sp
> > 00000008 <_Z1fv+0x8> mov	r0, fp
> > 0000000c <_Z1fv+0xc> bl	00000000 <_Z1gPv>
> > 00000010 <_Z1fv+0x10> pop	{fp, pc}
> 
> vs.
> 
> > # g++5 -c -g -std=c++11 -Wall -pedantic builtin_dwarf_cfa.cpp
> > # /usr/local/bin/objdump -d --prefix-addresses builtin_dwarf_cfa.o
> > 
> > builtin_dwarf_cfa.o:     file format elf32-littlearm
> > 
> > 
> > Disassembly of section .text:
> > 00000000 <_Z1fv> push	{fp, lr}
> > 00000004 <_Z1fv+0x4> add	fp, sp, #4, 0
> > 00000008 <_Z1fv+0x8> add	r3, fp, #4, 0
> > 0000000c <_Z1fv+0xc> mov	r0, r3
> > 00000010 <_Z1fv+0x10> bl	00000000 <_Z1gPv>
> > 00000014 <_Z1fv+0x14> nop			; (mov r0, r0)
> > 00000018 <_Z1fv+0x18> pop	{fp, pc}
> 
> 
> They do not agree.
> 
> So any infrastructure based on __builtin_dwarf_cfa() use will be compiler sensitive for armv6 as well.
> 
> [It is my understanding that what g++ does is what the normal sort of .eh_frame infrastructure is designed for: pointing at the boundary between the caller's and the callee's frames.]
> 
> 
> For reference: powerpc64 and powerpc results. . .
> 
> > # clang++ -c -g -std=c++11 -Wall -pedantic builtin_dwarf_cfa.cpp
> > # /usr/local/bin/objdump -d --prefix-addresses builtin_dwarf_cfa.o
> > 
> > builtin_dwarf_cfa.o:     file format elf64-powerpc-freebsd
> > 
> > 
> > Disassembly of section .text:
> > 0000000000000000 <._Z1fv> mflr    r0
> > 0000000000000004 <._Z1fv+0x4> std     r31,-8(r1)
> > 0000000000000008 <._Z1fv+0x8> std     r0,16(r1)
> > 000000000000000c <._Z1fv+0xc> stdu    r1,-128(r1)
> > 0000000000000010 <._Z1fv+0x10> mr      r31,r1
> > 0000000000000014 <._Z1fv+0x14> mr      r3,r31
> > 0000000000000018 <._Z1fv+0x18> bl      0000000000000018 <._Z1fv+0x18>
> > 000000000000001c <._Z1fv+0x1c> nop
> > 0000000000000020 <._Z1fv+0x20> addi    r1,r1,128
> > 0000000000000024 <._Z1fv+0x24> ld      r0,16(r1)
> > 0000000000000028 <._Z1fv+0x28> ld      r31,-8(r1)
> > 000000000000002c <._Z1fv+0x2c> mtlr    r0
> > 0000000000000030 <._Z1fv+0x30> blr
> >         ...
> 
> r3 does not point to a boundary with f's caller's stack frame.
> 
> By contrast for g++49:
> 
> > # g++49 -c -g -std=c++11 -Wall -pedantic builtin_dwarf_cfa.cpp
> > # /usr/local/bin/objdump -d --prefix-addresses builtin_dwarf_cfa.o | more
> > 
> > builtin_dwarf_cfa.o:     file format elf64-powerpc-freebsd
> > 
> > 
> > Disassembly of section .text:
> > 0000000000000000 <._Z1fv> mflr    r0
> > 0000000000000004 <._Z1fv+0x4> std     r0,16(r1)
> > 0000000000000008 <._Z1fv+0x8> std     r31,-8(r1)
> > 000000000000000c <._Z1fv+0xc> stdu    r1,-128(r1)
> > 0000000000000010 <._Z1fv+0x10> mr      r31,r1
> > 0000000000000014 <._Z1fv+0x14> addi    r9,r31,128
> > 0000000000000018 <._Z1fv+0x18> mr      r3,r9
> > 000000000000001c <._Z1fv+0x1c> bl      000000000000001c <._Z1fv+0x1c>
> > 0000000000000020 <._Z1fv+0x20> nop
> > 0000000000000024 <._Z1fv+0x24> addi    r1,r31,128
> > 0000000000000028 <._Z1fv+0x28> ld      r0,16(r1)
> > 000000000000002c <._Z1fv+0x2c> mtlr    r0
> > 0000000000000030 <._Z1fv+0x30> ld      r31,-8(r1)
> > 0000000000000034 <._Z1fv+0x34> blr
> > 0000000000000038 <._Z1fv+0x38> .long 0x0
> > 000000000000003c <._Z1fv+0x3c> .long 0x90001
> > 0000000000000040 <._Z1fv+0x40> lwz     r0,1(r1)
> 
> r3 does point to a boundary with f's caller's stack frame.
> 
> For TARGET_ARCH=powerpc, clang 3.8.0 first:
> 
> > # clang++ -c -g -std=c++11 -Wall -pedantic builtin_dwarf_cfa.cpp
> > # /usr/local/bin/objdump -d --prefix-addresses builtin_dwarf_cfa.o
> > 
> > builtin_dwarf_cfa.o:     file format elf32-powerpc-freebsd
> > 
> > 
> > Disassembly of section .text:
> > 00000000 <_Z1fv> mflr    r0
> > 00000004 <_Z1fv+0x4> stw     r31,-4(r1)
> > 00000008 <_Z1fv+0x8> stw     r0,4(r1)
> > 0000000c <_Z1fv+0xc> stwu    r1,-16(r1)
> > 00000010 <_Z1fv+0x10> mr      r31,r1
> > 00000014 <_Z1fv+0x14> mr      r3,r31
> > 00000018 <_Z1fv+0x18> bl      00000018 <_Z1fv+0x18>
> > 0000001c <_Z1fv+0x1c> addi    r1,r1,16
> > 00000020 <_Z1fv+0x20> lwz     r0,4(r1)
> > 00000024 <_Z1fv+0x24> lwz     r31,-4(r1)
> > 00000028 <_Z1fv+0x28> mtlr    r0
> > 0000002c <_Z1fv+0x2c> blr
> 
> Then g++5 (5.3):
> 
> > # g++5 -c -g -std=c++11 -Wall -pedantic builtin_dwarf_cfa.cpp
> > # /usr/local/bin/objdump -d --prefix-addresses builtin_dwarf_cfa.o
> > 
> > builtin_dwarf_cfa.o:     file format elf32-powerpc-freebsd
> > 
> > 
> > Disassembly of section .text:
> > 00000000 <_Z1fv> stwu    r1,-16(r1)
> > 00000004 <_Z1fv+0x4> mflr    r0
> > 00000008 <_Z1fv+0x8> stw     r0,20(r1)
> > 0000000c <_Z1fv+0xc> stw     r31,12(r1)
> > 00000010 <_Z1fv+0x10> mr      r31,r1
> > 00000014 <_Z1fv+0x14> addi    r9,r31,16
> > 00000018 <_Z1fv+0x18> mr      r3,r9
> > 0000001c <_Z1fv+0x1c> bl      0000001c <_Z1fv+0x1c>
> > 00000020 <_Z1fv+0x20> nop
> > 00000024 <_Z1fv+0x24> addi    r11,r31,16
> > 00000028 <_Z1fv+0x28> lwz     r0,4(r11)
> > 0000002c <_Z1fv+0x2c> mtlr    r0
> > 00000030 <_Z1fv+0x30> lwz     r31,-4(r11)
> > 00000034 <_Z1fv+0x34> mr      r1,r11
> > 00000038 <_Z1fv+0x38> blr
> 
> 
> The historical note below is from before I'd discovered that powerpc64 and armv6 have the same sort of issue. But it gives an example use that is broken for powerpc and powerpc64. (I do not know if armv6 uses the same infrastructure.)
> 
> ===
> Mark Millard
> markmi at dsl-only.net
> 
> On 2016-Feb-27, at 3:31 PM, Mark Millard <markmi at dsl-only.net> wrote:
> > 
> > [Top post identifying the low-level problem that directly breaks C++ exception handling for TARGET_ARCH=powerpc for clang 3.8.0 generated code.]
> > 
> > I've tracked down the c++ exception problem for TARGET_ARCH=powerpc via clang 3.8.0: misbehavior of clang 3.8.0 code generation for __builtin_dwarf_cfa () as used in:
> > 
> > #define uw_init_context(CONTEXT)                                           \
> >  do                                                                       \
> >    {                                                                      \
> >      /* Do any necessary initialization to access arbitrary stack frames. \
> >         On the SPARC, this means flushing the register windows.  */       \
> >      __builtin_unwind_init ();                                            \
> >      uw_init_context_1 (CONTEXT, __builtin_dwarf_cfa (),                  \
> >                         __builtin_return_address (0));                    \
> >    }                                                                      \
> >  while (0)
> > . . .
> > _Unwind_Reason_Code
> > _Unwind_RaiseException(struct _Unwind_Exception *exc)
> > {
> >   struct _Unwind_Context this_context, cur_context;
> >   _Unwind_Reason_Code code;
> > 
> >   /* Set up this_context to describe the current stack frame.  */
> >   uw_init_context (&this_context);
> > 
> > In the below r4 ends up with the __builtin_dwarf_cfa () value supplied to uw_init_context_1:
> > 
> > Dump of assembler code for function _Unwind_RaiseException:
> >   0x419a8fd8 <+0>:	mflr    r0
> >   0x419a8fdc <+4>:	stw     r31,-148(r1)
> >   0x419a8fe0 <+8>:	stw     r30,-152(r1)
> >   0x419a8fe4 <+12>:	stw     r0,4(r1)
> >   0x419a8fe8 <+16>:	stwu    r1,-2992(r1)
> >   0x419a8fec <+20>:	mr      r31,r1
> > . . .
> >   0x419a9094 <+188>:	mr      r4,r31
> >   0x419a9098 <+192>:	mflr    r30
> >   0x419a909c <+196>:	lwz     r5,2996(r31)
> >   0x419a90a0 <+200>:	mr      r3,r28
> >   0x419a90a4 <+204>:	bl      0x419a929c <uw_init_context_1>
> > 
> > That r4 ends up holding the stack pointer value for after it has been decremented. r4 is not pointing at the boundary with the caller's frame.
> > 
> > The .eh_frame information and unwind code is set up for pointing at the boundary with the caller's frame. So the cfa relative addressing is messed up for what it actually extracts.
> > 
> > Contrast this with gcc/g++ 5.3's TARGET_ARCH=powerpc64 code where r4 is made to be at the boundary with the caller's frame:
> > 
> > Dump of assembler code for function _Unwind_RaiseException:
> >   0x00000000501cb810 <+0>:	mflr    r0
> >   0x00000000501cb814 <+4>:	stdu    r1,-5648(r1)
> > . . .
> >   0x00000000501cb8d0 <+192>:	addi    r4,r1,5648
> >   0x00000000501cb8d4 <+196>:	stw     r12,5656(r1)
> >   0x00000000501cb8d8 <+200>:	mr      r28,r3
> >   0x00000000501cb8dc <+204>:	addi    r31,r1,2544
> >   0x00000000501cb8e0 <+208>:	mr      r3,r27
> >   0x00000000501cb8e4 <+212>:	addi    r29,r1,112
> >   0x00000000501cb8e8 <+216>:	bl      0x501cae60 <uw_init_context_1>
> > 
> > 
> > NOTE: The powerpc (32-bit) issue may in some way be associated with the clang 3.8.0 powerpc ABI violation in how it handles the stack pointer for FreeBSD: TARGET_ARCH=powerpc is currently using a "red zone", decrementing the stack pointer late, and incrementing the stack pointer early compared to the FreeBSD ABI rules. (This is similar to the official FreeBSD ABI for TARGET_ARCH=powerpc64.)
> > 
> > 
> > 
> > 
> > ===
> > Mark Millard
> > markmi at dsl-only.net



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20160228083934.GA60222>