Date: Sun, 28 Feb 2016 09:39:34 +0100
From: Roman Divacky
To: Mark Millard
Cc: freebsd-arm, FreeBSD PowerPC ML, FreeBSD Toolchain, Dimitry Andric
Subject: Re: clang 3.8.0 can mess up __builtin_dwarf_cfa (), at least for TARGET_ARCH=armv6, powerpc, powerpc64: a bug 207325 update
Message-ID: <20160228083934.GA60222@vlakno.cz>
In-Reply-To: <366B67F9-6A14-4906-8545-1B57A3FF53B8@dsl-only.net>
List-Id: Maintenance of FreeBSD's integrated toolchain

Mark,

__builtin_dwarf_cfa() is
lowered in clang to the LLVM intrinsic eh_dwarf_cfa. There's a depth argument (which defaults to 0, said to be correct for most targets). The intrinsic then gets lowered in SelectionDAG using PPCTargetLowering::LowerFRAMEADDR().

Can you check that

1) the depth should be 0 for ppc64/ppc32
2) LowerFRAMEADDR() does something sensible?

There's a loop that runs depth times, so I wonder if that makes a difference.

Thanks, Roman

On Sat, Feb 27, 2016 at 05:55:02PM -0800, Mark Millard wrote:
> I discovered on powerpc that __builtin_dwarf_cfa() for clang 3.8.0 and g++ do not agree. For powerpc this breaks C++ exception handling (via the use in libgcc_s's unwind handling), resulting in uncaught exceptions and SEGVs. objdump -d for the two-line source file below shows the low-level differences.
>
> > extern void g(void*);
> > void f() { g(__builtin_dwarf_cfa()); }
>
> I've also shown the same issue for powerpc64.
>
> The issue is where g's argument value points relative to f's frame and f's caller's frame (since __builtin_dwarf_cfa() is called by f, not g).
>
> And now for armv6 . . .
>
> > # clang++ -c -g -std=c++11 -Wall -pedantic builtin_dwarf_cfa.cpp
> > # /usr/local/bin/objdump -d --prefix-addresses builtin_dwarf_cfa.o
> >
> > builtin_dwarf_cfa.o:     file format elf32-littlearm
> >
> >
> > Disassembly of section .text:
> > 00000000 <_Z1fv>      push {fp, lr}
> > 00000004 <_Z1fv+0x4>  mov  fp, sp
> > 00000008 <_Z1fv+0x8>  mov  r0, fp
> > 0000000c <_Z1fv+0xc>  bl   00000000 <_Z1gPv>
> > 00000010 <_Z1fv+0x10> pop  {fp, pc}
>
> vs.
>
> > # g++5 -c -g -std=c++11 -Wall -pedantic builtin_dwarf_cfa.cpp
> > # /usr/local/bin/objdump -d --prefix-addresses builtin_dwarf_cfa.o
> >
> > builtin_dwarf_cfa.o:     file format elf32-littlearm
> >
> >
> > Disassembly of section .text:
> > 00000000 <_Z1fv>      push {fp, lr}
> > 00000004 <_Z1fv+0x4>  add  fp, sp, #4, 0
> > 00000008 <_Z1fv+0x8>  add  r3, fp, #4, 0
> > 0000000c <_Z1fv+0xc>  mov  r0, r3
> > 00000010 <_Z1fv+0x10> bl   00000000 <_Z1gPv>
> > 00000014 <_Z1fv+0x14> nop  ; (mov r0, r0)
> > 00000018 <_Z1fv+0x18> pop  {fp, pc}
>
>
> They do not agree.
>
> So any infrastructure based on __builtin_dwarf_cfa() use will be compiler sensitive for armv6 as well.
>
> [It is my understanding that what g++ does is what the normal sort of .eh_frame infrastructure is designed for: pointing at the boundary between the caller's and callee's frames.]
>
>
> For reference: powerpc64 and powerpc results. . .
>
> > # clang++ -c -g -std=c++11 -Wall -pedantic builtin_dwarf_cfa.cpp
> > # /usr/local/bin/objdump -d --prefix-addresses builtin_dwarf_cfa.o
> >
> > builtin_dwarf_cfa.o:     file format elf64-powerpc-freebsd
> >
> >
> > Disassembly of section .text:
> > 0000000000000000 <._Z1fv>      mflr r0
> > 0000000000000004 <._Z1fv+0x4>  std  r31,-8(r1)
> > 0000000000000008 <._Z1fv+0x8>  std  r0,16(r1)
> > 000000000000000c <._Z1fv+0xc>  stdu r1,-128(r1)
> > 0000000000000010 <._Z1fv+0x10> mr   r31,r1
> > 0000000000000014 <._Z1fv+0x14> mr   r3,r31
> > 0000000000000018 <._Z1fv+0x18> bl   0000000000000018 <._Z1fv+0x18>
> > 000000000000001c <._Z1fv+0x1c> nop
> > 0000000000000020 <._Z1fv+0x20> addi r1,r1,128
> > 0000000000000024 <._Z1fv+0x24> ld   r0,16(r1)
> > 0000000000000028 <._Z1fv+0x28> ld   r31,-8(r1)
> > 000000000000002c <._Z1fv+0x2c> mtlr r0
> > 0000000000000030 <._Z1fv+0x30> blr
> > ...
>
> r3 does not point to a boundary with f's caller's stack frame.
>
> By contrast for g++49:
>
> > # g++49 -c -g -std=c++11 -Wall -pedantic builtin_dwarf_cfa.cpp
> > # /usr/local/bin/objdump -d --prefix-addresses builtin_dwarf_cfa.o | more
> >
> > builtin_dwarf_cfa.o:     file format elf64-powerpc-freebsd
> >
> >
> > Disassembly of section .text:
> > 0000000000000000 <._Z1fv>      mflr r0
> > 0000000000000004 <._Z1fv+0x4>  std  r0,16(r1)
> > 0000000000000008 <._Z1fv+0x8>  std  r31,-8(r1)
> > 000000000000000c <._Z1fv+0xc>  stdu r1,-128(r1)
> > 0000000000000010 <._Z1fv+0x10> mr   r31,r1
> > 0000000000000014 <._Z1fv+0x14> addi r9,r31,128
> > 0000000000000018 <._Z1fv+0x18> mr   r3,r9
> > 000000000000001c <._Z1fv+0x1c> bl   000000000000001c <._Z1fv+0x1c>
> > 0000000000000020 <._Z1fv+0x20> nop
> > 0000000000000024 <._Z1fv+0x24> addi r1,r31,128
> > 0000000000000028 <._Z1fv+0x28> ld   r0,16(r1)
> > 000000000000002c <._Z1fv+0x2c> mtlr r0
> > 0000000000000030 <._Z1fv+0x30> ld   r31,-8(r1)
> > 0000000000000034 <._Z1fv+0x34> blr
> > 0000000000000038 <._Z1fv+0x38> .long 0x0
> > 000000000000003c <._Z1fv+0x3c> .long 0x90001
> > 0000000000000040 <._Z1fv+0x40> lwz  r0,1(r1)
>
> r3 does point to a boundary with f's caller's stack frame.
>
> For TARGET_ARCH=powerpc, clang 3.8.0 first:
>
> > # clang++ -c -g -std=c++11 -Wall -pedantic builtin_dwarf_cfa.cpp
> > # /usr/local/bin/objdump -d --prefix-addresses builtin_dwarf_cfa.o
> >
> > builtin_dwarf_cfa.o:     file format elf32-powerpc-freebsd
> >
> >
> > Disassembly of section .text:
> > 00000000 <_Z1fv>      mflr r0
> > 00000004 <_Z1fv+0x4>  stw  r31,-4(r1)
> > 00000008 <_Z1fv+0x8>  stw  r0,4(r1)
> > 0000000c <_Z1fv+0xc>  stwu r1,-16(r1)
> > 00000010 <_Z1fv+0x10> mr   r31,r1
> > 00000014 <_Z1fv+0x14> mr   r3,r31
> > 00000018 <_Z1fv+0x18> bl   00000018 <_Z1fv+0x18>
> > 0000001c <_Z1fv+0x1c> addi r1,r1,16
> > 00000020 <_Z1fv+0x20> lwz  r0,4(r1)
> > 00000024 <_Z1fv+0x24> lwz  r31,-4(r1)
> > 00000028 <_Z1fv+0x28> mtlr r0
> > 0000002c <_Z1fv+0x2c> blr
>
> Then g++5 (5.3):
>
> > # g++5 -c -g -std=c++11 -Wall -pedantic builtin_dwarf_cfa.cpp
> > # /usr/local/bin/objdump -d --prefix-addresses builtin_dwarf_cfa.o
> >
> > builtin_dwarf_cfa.o:     file format elf32-powerpc-freebsd
> >
> >
> > Disassembly of section .text:
> > 00000000 <_Z1fv>      stwu r1,-16(r1)
> > 00000004 <_Z1fv+0x4>  mflr r0
> > 00000008 <_Z1fv+0x8>  stw  r0,20(r1)
> > 0000000c <_Z1fv+0xc>  stw  r31,12(r1)
> > 00000010 <_Z1fv+0x10> mr   r31,r1
> > 00000014 <_Z1fv+0x14> addi r9,r31,16
> > 00000018 <_Z1fv+0x18> mr   r3,r9
> > 0000001c <_Z1fv+0x1c> bl   0000001c <_Z1fv+0x1c>
> > 00000020 <_Z1fv+0x20> nop
> > 00000024 <_Z1fv+0x24> addi r11,r31,16
> > 00000028 <_Z1fv+0x28> lwz  r0,4(r11)
> > 0000002c <_Z1fv+0x2c> mtlr r0
> > 00000030 <_Z1fv+0x30> lwz  r31,-4(r11)
> > 00000034 <_Z1fv+0x34> mr   r1,r11
> > 00000038 <_Z1fv+0x38> blr
>
>
> The historical note below is from before I'd discovered that powerpc64 and armv6 have the same sort of issue. But it gives an example use that is broken for powerpc and powerpc64. (I do not know if armv6 uses the same infrastructure.)
>
> ===
> Mark Millard
> markmi at dsl-only.net
>
> On 2016-Feb-27, at 3:31 PM, Mark Millard wrote:
> >
> > [Top post for dinging the low level problem that directly breaks C++ exception handling for TARGET_ARCH=powerpc for clang 3.8.0 generated code.]
> >
> > I've tracked down the C++ exception problem for TARGET_ARCH=powerpc via clang 3.8.0: misbehavior of clang 3.8.0 code generation for __builtin_dwarf_cfa () as used in:
> >
> > #define uw_init_context(CONTEXT) \
> >   do \
> >     { \
> >       /* Do any necessary initialization to access arbitrary stack frames. \
> >          On the SPARC, this means flushing the register windows. */ \
> >       __builtin_unwind_init (); \
> >       uw_init_context_1 (CONTEXT, __builtin_dwarf_cfa (), \
> >                          __builtin_return_address (0)); \
> >     } \
> >   while (0)
> > . . .
> > _Unwind_Reason_Code
> > _Unwind_RaiseException(struct _Unwind_Exception *exc)
> > {
> >   struct _Unwind_Context this_context, cur_context;
> >   _Unwind_Reason_Code code;
> >
> >   /* Set up this_context to describe the current stack frame. */
> >   uw_init_context (&this_context);
> >
> > In the code below, r4 ends up with the __builtin_dwarf_cfa () value supplied to uw_init_context_1:
> >
> > Dump of assembler code for function _Unwind_RaiseException:
> > 0x419a8fd8 <+0>:   mflr r0
> > 0x419a8fdc <+4>:   stw  r31,-148(r1)
> > 0x419a8fe0 <+8>:   stw  r30,-152(r1)
> > 0x419a8fe4 <+12>:  stw  r0,4(r1)
> > 0x419a8fe8 <+16>:  stwu r1,-2992(r1)
> > 0x419a8fec <+20>:  mr   r31,r1
> > . . .
> > 0x419a9094 <+188>: mr   r4,r31
> > 0x419a9098 <+192>: mflr r30
> > 0x419a909c <+196>: lwz  r5,2996(r31)
> > 0x419a90a0 <+200>: mr   r3,r28
> > 0x419a90a4 <+204>: bl   0x419a929c
> >
> > That r4 ends up holding the stack pointer value from after it has been decremented. r4 is not pointing at the boundary with the caller's frame.
> >
> > The .eh_frame information and unwind code are set up for pointing at the boundary with the caller's frame. So the CFA-relative addressing is messed up for what it actually extracts.
> >
> > Contrast this with gcc/g++ 5.3's TARGET_ARCH=powerpc64 code where r4 is made to be at the boundary with the caller's frame:
> >
> > Dump of assembler code for function _Unwind_RaiseException:
> > 0x00000000501cb810 <+0>:   mflr r0
> > 0x00000000501cb814 <+4>:   stdu r1,-5648(r1)
> > . . .
> > 0x00000000501cb8d0 <+192>: addi r4,r1,5648
> > 0x00000000501cb8d4 <+196>: stw  r12,5656(r1)
> > 0x00000000501cb8d8 <+200>: mr   r28,r3
> > 0x00000000501cb8dc <+204>: addi r31,r1,2544
> > 0x00000000501cb8e0 <+208>: mr   r3,r27
> > 0x00000000501cb8e4 <+212>: addi r29,r1,112
> > 0x00000000501cb8e8 <+216>: bl   0x501cae60
> >
> >
> > NOTE: The powerpc (32-bit) issue may in some way be associated with the clang 3.8.0 powerpc ABI violation in how it handles the stack pointer for FreeBSD: TARGET_ARCH=powerpc is currently using a "red zone", decrementing the stack pointer late, and incrementing the stack pointer early compared to the FreeBSD ABI rules. (This is similar to the official FreeBSD ABI for TARGET_ARCH=powerpc64.)
> >
> >
> > ===
> > Mark Millard
> > markmi at dsl-only.net
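
The disassembly comparisons in this thread can be complemented by a coarse runtime check. The sketch below is not from the thread: the names CfaProbe, probe_cfa, cfa_is_above_locals and check_cfa are hypothetical, and it assumes a downward-growing stack and a compiler that accepts __builtin_dwarf_cfa() (g++ and clang++ both do). It tests only a necessary condition: a CFA that sits at the boundary with the caller's frame must lie above every local of the function that asked for it. A value equal to the already-decremented stack pointer, as in the powerpc listings, would be expected to fail this check; a value that is only a few bytes too low, as in the armv6 listing, would likely still pass, so the objdump comparison remains the authoritative test.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type (not from the thread): the two addresses
// sampled inside probe_cfa().
struct CfaProbe {
    std::uintptr_t cfa;    // value returned by __builtin_dwarf_cfa()
    std::uintptr_t local;  // address of a local variable in the same frame
};

// noinline so that probe_cfa() keeps its own stack frame.
__attribute__((noinline)) CfaProbe probe_cfa() {
    volatile char local = 0;  // volatile: forces the local into memory
    CfaProbe p;
    p.cfa = reinterpret_cast<std::uintptr_t>(__builtin_dwarf_cfa());
    p.local = reinterpret_cast<std::uintptr_t>(&local);
    return p;
}

// Assumption: the stack grows toward lower addresses, so a correct CFA
// (boundary with the caller's frame) is above the callee's locals.
bool cfa_is_above_locals() {
    const CfaProbe p = probe_cfa();
    return p.cfa > p.local;
}

// Aborts (unless NDEBUG is defined) when the necessary condition fails.
void check_cfa() {
    assert(cfa_is_above_locals());
}
```

Calling check_cfa() from a small main() built with each compiler under test gives a quick pass/fail signal; a pass does not prove the value is exactly the CFA that the .eh_frame data expects.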