From: Mark Millard
Subject: /lib/libgcc_s.so.1 mishandles eh_frame information that
 /usr/local/lib/gcc8/libgcc_s.so.1 handles (powerpc64 test context):
 a simple example program
Message-Id: <4D444DB3-A472-42BC-973E-3E468C07757B@yahoo.com>
Date: Wed, 17 Oct 2018 02:59:51 -0700
Cc: Justin Hibbits
To: FreeBSD Toolchain, FreeBSD PowerPC ML
List-Id: Maintenance of FreeBSD's integrated toolchain

(I happen to be using head -r339076 and ports -r480180 vintage
materials, not that I expect such narrow vintage ties.)

I finally have a simple example of the issue on powerpc64 . . .

The following simple C++ program shows a significant difference
for powerpc64 depending on which libgcc_s.so is used (system's vs.
gcc8's):

# more exception_test1.cpp
#include <exception>

// -O2 context used.

volatile unsigned int v = 1;

extern int f()
{
    volatile unsigned char c = 'a';
    v++; // Despite "volatile" the access to v in g
         // was otherwise optimized out and the
         // std::exception was not followed by
         // code for f(). So force g's use.
    return c;
}

extern void g()
{
    if (v) throw std::exception();
    f(); // ends up inlined but the problem is demonstrated.
}

int main(void)
{
    try {g();} // Used a separate function to avoid any potential
               // special handling of code in main. Call not
               // optimized out.
    catch (std::exception& e) {}
    return 0;
}

(gcc8 just happens to be the lang/gcc* that I have installed.
Similar points likely apply to gcc[?-8]. The same problem can
be demonstrated by devel/powerpc64-gcc use, which ends up using
/lib/libgcc_s.so.1 as well --but does not provide the contrasting
"it works" case.)

The only reason for the try/catch is to keep the "it works" case
from doing:

# ./a.out
terminate called after throwing an instance of 'std::exception'
  what():  std::exception
Abort trap (core dumped)

Just calling g() is enough to have the problem with
/lib/libgcc_s.so.1 .

The program works fine when built via:

# g++8 -Wl,-rpath=/usr/local/lib/gcc8 -g -O2 exception_test1.cpp
# ldd a.out
a.out:
    libstdc++.so.6 => /usr/local/lib/gcc8/libstdc++.so.6 (0x81006e000)
    libm.so.5 => /lib/libm.so.5 (0x8102c7000)
    libgcc_s.so.1 => /usr/local/lib/gcc8/libgcc_s.so.1 (0x810307000)
    libc.so.7 => /lib/libc.so.7 (0x810330000)

But it fails, stuck looping in _Unwind_RaiseException, when built via:

# g++8 -g -O2 exception_test1.cpp
# ldd a.out
a.out:
    libstdc++.so.6 => /usr/local/lib/gcc8/libstdc++.so.6 (0x81006e000)
    libm.so.5 => /lib/libm.so.5 (0x8102c7000)
    libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x810307000)
    libc.so.7 => /lib/libc.so.7 (0x81032d000)

The only difference (other than detailed addresses) is:

    libgcc_s.so.1 => /usr/local/lib/gcc8/libgcc_s.so.1 (0x810307000)
vs.
    libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x810307000)

The dwarfdump -v -v -F reports match exactly for the two builds
of the program, as does the code for the function g where the
problem is observed. What is different is that /lib/libgcc_s.so.1
misinterprets the .eh_frame information (disagreeing with the
dwarfdump report and with /usr/local/lib/gcc8/libgcc_s.so.1
behavior).
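As a rough illustration of the kind of misinterpretation involved, here
is a minimal sketch (not the libgcc_s code) that runs the CFA-offset
part of the instruction sequence dwarfdump reports for g()'s FDE. The
names Insn, g_fde_program and cfa_offset_at are hypothetical, register
rules are ignored, and the restore_cfa flag is only an assumed model of
an unwinder whose DW_CFA_restore_state does not put the CFA offset
back; the actual cause inside /lib/libgcc_s.so.1 is not established
here.

```cpp
#include <cassert>
#include <vector>

// Simplified, hypothetical model: only the CFA offset (CFA = r1 + offset)
// is tracked. Register rules (e.g. for r65) are treated as no-ops.
enum class Op { AdvanceLoc, DefCfaOffset, RememberState, RestoreState, Other };

struct Insn {
    Op   op;
    long arg;   // byte delta for AdvanceLoc, offset for DefCfaOffset
};

// The instruction sequence dwarfdump reports for g()'s FDE
// (0x10000de0:0x10000e5c), with advance_loc values already in bytes.
std::vector<Insn> g_fde_program()
{
    return {
        {Op::AdvanceLoc, 8},  {Op::DefCfaOffset, 128},
        {Op::AdvanceLoc, 44}, {Op::RememberState, 0},
        {Op::DefCfaOffset, 0},
        {Op::AdvanceLoc, 4},  {Op::RestoreState, 0},
        {Op::AdvanceLoc, 4},  {Op::Other, 0},  // DW_CFA_register r65 = r0
        {Op::AdvanceLoc, 8},  {Op::Other, 0},  // DW_CFA_offset_extended_sf r65
    };
}

// Returns the CFA offset in effect at byte offset pc_off into the function.
// restore_cfa == false is the ASSUMED faulty behavior: restore_state pops
// the saved state but leaves the current CFA offset untouched.
long cfa_offset_at(const std::vector<Insn>& prog, long pc_off,
                   bool restore_cfa)
{
    long cfa_offset = 0;        // from the CIE: DW_CFA_def_cfa r1 0
    std::vector<long> saved;    // remember_state stack
    long loc = 0;

    for (const Insn& insn : prog) {
        switch (insn.op) {
        case Op::AdvanceLoc:
            if (loc + insn.arg > pc_off)
                return cfa_offset;   // pc_off lies in the current row
            loc += insn.arg;
            break;
        case Op::DefCfaOffset:
            cfa_offset = insn.arg;
            break;
        case Op::RememberState:
            saved.push_back(cfa_offset);
            break;
        case Op::RestoreState:
            assert(!saved.empty());
            if (restore_cfa)
                cfa_offset = saved.back();
            saved.pop_back();
            break;
        case Op::Other:
            break;
        }
    }
    return cfa_offset;
}
```

Under these assumptions, cfa_offset_at(g_fde_program(), 108, true)
yields 128 for the return address after the __cxa_throw call at g+104,
while passing false yields 0, so the computed CFA would equal r1 itself
rather than r1+128.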

# dwarfdump -v -v -F a.out | more

.eh_frame

fde:
< 0><0x100007a0:0x10000840><>
    0x100007a0:
    fde section offset 20 0x00000014 cie offset for fde: 24 0x00000018
     0 DW_CFA_nop
     1 DW_CFA_nop
     2 DW_CFA_nop
< 1><0x10000840:0x10000894>
    0x10000840:
    0x1000084c:
    0x10000854:
    0x10000860:
    0x10000864:
    fde section offset 152 0x00000098 cie offset for fde: 36 0x00000024
     0 DW_CFA_advance_loc 12 (3 * 4)
     1 DW_CFA_def_cfa_offset 112
     3 DW_CFA_offset_extended_sf r65 16 (-2 * -8)
     6 DW_CFA_advance_loc 8 (2 * 4)
     7 DW_CFA_remember_state
     8 DW_CFA_def_cfa_offset 0
    10 DW_CFA_advance_loc 12 (3 * 4)
    11 DW_CFA_restore_extended r65
    13 DW_CFA_advance_loc 4 (1 * 4)
    14 DW_CFA_restore_state
< 2><0x10000db0:0x10000ddc>
    0x10000db0:
    fde section offset 64 0x00000040 cie offset for fde: 68 0x00000044
     0 DW_CFA_nop
     1 DW_CFA_nop
     2 DW_CFA_nop
< 3><0x10000de0:0x10000e5c>
    0x10000de0:
    0x10000de8:
    0x10000e14:
    0x10000e18:
    0x10000e1c:
    0x10000e24:
    fde section offset 84 0x00000054 cie offset for fde: 88 0x00000058
     0 DW_CFA_advance_loc 8 (2 * 4)
     1 DW_CFA_def_cfa_offset 128
     4 DW_CFA_advance_loc 44 (11 * 4)
     5 DW_CFA_remember_state
     6 DW_CFA_def_cfa_offset 0
     8 DW_CFA_advance_loc 4 (1 * 4)
     9 DW_CFA_restore_state
    10 DW_CFA_advance_loc 4 (1 * 4)
    11 DW_CFA_register r65 = r0
    14 DW_CFA_advance_loc 8 (2 * 4)
    15 DW_CFA_offset_extended_sf r65 16 (-2 * -8)
    18 DW_CFA_nop
< 4><0x10000ee0:0x10000f34><>
    0x10000ee0:
    0x10000ee4:
    0x10000ef8:
    fde section offset 40 0x00000028 cie offset for fde: 44 0x0000002c
     0 DW_CFA_advance_loc 4 (1 * 4)
     1 DW_CFA_register r65 = r12
     4 DW_CFA_advance_loc 20 (5 * 4)
     5 DW_CFA_restore_extended r65

cie:
< 0> version 1
    cie section offset 0 0x00000000
    augmentation zR
    code_alignment_factor 4
    data_alignment_factor -8
    return_address_register 65
    eh aug data len 0x1 bytes 0x1b
    bytes of initial instructions 3
    cie length 16
    initial instructions
     0 DW_CFA_def_cfa r1 0
< 1> version 1
    cie section offset 120 0x00000078
    augmentation zPLR
    code_alignment_factor 4
    data_alignment_factor -8
    return_address_register 65
    eh aug data len 0xb bytes 0x94 00 00 00 00 00 01 04 c9 14 1b
    bytes of initial instructions 3
    cie length 28
    initial instructions
     0 DW_CFA_def_cfa r1 0

In:

< 3><0x10000de0:0x10000e5c>
    0x10000de0:
    0x10000de8:
    0x10000e14:
    0x10000e18:
    0x10000e1c:
    0x10000e24:

The last 3 128's are from the DW_CFA_restore_state in the
sequence:

     1 DW_CFA_def_cfa_offset 128
    . . .
     5 DW_CFA_remember_state
    . . .
     9 DW_CFA_restore_state

But with /lib/libgcc_s.so.1 the 128 is not saved and restored,
leaving default 0's in place instead. Use of the wrong stack
addresses results, which in turn prevents the stack from
unwinding past g()'s frame.

[Note: For FreeBSD on powerpc64 r1 is the stack-pointer.]

The code described by the:

< 3><0x10000de0:0x10000e5c>
. . .

is as follows. Note the stdu r1,-128(r1) and the addi r1,r1,128 ,
which code is only reached via bne cr7,0x10000e18 , and that such
code has the stdu r1,-128(r1) as its prior context, not the
addi r1,r1,128:

(gdb) disass g
Dump of assembler code for function g():
   0x0000000010000de0 <+0>:     nop
   0x0000000010000de4 <+4>:     stdu    r1,-128(r1)
   0x0000000010000de8 <+8>:     lwz     r9,-32536(r2)
   0x0000000010000dec <+12>:    cmpdi   cr7,r9,0
   0x0000000010000df0 <+16>:    bne     cr7,0x10000e18
   0x0000000010000df4 <+20>:    li      r9,97
   0x0000000010000df8 <+24>:    nop
   0x0000000010000dfc <+28>:    stb     r9,112(r1)
   0x0000000010000e00 <+32>:    lwz     r9,-32536(r2)
   0x0000000010000e04 <+36>:    addi    r9,r9,1
   0x0000000010000e08 <+40>:    stw     r9,-32536(r2)
   0x0000000010000e0c <+44>:    lbz     r9,112(r1)
   0x0000000010000e10 <+48>:    addi    r1,r1,128
   0x0000000010000e14 <+52>:    blr
   0x0000000010000e18 <+56>:    mflr    r0
   0x0000000010000e1c <+60>:    li      r3,8
   0x0000000010000e20 <+64>:    std     r0,144(r1)
   0x0000000010000e24 <+68>:    bl      0x100007a0 <0000004b.plt_call.__cxa_allocate_exception@@CXXABI_1.3>
   0x0000000010000e28 <+72>:    ld      r2,40(r1)
   0x0000000010000e2c <+76>:    nop
   0x0000000010000e30 <+80>:    nop
   0x0000000010000e34 <+84>:    ld      r9,-32720(r2)
   0x0000000010000e38 <+88>:    ld      r5,-32712(r2)
   0x0000000010000e3c <+92>:    nop
   0x0000000010000e40 <+96>:    ld      r4,-32704(r2)
   0x0000000010000e44 <+100>:   std     r9,0(r3)
   0x0000000010000e48 <+104>:   bl      0x10000820 <0000004b.plt_call.__cxa_throw@@CXXABI_1.3>
   0x0000000010000e4c <+108>:   ld      r2,40(r1)
   0x0000000010000e50 <+112>:   .long 0x0
   0x0000000010000e54 <+116>:   .long 0x90001
   0x0000000010000e58 <+120>:   lwz     r0,0(0)

[Note: more than the 128's might not be handled right for more
general code, but the example only shows the 128's issue (i.e.,
the cfa_offset mishandling issue).]

I'll note that throw_exception in /lib/libgcc_s.so.1 has the same
sort of machine-code structure as g relative to cfa_offset's and
that, without a workaround to avoid that structure being generated,
all thrown C++ exceptions fail by _Unwind_RaiseException being stuck
in a loop for powerpc64. In order to test the simple program I
used the workaround:

# svnlite diff /usr/src/contrib/libcxxrt/
Index: /usr/src/contrib/libcxxrt/exception.cc
===================================================================
--- /usr/src/contrib/libcxxrt/exception.cc (revision 339076)
+++ /usr/src/contrib/libcxxrt/exception.cc (working copy)
@@ -772,10 +772,71 @@
     info->globals.uncaughtExceptions++;
     _Unwind_Reason_Code err = _Unwind_RaiseException(&ex->unwindHeader);
+#if !defined(__powerpc64__) && !defined(__ppc64__)
     // The _Unwind_RaiseException() function should not return, it should
     // unwind the stack past this function. If it does return, then something
     // has gone wrong.
     report_failure(err, ex);
+#else
+// NOTE: Only tested for devel/powerpc64-gcc based buildworld
+// because clang still silently ignores
+// __builtin_eh_return(offset,handler) for powerpc64
+// (and powerpc), thus not generating correct output.
+//
+// NOTE: I've no clue if other architectures might have
+// analogous issues to powerpc64. I'm not sure
+// about powerpc because of it still being stuck
+// at gcc 4.2.1 . (clang problems and no devel/powerpc-gcc .)
+//
+// The above/normal code produced the following sort of structure
+// for throw_exception. r1 is the stack pointer, note its adjustments
+// via stdu r1,-128(r1) and via addi r1,r1,128 .
+//
+// : mflr r0
+// : std r31,-8(r1)
+// : mr r31,r3
+// : std r0,16(r1)
+// : stdu r1,-128(r1)
+// . . .
+// : bl <00000018.plt_call._Unwind_RaiseException@@GCC_3.0>
+// : ld r2,40(r1)
+// : addi r1,r1,128
+// : mr r4,r31
+// : ld r0,16(r1)
+// : ld r31,-8(r1)
+// : mtlr r0
+// : b
+//
+// The loop in _Unwind_RaiseException had its "fs"
+// used with uw_frame_state_for and uw_update_context get
+// stuck with the pc field having the address for
+// throw_exception+152 (just after the stack adjustment
+// addi r1,r1,128). Effectively, throw_exception unwinds
+// its stack use before calling report_failure in a
+// way that throw_exception is no longer on the stack.
+// The exception unwinding logic did not handle this
+// correctly and got stuck looping.
+//
+// The below avoids having any such stack adjustment here
+// by avoiding the report_failure call and directly doing
+// what case _URC_END_OF_STACK in report_failure does for
+// its first couple of lines. (It is also the kind of
+// thing that src/contrib/libstdc++/libsupc++/eh_throw.cc
+// has in its __cxxabiv1::__cxa_throw after the
+// _Unwind_RaiseException call.)
+//
+// Another option could be to turn report_failure into
+// a macro so that no subroutine call could be involved.
+// That should avoid the early stack pointer adjustment.
+//
+// Also: For the other architectures that I looked at, no
+// such stack adjustments were involved in the code
+// generated (or the matching dwarfdump output).
+// But I did not look at many.
+
+    __cxa_begin_catch (&(ex->unwindHeader));
+    std::terminate();
+#endif
 }

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away in
early 2018-Mar)