From owner-freebsd-toolchain@freebsd.org Sun Feb 7 05:58:33 2016
Subject: Re: powerpc (32-bit) clang 3.8.0 vs. gcc 4.2.1 routine preamble mismatches: contributions to SEGV's differences
From: Mark Millard
In-Reply-To: <3D08EB58-7FEF-432E-8192-77F988A75621@dsl-only.net>
Date: Sat, 6 Feb 2016 21:58:24 -0800
Cc: Roman Divacky, Justin Hibbits, Konstantin Belousov, Nathan Whitehorn
References: <3D08EB58-7FEF-432E-8192-77F988A75621@dsl-only.net>
To: FreeBSD PowerPC ML, FreeBSD Toolchain

I've submitted bug 206990 ( https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=206990 ) with a proof-of-concept patch that keeps signal delivery from causing SEGVs in clang 3.8.0 generated code. The patch has passed my personal testing so far: "make -j 6 buildworld" finished normally instead of getting a SEGV within a few minutes on a dual-processor G5 with two cores per processor.

Now a "make -j 3 buildworld" on a dual-processor G4 is in progress, booted from the same SSD. We will see.

The official TARGET_ARCH=powerpc sendsig code could tromp on the frame pointer value that clang 3.8.0 generated code stores at "-4(r1)", during the period in which that saved value is outside the range bounded by the stack pointer (r1) and where the stack started. The change respects a Darwin-like/AIX-like "red-zone"/scratch area on the smaller-address side of the stack. This should still be compatible with gcc 4.2.1 style code, although it "wastes" more bytes temporarily in that context.

===
Mark Millard
markmi at dsl-only.net

On 2016-Feb-5, at 1:59 AM, Mark Millard wrote:
>
> Clang 3.8.0 produced code uses r31 as a frame pointer in contexts where gcc 4.2.1 produced code does not (ever?).
> This leaves clang's produced code being more dependent on r31 handling, such as when resuming after signal delivery.
>
> The following is one of the routines in "make" where a clang 3.8.0 based "make" sometimes gets a SEGV after resuming after a SIGCHLD delivery, the SEGV being from having r31=0x0 in a Frame Pointer (r31) based address calculation that is at some point dereferenced. (See https://lists.freebsd.org/pipermail/freebsd-ppc/2016-February/008002.html .)
>
> But gcc 4.2.1 does not use r31 as a frame pointer in the Str_Match that it produces and so does not see the problem. gcc 4.2.1's produced code simply uses the stack pointer as needed.
>
>
> clang 3.8.0 based Str_Match preamble (from make):
>
> 0x181a4a8: mflr r0
> 0x181a4ac: stw r31,-4(r1)    # Clang's frame pointer (r31)
>                              # saved before stack pointer changed.
> 0x181a4b0: stw r0,4(r1)      # lr saved before stack pointer changed.
> 0x181a4b4: stwu r1,-32(r1)   # Stack pointer finally saved and
>                              # changed.
> 0x181a4b8: mr r31,r1         # r31 is the frame pointer under clang.
> 0x181a4bc: stw r30,24(r31)
>
> gcc 4.2.1 based Str_Match preamble:
>
> 0x1819cb8: mflr r0
> 0x1819cbc: stwu r1,-32(r1)   # Stack pointer saved and changed first.
> 0x1819cc0: stw r31,28(r1)    # r31 saved after stack pointer changed.
> 0x1819cc4: mr r31,r3         # gcc 4.2.1 does not reserve
>                              # r31 for use as a frame pointer.
> 0x1819cc8: stw r30,24(r1)
> 0x1819ccc: stw r0,36(r1)     # lr saved after stack pointer changed.
>
>
> (Str_Match is a self contained routine, although it is recursive.)
>
>
> Looking at some other gcc 4.2.1 preamble examples. . .
>
> 0x1823b58: cmpwi cr7,r6,0
> 0x1823b5c: stwu r1,-64(r1)   # Stack pointer saved and changed "first"
> 0x1823b60: mflr r0
> 0x1823b64: lis r9,396
> 0x1823b68: stw r25,36(r1)
> 0x1823b6c: addi r25,r9,8944
> 0x1823b70: stw r26,40(r1)
> 0x1823b74: mr r26,r3
> 0x1823b78: stw r27,44(r1)
> 0x1823b7c: mr r27,r4
> 0x1823b80: stw r28,48(r1)
> 0x1823b84: mr r28,r8
> 0x1823b88: stw r29,52(r1)
> 0x1823b8c: mr r29,r5
> 0x1823b90: stw r31,60(r1)
> 0x1823b94: mr r31,r7         # Again r31 is not a frame pointer
> 0x1823b98: stw r0,68(r1)
> 0x1823b9c: lwz r0,0(r25)
> 0x1823ba0: stw r0,28(r1)
> 0x1823ba4: li r0,0
> 0x1823ba8: stw r30,56(r1)
> 0x1823bac: beq- cr7,0x1823bbc
>
>
> 0x1819f30: mflr r0           # Stack pointer saved and changed first
> 0x1819f34: stwu r1,-32(r1)
> 0x1819f38: stw r28,16(r1)
> 0x1819f3c: mr r28,r5
> 0x1819f40: stw r30,24(r1)
> 0x1819f44: mr r30,r3
> 0x1819f48: stw r31,28(r1)
> 0x1819f4c: mr r31,r4         # Again r31 is not a frame pointer
> 0x1819f50: stw r29,20(r1)
> 0x1819f54: stw r0,36(r1)
> 0x1819f58: lbz r29,0(r4)
>
>
> 0x181fcac: mflr r0           # Stack pointer saved and changed first
> 0x181fcb0: stwu r1,-48(r1)
> 0x181fcb4: lis r9,396
> 0x181fcb8: stw r27,28(r1)
> 0x181fcbc: mr r27,r4
> 0x181fcc0: stw r0,52(r1)
> 0x181fcc4: stw r28,32(r1)
> 0x181fcc8: mr r28,r7
> 0x181fccc: lwz r0,-1344(r9)
> 0x181fcd0: stw r29,36(r1)
> 0x181fcd4: mr r29,r5
> 0x181fcd8: andi. r9,r0,512
> 0x181fcdc: stw r30,40(r1)
> 0x181fce0: stw r31,44(r1)
> 0x181fce4: mr r30,r8
> 0x181fce8: mr r31,r6         # Again r31 is not a frame pointer
>
>
> 0x1801d58: mflr r0           # Stack pointer saved and changed first
> 0x1801d5c: stwu r1,-48(r1)
> 0x1801d60: stw r28,32(r1)
> 0x1801d64: stw r0,52(r1)
> 0x1801d68: stw r30,40(r1)
> 0x1801d6c: mr r30,r4
> 0x1801d70: lwz r28,4(r3)
> 0x1801d74: lwz r4,0(r3)
> 0x1801d78: stw r29,36(r1)
> 0x1801d7c: add r29,r30,r28
> 0x1801d80: cmpw cr7,r29,r4
> 0x1801d84: stw r27,28(r1)
> 0x1801d88: stw r31,44(r1)
> 0x1801d8c: mr r27,r5
> 0x1801d90: mr r31,r3         # Again r31 is not a frame pointer
>
>
> And so it goes for every intermittent SEGV related example (clang 3.8.0 buildworld based) that I've examined: the matching gcc 4.2.1 code would not try to use the r31 values that clang does use. Instead gcc 4.2.1 assigns an independent value to r31 before using it.
>
>
> In effect gcc 4.2.1 and clang 3.8.0 are not following the exact-same call standard. If clang 3.8.0's code generation is left as is, then a conversion to clang's call standard requirements will be needed if clang 3.8.0 is to be used for powerpc (32-bit).
>
> "Works when gcc 4.2.1 is used" is not a great guide to "appropriate for use with clang 3.8.0", at least in this area for powerpc (32-bit).
>
> (These notes presume a context with sys/powerpc/powerpc/sigcode32.S -r295186 in place so that signal delivery maintains the modulo 16 byte stack/frame alignment status instead of changing the alignment. It appears that, while necessary, this is not sufficient for a clang 3.8.0 based buildworld to operate with signals reliably. See https://lists.freebsd.org/pipermail/freebsd-ppc/2016-February/008002.html .)
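
Keeping the 16-byte alignment during signal delivery still leaves the word that clang stores at "-4(r1)" exposed. As a rough sketch of the idea behind the bug 206990 change (not the patch itself), the Python model below places a signal frame under the interrupted stack pointer with and without a scratch area. The function names, the stack address, the frame size, and the 224-byte scratch size are all assumptions picked for illustration; none of them come from the FreeBSD sources.

```python
STACK_ALIGN = 16      # modulo-16 stack alignment discussed in the thread
RED_ZONE_SIZE = 224   # assumed scratch-area size, for illustration only


def sigframe_base(sp, frame_size, red_zone=0):
    """Return the lowest address of a signal frame placed below sp.

    red_zone bytes directly under sp are left untouched, and the result
    is rounded down to a multiple of STACK_ALIGN.
    """
    base = sp - red_zone - frame_size
    return base & ~(STACK_ALIGN - 1)


def clobbers(addr, base, size):
    """True when addr falls inside the half-open range [base, base + size)."""
    return base <= addr < base + size


sp = 0xFFFFD000        # hypothetical r1 when the signal arrives
frame_size = 0x400     # hypothetical signal frame size
saved_r31 = sp - 4     # slot written by clang's "stw r31,-4(r1)"

old_base = sigframe_base(sp, frame_size)                 # no scratch area
new_base = sigframe_base(sp, frame_size, RED_ZONE_SIZE)  # scratch area kept

print(hex(old_base), clobbers(saved_r31, old_base, frame_size))
print(hex(new_base), clobbers(saved_r31, new_base, frame_size))
```

In this model the frame placed directly under r1 covers the saved r31 slot, while the frame placed under the scratch area does not, at the cost of the extra bytes that the thread describes as "wasted" for gcc 4.2.1 style code.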
>
> ===
> Mark Millard
> markmi at dsl-only.net

From owner-freebsd-toolchain@freebsd.org Sun Feb 7 21:32:39 2016
Subject: Re: powerpc (32-bit) clang 3.8.0 vs. gcc 4.2.1 routine preamble mismatches: contributions to SEGV's differences
From: Mark Millard
Date: Sun, 7 Feb 2016 13:32:36 -0800
Cc: Roman Divacky, Justin Hibbits, Konstantin Belousov, Nathan Whitehorn
Message-Id: <29B25F29-17DC-4F85-8C5A-3F286B260E29@dsl-only.net>
References: <3D08EB58-7FEF-432E-8192-77F988A75621@dsl-only.net>
To: FreeBSD PowerPC ML, FreeBSD Toolchain

On 2016-Feb-6, at 9:58 PM, Mark Millard wrote:
>
> I've submitted bug 206990 ( https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=206990 ) with a proof-of-concept patch that keeps signal delivery from causing SEGVs in clang 3.8.0 generated code. The patch has passed my personal testing so far: "make -j 6 buildworld" finished normally instead of getting a SEGV within a few minutes on a dual-processor G5 with two cores per processor.
>
> Now a "make -j 3 buildworld" on a dual-processor G4 is in progress, booted from the same SSD. We will see.
>
> The official TARGET_ARCH=powerpc sendsig code could tromp on the frame pointer value that clang 3.8.0 generated code stores at "-4(r1)", during the period in which that saved value is outside the range bounded by the stack pointer (r1) and where the stack started. The change respects a Darwin-like/AIX-like "red-zone"/scratch area on the smaller-address side of the stack. This should still be compatible with gcc 4.2.1 style code, although it "wastes" more bytes temporarily in that context.
>
>
> ===
> Mark Millard
> markmi at dsl-only.net

Nathan Whitehorn had me submit to llvm.org.
So there now is:

Bug 26519 ( https://llvm.org/bugs/show_bug.cgi?id=26519 ) - Clang 3.8.0's "Target: powerpc-unknown-freebsd11.0" code generation is violating the ABI involved

I left the "importance" at "normal" but if it still makes sense it may be that "release blocker" would be appropriate. (Can 32-bit powerpc block clang releases? Is it "too late now, already released"?)

===
Mark Millard
markmi at dsl-only.net
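
To make the ordering difference between the two Str_Match preambles concrete, here is a small Python model of the stores they perform. It is a simplification under stated assumptions: only stores relative to r1 are tracked, "mflr" and "mr" are left out, the starting address 0x1000 is hypothetical, and the helper name is invented for this sketch. After each step it reports which stored words sit below the current r1, which is where a signal frame built under r1 could land.

```python
def exposed_after_each_step(prologue, sp=0x1000):
    """Return, for each step, the stored addresses that lie below r1.

    Each step is (op, offset):
      "stw"  stores a word at offset(r1)
      "stwu" stores r1 at offset(r1), then sets r1 to that address
    """
    r1 = sp
    stored = set()
    exposure = []
    for op, offset in prologue:
        stored.add(r1 + offset)
        if op == "stwu":
            r1 += offset
        exposure.append(sorted(a for a in stored if a < r1))
    return exposure


# clang 3.8.0 order: save r31 and lr first, move r1 afterwards.
CLANG = [("stw", -4), ("stw", 4), ("stwu", -32)]

# gcc 4.2.1 order: move r1 first, then save r31, r30, and lr.
GCC = [("stwu", -32), ("stw", 28), ("stw", 24), ("stw", 36)]

for name, prologue in (("clang", CLANG), ("gcc", GCC)):
    steps = exposed_after_each_step(prologue)
    print(name, [[hex(a) for a in step] for step in steps])
```

In this model the clang ordering leaves the saved r31 word below r1 for two steps, until "stwu" moves r1 past it, while the gcc ordering never has a stored word below r1. That window matches the one the thread identifies as vulnerable to signal delivery.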