Date:      Sat, 6 Feb 2016 21:58:24 -0800
From:      Mark Millard <markmi@dsl-only.net>
To:        FreeBSD PowerPC ML <freebsd-ppc@freebsd.org>, FreeBSD Toolchain <freebsd-toolchain@freebsd.org>
Cc:        Roman Divacky <rdivacky@vlakno.cz>, Justin Hibbits <chmeeedalf@gmail.com>, Konstantin Belousov <kib@freebsd.org>, Nathan Whitehorn <nwhitehorn@freebsd.org>
Subject:   Re: powerpc (32-bit) clang 3.8.0 vs. gcc 4.2.1 routine preamble mismatches: contributions to SEGV's differences
Message-ID:  <D985966A-2997-4A5D-955E-222D5F29645B@dsl-only.net>
In-Reply-To: <3D08EB58-7FEF-432E-8192-77F988A75621@dsl-only.net>
References:  <3D08EB58-7FEF-432E-8192-77F988A75621@dsl-only.net>

I've submitted bug 206990 (
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=206990 ) with a
proof-of-concept patch that avoids the SEGVs caused by signal delivery
interacting with clang 3.8.0 generated code. The patch has passed my
personal testing so far: "make -j 6 buildworld" finished normally
instead of getting a SEGV within a few minutes on a
dual-processor/each-being-dual-core G5.

Now a "make -j 3 buildworld" on a dual-processor G4 is in progress,
booted from the same SSD. We will see.

The official TARGET_ARCH=powerpc sendsig code could overwrite the frame
pointer value saved at "-4(r1)" (as seen in the clang 3.8.0 generated
code) during the period in which that saved value sits outside the range
bounded by the stack pointer (r1) and where the stack started. The
change respects a Darwin-like/AIX-like "red-zone"/scratch area on the
smaller-address side of the stack pointer. This should still be
compatible with gcc 4.2.1 style code, although it "wastes" more bytes
temporarily in that context.
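
To make the red-zone idea concrete, here is a minimal C sketch of the placement arithmetic. It is not the patch from bug 206990: the names sketch_sigframe_base and sketch_overlaps_word, the 224-byte red-zone figure, and the sample addresses below are all assumptions chosen for illustration, and the real sendsig code may differ.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed red-zone size, for illustration only (not taken from the patch). */
#define SKETCH_RED_ZONE 224u

/*
 * Lowest address of a signal frame of frame_size bytes placed below the
 * interrupted stack pointer, leaving red_zone bytes just below r1 untouched.
 * red_zone == 0 models a placement that starts right at r1.
 */
static uint32_t
sketch_sigframe_base(uint32_t sp, uint32_t frame_size, uint32_t red_zone)
{
    return sp - red_zone - frame_size;
}

/* Does [base, base + size) cover any byte of the 4-byte word at addr? */
static bool
sketch_overlaps_word(uint32_t base, uint32_t size, uint32_t addr)
{
    return addr < base + size && addr + 4u > base;
}
```

With a hypothetical r1 of 0xffffd000 and a 0x100-byte frame, a red zone of 0 places the frame over the word at -4(r1), while SKETCH_RED_ZONE keeps the frame clear of it.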


===
Mark Millard
markmi at dsl-only.net

On 2016-Feb-5, at 1:59 AM, Mark Millard <markmi at dsl-only.net> wrote:
>
> Clang 3.8.0 produced code uses r31 as a frame pointer in contexts
> where gcc 4.2.1 produced code does not (ever?). This leaves clang's
> produced code more dependent on r31 handling, such as when
> resuming after signal delivery.
>
> The following is one of the routines in "make" where a clang 3.8.0
> based "make" sometimes gets a SEGV after resuming from a SIGCHLD
> delivery. The SEGV comes from having r31=0x0 in a frame pointer (r31)
> based address calculation whose result is later dereferenced. (See
> https://lists.freebsd.org/pipermail/freebsd-ppc/2016-February/008002.html
> .)
>
> But gcc 4.2.1 does not use r31 as a frame pointer in the Str_Match
> that it produces and so does not see the problem. gcc 4.2.1's produced
> code simply uses the stack pointer as needed.
>
>
> clang 3.8.0 based Str_Match preamble (from make):
>
> 0x181a4a8 <Str_Match>:	mflr    r0
> 0x181a4ac <Str_Match+4>:	stw     r31,-4(r1) # Clang's frame pointer (r31)
>                                                   # saved before stack pointer changed.
> 0x181a4b0 <Str_Match+8>:	stw     r0,4(r1)   # lr saved before stack pointer changed.
> 0x181a4b4 <Str_Match+12>:	stwu    r1,-32(r1) # Stack pointer finally saved and
>                                                   # changed.
> 0x181a4b8 <Str_Match+16>:	mr      r31,r1     # r31 is the frame pointer under clang.
> 0x181a4bc <Str_Match+20>:	stw     r30,24(r31)
>
> gcc 4.2.1 based Str_Match preamble:
>
> 0x1819cb8 <Str_Match>:	mflr    r0
> 0x1819cbc <Str_Match+4>:	stwu    r1,-32(r1) # Stack pointer saved and changed first.
> 0x1819cc0 <Str_Match+8>:	stw     r31,28(r1) # r31 saved after stack pointer changed.
> 0x1819cc4 <Str_Match+12>:	mr      r31,r3     # gcc 4.2.1 does not reserve
>                                                   # r31 for use as a frame pointer.
> 0x1819cc8 <Str_Match+16>:	stw     r30,24(r1)
> 0x1819ccc <Str_Match+20>:	stw     r0,36(r1)  # lr saved after stack pointer changed.
>
>
> (Str_Match is a self contained routine, although it is recursive.)
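
The offsets in the two Str_Match preambles above can be worked through in a few lines of C. This is only a sketch of the arithmetic: the struct and function names are hypothetical, and entry_sp stands for whatever r1 holds on entry to the routine. It suggests that both compilers save r31 in the same word, 4 bytes below the entry r1, and that the difference is whether r1 has already been lowered past that word when the store happens.

```c
#include <stdbool.h>
#include <stdint.h>

/* Address where r31 is saved, and the value of r1 when that store executes. */
struct r31_save {
    uint32_t slot;
    uint32_t sp_at_store;
};

/* clang 3.8.0: stw r31,-4(r1) executes before stwu r1,-32(r1). */
static struct r31_save
clang_r31_save(uint32_t entry_sp)
{
    struct r31_save s = { entry_sp - 4u, entry_sp };
    return s;
}

/* gcc 4.2.1: stwu r1,-32(r1) executes first, then stw r31,28(r1). */
static struct r31_save
gcc_r31_save(uint32_t entry_sp)
{
    uint32_t sp = entry_sp - 32u;
    struct r31_save s = { sp + 28u, sp };
    return s;
}

/* True when the saved word lies below r1 at the time of the store. */
static bool
slot_below_sp(struct r31_save s)
{
    return s.slot < s.sp_at_store;
}
```

For any entry_sp, clang_r31_save() and gcc_r31_save() report the same slot address, but slot_below_sp() is true only for the clang ordering. That corresponds to the window, up to the stwu, in which the saved r31 is not protected by r1.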
>
>
> Looking at some other gcc 4.2.1 preamble examples...
>
> 0x1823b58 <VarSYSVMatch>:	cmpwi   cr7,r6,0
> 0x1823b5c <VarSYSVMatch+4>:	stwu    r1,-64(r1) # Stack pointer saved and changed "first"
> 0x1823b60 <VarSYSVMatch+8>:	mflr    r0
> 0x1823b64 <VarSYSVMatch+12>:	lis     r9,396
> 0x1823b68 <VarSYSVMatch+16>:	stw     r25,36(r1)
> 0x1823b6c <VarSYSVMatch+20>:	addi    r25,r9,8944
> 0x1823b70 <VarSYSVMatch+24>:	stw     r26,40(r1)
> 0x1823b74 <VarSYSVMatch+28>:	mr      r26,r3
> 0x1823b78 <VarSYSVMatch+32>:	stw     r27,44(r1)
> 0x1823b7c <VarSYSVMatch+36>:	mr      r27,r4
> 0x1823b80 <VarSYSVMatch+40>:	stw     r28,48(r1)
> 0x1823b84 <VarSYSVMatch+44>:	mr      r28,r8
> 0x1823b88 <VarSYSVMatch+48>:	stw     r29,52(r1)
> 0x1823b8c <VarSYSVMatch+52>:	mr      r29,r5
> 0x1823b90 <VarSYSVMatch+56>:	stw     r31,60(r1)
> 0x1823b94 <VarSYSVMatch+60>:	mr      r31,r7     # Again r31 is not a frame pointer
> 0x1823b98 <VarSYSVMatch+64>:	stw     r0,68(r1)
> 0x1823b9c <VarSYSVMatch+68>:	lwz     r0,0(r25)
> 0x1823ba0 <VarSYSVMatch+72>:	stw     r0,28(r1)
> 0x1823ba4 <VarSYSVMatch+76>:	li      r0,0
> 0x1823ba8 <VarSYSVMatch+80>:	stw     r30,56(r1)
> 0x1823bac <VarSYSVMatch+84>:	beq-    cr7,0x1823bbc <VarSYSVMatch+100>
>
>
> 0x1819f30 <Str_SYSVMatch>:	mflr    r0
> 0x1819f34 <Str_SYSVMatch+4>:	stwu    r1,-32(r1) # Stack pointer saved and changed first
> 0x1819f38 <Str_SYSVMatch+8>:	stw     r28,16(r1)
> 0x1819f3c <Str_SYSVMatch+12>:	mr      r28,r5
> 0x1819f40 <Str_SYSVMatch+16>:	stw     r30,24(r1)
> 0x1819f44 <Str_SYSVMatch+20>:	mr      r30,r3
> 0x1819f48 <Str_SYSVMatch+24>:	stw     r31,28(r1)
> 0x1819f4c <Str_SYSVMatch+28>:	mr      r31,r4     # Again r31 is not a frame pointer
> 0x1819f50 <Str_SYSVMatch+32>:	stw     r29,20(r1)
> 0x1819f54 <Str_SYSVMatch+36>:	stw     r0,36(r1)
> 0x1819f58 <Str_SYSVMatch+40>:	lbz     r29,0(r4)
>
>
> 0x181fcac <VarMatch>:	mflr    r0
> 0x181fcb0 <VarMatch+4>:	stwu    r1,-48(r1) # Stack pointer saved and changed first
> 0x181fcb4 <VarMatch+8>:	lis     r9,396
> 0x181fcb8 <VarMatch+12>:	stw     r27,28(r1)
> 0x181fcbc <VarMatch+16>:	mr      r27,r4
> 0x181fcc0 <VarMatch+20>:	stw     r0,52(r1)
> 0x181fcc4 <VarMatch+24>:	stw     r28,32(r1)
> 0x181fcc8 <VarMatch+28>:	mr      r28,r7
> 0x181fccc <VarMatch+32>:	lwz     r0,-1344(r9)
> 0x181fcd0 <VarMatch+36>:	stw     r29,36(r1)
> 0x181fcd4 <VarMatch+40>:	mr      r29,r5
> 0x181fcd8 <VarMatch+44>:	andi.   r9,r0,512
> 0x181fcdc <VarMatch+48>:	stw     r30,40(r1)
> 0x181fce0 <VarMatch+52>:	stw     r31,44(r1)
> 0x181fce4 <VarMatch+56>:	mr      r30,r8
> 0x181fce8 <VarMatch+60>:	mr      r31,r6     # Again r31 is not a frame pointer
>
>
> 0x1801d58 <Buf_AddBytes>:	mflr    r0
> 0x1801d5c <Buf_AddBytes+4>:	stwu    r1,-48(r1) # Stack pointer saved and changed first
> 0x1801d60 <Buf_AddBytes+8>:	stw     r28,32(r1)
> 0x1801d64 <Buf_AddBytes+12>:	stw     r0,52(r1)
> 0x1801d68 <Buf_AddBytes+16>:	stw     r30,40(r1)
> 0x1801d6c <Buf_AddBytes+20>:	mr      r30,r4
> 0x1801d70 <Buf_AddBytes+24>:	lwz     r28,4(r3)
> 0x1801d74 <Buf_AddBytes+28>:	lwz     r4,0(r3)
> 0x1801d78 <Buf_AddBytes+32>:	stw     r29,36(r1)
> 0x1801d7c <Buf_AddBytes+36>:	add     r29,r30,r28
> 0x1801d80 <Buf_AddBytes+40>:	cmpw    cr7,r29,r4
> 0x1801d84 <Buf_AddBytes+44>:	stw     r27,28(r1)
> 0x1801d88 <Buf_AddBytes+48>:	stw     r31,44(r1)
> 0x1801d8c <Buf_AddBytes+52>:	mr      r27,r5
> 0x1801d90 <Buf_AddBytes+56>:	mr      r31,r3     # Again r31 is not a frame pointer
>
>
> And so it goes for every intermittent SEGV related example (clang
> 3.8.0 buildworld based) that I've examined: the matching gcc 4.2.1 code
> would not try to use the r31 values that clang does use. Instead gcc
> 4.2.1 assigns an independent value to r31 before using it.
>
>
> In effect gcc 4.2.1 and clang 3.8.0 are not following exactly the same
> call standard. If clang 3.8.0's code generation is left as is, then
> code such as signal delivery will need to be converted to meet its call
> standard requirements if clang 3.8.0 is to be used for powerpc (32-bit).
>
> "Works when gcc 4.2.1 is used" is not a great guide to "appropriate
> for use with clang 3.8.0", at least in this area for powerpc (32-bit).
>
> (These notes presume a context with sys/powerpc/powerpc/sigcode32.S
> -r295186 in place so that signal delivery maintains the modulo 16 byte
> stack/frame alignment status instead of changing the alignment. It
> appears that, while necessary, this is not sufficient for a clang 3.8.0
> based buildworld to operate with signals reliably. See
> https://lists.freebsd.org/pipermail/freebsd-ppc/2016-February/008002.html
> .)
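
The "maintains the modulo 16 byte alignment status" property can be illustrated with a small C check. This is only a sketch: sketch_keeps_mod16 is a hypothetical helper, and the adjustments of 20 and 32 bytes used below are sample values, not a statement of what sigcode32.S does at any revision.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when moving r1 down by adjust bytes leaves r1 modulo 16 unchanged. */
static bool
sketch_keeps_mod16(uint32_t sp, uint32_t adjust)
{
    return ((sp - adjust) & 15u) == (sp & 15u);
}
```

Any adjustment that is a multiple of 16 (such as 32) passes this check for every r1 value, while an adjustment such as 20 shifts the alignment by 4 bytes.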
>
> ===
Mark Millard
markmi at dsl-only.net




