Date:      Fri, 5 Feb 2016 01:59:32 -0800
From:      Mark Millard <markmi@dsl-only.net>
To:        FreeBSD PowerPC ML <freebsd-ppc@freebsd.org>, FreeBSD Toolchain <freebsd-toolchain@freebsd.org>
Cc:        Roman Divacky <rdivacky@vlakno.cz>, Justin Hibbits <chmeeedalf@gmail.com>, Konstantin Belousov <kib@freebsd.org>, Nathan Whitehorn <nwhitehorn@freebsd.org>
Subject:   powerpc (32-bit) clang 3.8.0 vs. gcc 4.2.1 routine preamble mismatches: contributions to SEGV's differences
Message-ID:  <3D08EB58-7FEF-432E-8192-77F988A75621@dsl-only.net>

Code produced by clang 3.8.0 uses r31 as a frame pointer in contexts
where code produced by gcc 4.2.1 does not (ever?). This leaves clang's
code more dependent on how r31 is handled, such as when resuming after
signal delivery.

The following is one of the routines in "make" where a clang 3.8.0 based
"make" sometimes gets a SEGV when resuming after a SIGCHLD delivery. The
SEGV comes from having r31=0x0 in a frame pointer (r31) based address
calculation whose result is at some point dereferenced. (See
https://lists.freebsd.org/pipermail/freebsd-ppc/2016-February/008002.html
.)

But gcc 4.2.1 does not use r31 as a frame pointer in the Str_Match that
it produces and so does not see the problem. gcc 4.2.1's produced code
simply uses the stack pointer as needed.
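
As a small illustration of why r31=0x0 is fatal for frame pointer based
addressing, the sketch below computes the effective address of a store
such as "stw r30,24(r31)" for two values of r31. The stack pointer value
is hypothetical and chosen only for the example; whether a low address
such as 0x18 is actually unmapped in a given process is an assumption
here, not something the disassembly shows.

```python
MASK32 = 0xFFFFFFFF


def effective_address(base_value, displacement):
    """Address of a 32-bit 'stw rS,d(rA)' style access: base + displacement."""
    return (base_value + displacement) & MASK32


# Hypothetical stack pointer value, for illustration only.
sp = 0xFFFFD5E0

good = effective_address(sp, 24)   # r31 holds a copy of r1, as after 'mr r31,r1'
bad = effective_address(0x0, 24)   # r31 was left holding 0x0

print(hex(good))  # 0xffffd5f8
print(hex(bad))   # 0x18
```

Code that addresses its frame through r1 directly, as the gcc 4.2.1
output does, never forms the second kind of address from r31.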


clang 3.8.0 based Str_Match preamble (from make):

0x181a4a8 <Str_Match>:	mflr    r0
0x181a4ac <Str_Match+4>:	stw     r31,-4(r1) # Clang's frame pointer (r31)
                                                   # saved before stack pointer changed.
0x181a4b0 <Str_Match+8>:	stw     r0,4(r1)   # lr saved before stack pointer changed.
0x181a4b4 <Str_Match+12>:	stwu    r1,-32(r1) # Stack pointer finally saved and
                                                   # changed.
0x181a4b8 <Str_Match+16>:	mr      r31,r1     # r31 is the frame pointer under clang.
0x181a4bc <Str_Match+20>:	stw     r30,24(r31)

gcc 4.2.1 based Str_Match preamble:

0x1819cb8 <Str_Match>:	mflr    r0
0x1819cbc <Str_Match+4>:	stwu    r1,-32(r1) # Stack pointer saved and changed first.
0x1819cc0 <Str_Match+8>:	stw     r31,28(r1) # r31 saved after stack pointer changed.
0x1819cc4 <Str_Match+12>:	mr      r31,r3     # gcc 4.2.1 does not reserve
                                                   # r31 for use as a frame pointer.
0x1819cc8 <Str_Match+16>:	stw     r30,24(r1)
0x1819ccc <Str_Match+20>:	stw     r0,36(r1)  # lr saved after stack pointer changed.
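
To make the ordering difference between the two preambles concrete, here
is a toy model that replays only the stw, stwu and mr instructions from
the two Str_Match listings and reports where each save lands relative to
r1 at the moment of the store. The entry value of r1 is hypothetical, and
the model says nothing about whether the save below r1 is what actually
leads to the SEGVs; it only restates the listings in executable form.

```python
# Toy model of the two Str_Match preambles quoted above.
# Only stw, stwu and mr are modeled; mflr does not touch memory or r1.

CLANG = [
    ("stw", "r31", -4, "r1"),
    ("stw", "r0", 4, "r1"),
    ("stwu", "r1", -32, "r1"),
    ("mr", "r31", "r1"),
    ("stw", "r30", 24, "r31"),
]

GCC = [
    ("stwu", "r1", -32, "r1"),
    ("stw", "r31", 28, "r1"),
    ("mr", "r31", "r3"),
    ("stw", "r30", 24, "r1"),
    ("stw", "r0", 36, "r1"),
]

ENTRY_SP = 0x1000  # hypothetical r1 on entry, for illustration only


def trace_stores(steps, entry_sp=ENTRY_SP):
    """Return (register, address, offset from r1 at store time) for each stw."""
    regs = {"r1": entry_sp, "r3": 0}  # r3 is an argument; its value is irrelevant
    stores = []
    for step in steps:
        if step[0] == "mr":
            _, dst, src = step
            regs[dst] = regs[src]
        elif step[0] == "stwu":
            # Store with update: the base register becomes the effective address.
            _, _, disp, base = step
            regs[base] += disp
        else:  # plain stw
            _, src, disp, base = step
            addr = regs[base] + disp
            stores.append((src, addr, addr - regs["r1"]))
    return stores


for name, steps in (("clang 3.8.0", CLANG), ("gcc 4.2.1", GCC)):
    for reg, addr, rel in trace_stores(steps):
        where = "below r1" if rel < 0 else "at or above r1"
        print(f"{name}: {reg} saved at {addr:#x} ({rel:+d} from r1, {where})")
```

In this model both preambles end up with r31, r30 and lr in the same
slots. The difference is that clang's r31 save is made while that slot is
still 4 bytes below r1, whereas every gcc 4.2.1 save is made at or above
r1.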


(Str_Match is a self contained routine, although it is recursive.)


Looking at some other gcc 4.2.1 preamble examples. . .

0x1823b58 <VarSYSVMatch>:	cmpwi   cr7,r6,0
0x1823b5c <VarSYSVMatch+4>:	stwu    r1,-64(r1) # Stack pointer saved and changed "first"
0x1823b60 <VarSYSVMatch+8>:	mflr    r0
0x1823b64 <VarSYSVMatch+12>:	lis     r9,396
0x1823b68 <VarSYSVMatch+16>:	stw     r25,36(r1)
0x1823b6c <VarSYSVMatch+20>:	addi    r25,r9,8944
0x1823b70 <VarSYSVMatch+24>:	stw     r26,40(r1)
0x1823b74 <VarSYSVMatch+28>:	mr      r26,r3
0x1823b78 <VarSYSVMatch+32>:	stw     r27,44(r1)
0x1823b7c <VarSYSVMatch+36>:	mr      r27,r4
0x1823b80 <VarSYSVMatch+40>:	stw     r28,48(r1)
0x1823b84 <VarSYSVMatch+44>:	mr      r28,r8
0x1823b88 <VarSYSVMatch+48>:	stw     r29,52(r1)
0x1823b8c <VarSYSVMatch+52>:	mr      r29,r5
0x1823b90 <VarSYSVMatch+56>:	stw     r31,60(r1)
0x1823b94 <VarSYSVMatch+60>:	mr      r31,r7     # Again r31 is not a frame pointer
0x1823b98 <VarSYSVMatch+64>:	stw     r0,68(r1)
0x1823b9c <VarSYSVMatch+68>:	lwz     r0,0(r25)
0x1823ba0 <VarSYSVMatch+72>:	stw     r0,28(r1)
0x1823ba4 <VarSYSVMatch+76>:	li      r0,0
0x1823ba8 <VarSYSVMatch+80>:	stw     r30,56(r1)
0x1823bac <VarSYSVMatch+84>:	beq-    cr7,0x1823bbc <VarSYSVMatch+100>


0x1819f30 <Str_SYSVMatch>:	mflr    r0         # Stack pointer saved and changed first
0x1819f34 <Str_SYSVMatch+4>:	stwu    r1,-32(r1)
0x1819f38 <Str_SYSVMatch+8>:	stw     r28,16(r1)
0x1819f3c <Str_SYSVMatch+12>:	mr      r28,r5
0x1819f40 <Str_SYSVMatch+16>:	stw     r30,24(r1)
0x1819f44 <Str_SYSVMatch+20>:	mr      r30,r3
0x1819f48 <Str_SYSVMatch+24>:	stw     r31,28(r1)
0x1819f4c <Str_SYSVMatch+28>:	mr      r31,r4     # Again r31 is not a frame pointer
0x1819f50 <Str_SYSVMatch+32>:	stw     r29,20(r1)
0x1819f54 <Str_SYSVMatch+36>:	stw     r0,36(r1)
0x1819f58 <Str_SYSVMatch+40>:	lbz     r29,0(r4)


0x181fcac <VarMatch>:	mflr    r0                 # Stack pointer saved and changed first
0x181fcb0 <VarMatch+4>:	stwu    r1,-48(r1)
0x181fcb4 <VarMatch+8>:	lis     r9,396
0x181fcb8 <VarMatch+12>:	stw     r27,28(r1)
0x181fcbc <VarMatch+16>:	mr      r27,r4
0x181fcc0 <VarMatch+20>:	stw     r0,52(r1)
0x181fcc4 <VarMatch+24>:	stw     r28,32(r1)
0x181fcc8 <VarMatch+28>:	mr      r28,r7
0x181fccc <VarMatch+32>:	lwz     r0,-1344(r9)
0x181fcd0 <VarMatch+36>:	stw     r29,36(r1)
0x181fcd4 <VarMatch+40>:	mr      r29,r5
0x181fcd8 <VarMatch+44>:	andi.   r9,r0,512
0x181fcdc <VarMatch+48>:	stw     r30,40(r1)
0x181fce0 <VarMatch+52>:	stw     r31,44(r1)
0x181fce4 <VarMatch+56>:	mr      r30,r8
0x181fce8 <VarMatch+60>:	mr      r31,r6     # Again r31 is not a frame pointer


0x1801d58 <Buf_AddBytes>:	mflr    r0         # Stack pointer saved and changed first
0x1801d5c <Buf_AddBytes+4>:	stwu    r1,-48(r1)
0x1801d60 <Buf_AddBytes+8>:	stw     r28,32(r1)
0x1801d64 <Buf_AddBytes+12>:	stw     r0,52(r1)
0x1801d68 <Buf_AddBytes+16>:	stw     r30,40(r1)
0x1801d6c <Buf_AddBytes+20>:	mr      r30,r4
0x1801d70 <Buf_AddBytes+24>:	lwz     r28,4(r3)
0x1801d74 <Buf_AddBytes+28>:	lwz     r4,0(r3)
0x1801d78 <Buf_AddBytes+32>:	stw     r29,36(r1)
0x1801d7c <Buf_AddBytes+36>:	add     r29,r30,r28
0x1801d80 <Buf_AddBytes+40>:	cmpw    cr7,r29,r4
0x1801d84 <Buf_AddBytes+44>:	stw     r27,28(r1)
0x1801d88 <Buf_AddBytes+48>:	stw     r31,44(r1)
0x1801d8c <Buf_AddBytes+52>:	mr      r27,r5
0x1801d90 <Buf_AddBytes+56>:	mr      r31,r3     # Again r31 is not a frame pointer


And so it goes for every intermittent SEGV related example (clang 3.8.0
buildworld based) that I've examined: the matching gcc 4.2.1 code would
not try to use the r31 values that clang does use. Instead gcc 4.2.1
assigns an independent value to r31 before using it.
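
For checking more routines than can comfortably be read by hand, a rough
classifier along the following lines could be run over gdb-style
disassembly text like the listings above. It is only a sketch: the
function name, the regular expressions and the two reported properties
are my own choices for illustration, and it looks only at the first
"mr r31,..." and at stw saves relative to r1 that precede the first
"stwu r1,...".

```python
import re

# Matches lines such as: 0x181a4b8 <Str_Match+16>:  mr  r31,r1  # comment
INSN = re.compile(r"^0x[0-9a-f]+\s+<[^>]+>:\s+(\S+)\s+(\S+)")
SAVE = re.compile(r"(r\d+),(-?\d+)\(r1\)")


def classify_preamble(listing):
    """Report how r31 is first set and which saves land below r1 before stwu."""
    saw_stwu = False
    saved_below = []
    r31_source = None
    for line in listing.splitlines():
        m = INSN.match(line.strip())
        if not m:
            continue  # comment continuation lines and blanks
        op, operands = m.groups()
        if op == "stwu" and operands.startswith("r1,"):
            saw_stwu = True
        elif op == "stw" and not saw_stwu:
            s = SAVE.fullmatch(operands)
            if s and int(s.group(2)) < 0:
                saved_below.append(s.group(1))
        elif op == "mr" and operands.startswith("r31,") and r31_source is None:
            r31_source = operands.split(",")[1]
    return {
        "r31_is_frame_pointer": r31_source == "r1",
        "saved_below_r1_before_stwu": saved_below,
    }


CLANG_STR_MATCH = """
0x181a4a8 <Str_Match>:     mflr    r0
0x181a4ac <Str_Match+4>:   stw     r31,-4(r1) # saved before stack pointer changed.
0x181a4b0 <Str_Match+8>:   stw     r0,4(r1)
0x181a4b4 <Str_Match+12>:  stwu    r1,-32(r1)
0x181a4b8 <Str_Match+16>:  mr      r31,r1
0x181a4bc <Str_Match+20>:  stw     r30,24(r31)
"""

GCC_STR_MATCH = """
0x1819cb8 <Str_Match>:     mflr    r0
0x1819cbc <Str_Match+4>:   stwu    r1,-32(r1)
0x1819cc0 <Str_Match+8>:   stw     r31,28(r1)
0x1819cc4 <Str_Match+12>:  mr      r31,r3
0x1819cc8 <Str_Match+16>:  stw     r30,24(r1)
0x1819ccc <Str_Match+20>:  stw     r0,36(r1)
"""

print("clang 3.8.0:", classify_preamble(CLANG_STR_MATCH))
print("gcc 4.2.1: ", classify_preamble(GCC_STR_MATCH))
```

A routine that never executes "mr r31,..." is reported as not using r31
as a frame pointer, which matches how the gcc 4.2.1 examples above are
being read.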


In effect gcc 4.2.1 and clang 3.8.0 are not following exactly the same
call standard. If clang 3.8.0's code generation is left as is, then a
conversion to its call standard requirements will be needed if clang
3.8.0 is to be used for powerpc (32-bit).

"Works when gcc 4.2.1 is used" is not a great guide to "appropriate for
use with clang 3.8.0", at least in this area for powerpc (32-bit).

(These notes presume a context with sys/powerpc/powerpc/sigcode32.S
-r295186 in place so that signal delivery maintains the modulo 16 byte
stack/frame alignment status instead of changing the alignment. It
appears that, while necessary, this is not sufficient for a clang 3.8.0
based buildworld to operate with signals reliably. See
https://lists.freebsd.org/pipermail/freebsd-ppc/2016-February/008002.html
.)

===
Mark Millard
markmi at dsl-only.net



