Date:      Sun, 31 Jan 2016 22:58:47 -0800
From:      Mark Millard <markmi@dsl-only.net>
To:        Justin Hibbits <chmeeedalf@gmail.com>
Cc:        Roman Divacky <rdivacky@vlakno.cz>, Nathan Whitehorn <nwhitehorn@freebsd.org>, FreeBSD Toolchain <freebsd-toolchain@freebsd.org>, FreeBSD PowerPC ML <freebsd-ppc@freebsd.org>
Subject:   Re: 3 quick questions about stack alignment for powerpc (32-bit) signal handlers
Message-ID:  <261D8A47-3B8A-4DE6-9D2C-F536C9143E84@dsl-only.net>
In-Reply-To: <1CCB483E-882A-4068-AF5B-EF43DAF0BA79@dsl-only.net>
References:  <517B7923-5166-42D0-8FA8-52C05F956F06@dsl-only.net> <20160131140807.GA83147@vlakno.cz> <0716BE3E-B7D1-4A10-B011-C1F0245296E7@dsl-only.net> <E591AEFA-8BB0-4CD2-BD29-5B7D6C8F6D91@gmail.com> <70A66DFD-557A-4D82-813C-05EED6EAB089@dsl-only.net> <FCCE1402-A7FA-4476-9179-E88999D832A3@dsl-only.net> <1CCB483E-882A-4068-AF5B-EF43DAF0BA79@dsl-only.net>

Just a correction to a sentence that I wrote. I had written:

> Frame at:            0x...90 vs. 0x...1c
> call by frame:       0x...b0 vs. 0x...1c
> Arglist at:          0x...70 vs. 0x...dc
> Locals at:           0x...70 vs. 0x...dc
> Previous frame's sp: 0x...90 vs. 0x...1c
>
> It looks like 4 additional pad bytes on the user/process stack are needed to get back to alignment.

Of course the figures on the right need to get smaller, not larger: The stack grows towards smaller addresses. So to get to 0x...0 on the right I should have said:

It looks like 12 additional pad bytes on the user/process stack are needed to get back to alignment.

That would produce:

Frame at:            0x...90 vs. 0x...10
call by frame:       0x...b0 vs. 0x...10
Arglist at:          0x...70 vs. 0x...d0
Locals at:           0x...70 vs. 0x...d0
Previous frame's sp: 0x...90 vs. 0x...10
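
The pad arithmetic above can be checked with a small C sketch. It assumes a 16-byte alignment target (the figure cited for LLVM on PowerPC in this thread) and a stack that grows toward smaller addresses. The function names are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: bytes to subtract from addr so that the result is
 * a multiple of align. The stack grows toward smaller addresses, so the
 * padding moves the address down, never up. align must be a power of 2. */
uint32_t pad_bytes_down(uint32_t addr, uint32_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return addr & (align - 1);
}

/* Hypothetical helper: the aligned address after applying that padding. */
uint32_t align_down(uint32_t addr, uint32_t align)
{
    return addr - pad_bytes_down(addr, align);
}
```

With the handler's reported values, pad_bytes_down(0xffffd71c, 16) is 12 and align_down(0xffffd71c, 16) is 0xffffd710; the same call on 0xffffd6dc gives 0xffffd6d0.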

===
Mark Millard
markmi at dsl-only.net

On 2016-Jan-31, at 10:47 PM, Mark Millard <markmi at dsl-only.net> wrote:

More evidence: By adding "break raise" and then using "info frame" to show the alignment at that point I can show that the later signal delivery changes the alignment on the user process stack compared to when raise was called. (Later I show the same for thr_kill.)

> Breakpoint 2, __raise (s=29) at /usr/src/lib/libc/gen/raise.c:50
> warning: Source file is more recent than executable.
> 50		if (__sys_thr_self(&id) == -1)
> (gdb) info frame
> Stack level 0, frame at 0xffffdc90:
> pc = 0x41904630 in __raise (/usr/src/lib/libc/gen/raise.c:50); saved pc = 0x1800774
> called by frame at 0xffffdcb0
> source language c.
> Arglist at 0xffffdc70, args: s=29
> Locals at 0xffffdc70, Previous frame's sp is 0xffffdc90
> Saved registers:
>  r29 at 0xffffdc84, r30 at 0xffffdc88, r31 at 0xffffdc8c, pc at 0xffffdc94, lr at 0xffffdc94
> (gdb) cont
> Continuing.
>
> Program received signal SIGINFO, Information request.
>
> Breakpoint 1, 0x018006d0 in handler ()
> (gdb) info frame
> Stack level 0, frame at 0xffffd71c:
> pc = 0x18006d0 in handler; saved pc = 0xffffe008
> called by frame at 0xffffd71c
> Arglist at 0xffffd6dc, args:
> Locals at 0xffffd6dc, Previous frame's sp is 0xffffd71c
> Saved registers:
>  r31 at 0xffffd718, pc at 0xffffd720, lr at 0xffffd720

Note the difference (raise before delivery vs. handler via delivery):

Frame at:            0x...90 vs. 0x...1c
call by frame:       0x...b0 vs. 0x...1c
Arglist at:          0x...70 vs. 0x...dc
Locals at:           0x...70 vs. 0x...dc
Previous frame's sp: 0x...90 vs. 0x...1c

It looks like 4 additional pad bytes on the user/process stack are needed to get back to alignment.

[The span of addresses seems to be about: 0xffffdc90-0xffffd6dc==0x5B4==1460 (raise's "frame at" minus handler's "Locals at").]


If I look at the frame for "break thr_kill" it also still shows an aligned user/process stack before the delivery:

> Breakpoint 3, 0x419046a0 in thr_kill () from /lib/libc.so.7
> (gdb) info frame
> Stack level 0, frame at 0xffffdc70:
> pc = 0x419046a0 in thr_kill; saved pc = 0x41904650
> called by frame at 0xffffdc90
> Arglist at 0xffffdc70, args:
> Locals at 0xffffdc70, Previous frame's sp is 0xffffdc70

(The relevant addresses are the same as raise showed.)


Reminder of the source program structure that uses the potentially frame/stack alignment sensitive libc/stdio library code:

> # more sig_snprintf_use_test.c
> #include <signal.h> // for signal, SIGINFO, SIG_ERR, raise.
> #include <stdio.h>  // for snprintf
>
> void handler(int sig)
> {
>    char buf[32];
>    snprintf(buf, sizeof buf, "%d", sig); // FreeBSD's world does such
>                                          // things in some of its handlers.
> }
>
> int main(void)
> {
>    handler(0); // handler gets aligned stack frame for this; snprintf works here.
>    if (signal(SIGINFO, handler) != SIG_ERR) raise(SIGINFO);
>                                // raise gets aligned stack frame;
>                                // handler gets misaligned stack frame;
>                                // snprintf/__vfprintf/io_flush/__sfvwrite/memcpy:
>                                // when built by clang 3.8.0 are sensitive to
>                                // the misalignment.
>    return 0;
> }
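
A variant of this test can report alignment from inside the handler without gdb. The sketch below is an illustration under assumptions, not part of the original test: the names are hypothetical, SIGUSR1 stands in where SIGINFO is not defined, and it relies on the compiler placing a 16-byte-aligned local by assuming the incoming stack pointer is aligned (typical, but not guaranteed). It records a local's address, which is related to but not the same as the frame address that "info frame" prints.

```c
#include <assert.h>
#include <signal.h>
#include <stdint.h>

/* Low 4 bits of a handler local's address; -1 until the handler runs. */
static volatile sig_atomic_t handler_low_bits = -1;

/* Hypothetical probe: buf is requested on a 16-byte boundary. If the
 * compiler trusts the incoming stack alignment and that alignment is
 * wrong, the address of buf ends up off the boundary as well. */
void probe_handler(int sig)
{
    _Alignas(16) char buf[32];
    (void)sig;
    handler_low_bits = (sig_atomic_t)((uintptr_t)(void *)buf & 0xF);
}

/* Installs the probe, raises the signal, and returns the recorded bits
 * (0 means the local was 16-byte aligned; -1 means the probe failed). */
int probe_signal_alignment(void)
{
#ifdef SIGINFO
    int signo = SIGINFO;   /* as in the test program above */
#else
    int signo = SIGUSR1;   /* assumption: stand-in where SIGINFO is absent */
#endif
    handler_low_bits = -1;
    if (signal(signo, probe_handler) == SIG_ERR)
        return -1;
    if (raise(signo) != 0)
        return -1;
    return (int)handler_low_bits;
}
```

A nonzero return value would be a hint of the kind of misalignment shown in the gdb output, to be confirmed with "info frame".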




===
Mark Millard
markmi at dsl-only.net

On 2016-Jan-31, at 9:12 PM, Mark Millard <markmi at dsl-only.net> wrote:

A summary of the later finding details for what I've done so far:

It is system library code (__vfprintf and its inline io_flush call to __sfvwrite) that may produce and use a potentially bad &iop->uio address, depending on the mix of how the calculation works and the stack/frame alignment present in signal delivery. Whether the program itself was built with gcc 4.2.1 or clang 3.8.0 makes no difference to whether it ends up with a segmentation fault.

When __vfprintf and its inline io_flush call to __sfvwrite are compiled by gcc 4.2.1 --which always uses addition for offsets, avoiding alignment assumptions-- no variant of the program gets a segmentation fault. gcc 4.2.1 does not create the dependency on the alignment that clang 3.8.0 does. Yet the misalignment is present. (See the details.)

When clang 3.8.0 compiles __vfprintf and its inline io_flush call to __sfvwrite --which uses masking for the offset in calculating &iop->uio, making alignment assumptions-- every variant of the program gets a segmentation fault. (The misalignment is still present.)
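
The difference between the two address calculations can be sketched in C. This is an illustration only: the function names are hypothetical, the 0x4 offset is the example value from the original question in this thread rather than the real offset of iop->uio, and the addresses are of the kind shown in the gdb output.

```c
#include <assert.h>
#include <stdint.h>

/* Member address computed by addition: correct for any base. */
uint32_t member_by_add(uint32_t base, uint32_t offset)
{
    return base + offset;
}

/* Member address computed by OR-ing the offset in: equals base + offset
 * only when the offset's bits are all zero in base, which holds when
 * base is aligned to a power of 2 larger than offset. */
uint32_t member_by_mask(uint32_t base, uint32_t offset)
{
    return base | offset;
}

/* Self-check of the two cases. */
void demo_add_vs_mask(void)
{
    /* 16-byte aligned base: both forms agree. */
    assert(member_by_mask(0xffffd6d0u, 0x4u) == member_by_add(0xffffd6d0u, 0x4u));
    /* Base that is only 4-byte aligned (bit 2 already set): they differ. */
    assert(member_by_mask(0xffffd6dcu, 0x4u) != member_by_add(0xffffd6dcu, 0x4u));
}
```

For the base 0xffffd6dc the OR form returns 0xffffd6dc unchanged, while addition returns 0xffffd6e0, so code using the OR form ends up with an address 4 bytes away from the intended one.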



The details for the misalignment evidence follow.

For (C) "on a pure gcc 4.2.1 buildworld/buildkernel system". . .

C0) gcc421-a.out with signal delivery to its handler: "info frame" in this (C) context:

This *has* a misaligned signal delivery stack but there is no segmentation fault.

> Program received signal SIGINFO, Information request.
>
> Breakpoint 1, 0x018006e0 in handler ()
> (gdb) bt
> #0  0x018006e0 in handler ()
> #1  <signal handler called>
> #2  0x00000000 in ?? ()
> (gdb) info frame
> Stack level 0, frame at 0xffffd73c:
> pc = 0x18006e0 in handler; saved pc = 0xffffe008
> called by frame at 0xffffd73c
> Arglist at 0xffffd6fc, args:
> Locals at 0xffffd6fc, Previous frame's sp is 0xffffd73c
> Saved registers:
> r31 at 0xffffd738, pc at 0xffffd740, lr at 0xffffd740


So misaligned (multiple of 4 but of no higher power of 2) for "frame at", "called by frame at" (which is listed as the same as "frame at"), "Arglist", "Locals", and "Previous frame's sp" (which is listed as the same as "frame at").
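
The "multiple of 4 but of no higher power of 2" check can be done mechanically. A minimal sketch, with a hypothetical function name, that returns the largest power of 2 dividing an address:

```c
#include <assert.h>
#include <stdint.h>

/* Largest power of 2 that divides addr, i.e. its effective alignment.
 * addr & (0 - addr) isolates the lowest set bit; unsigned arithmetic
 * wraps, so the subtraction is well defined. */
uint32_t effective_alignment(uint32_t addr)
{
    assert(addr != 0);  /* zero is divisible by every power of 2 */
    return addr & (0u - addr);
}
```

For the handler values above, effective_alignment(0xffffd73c) and effective_alignment(0xffffd6fc) both give 4, while the 0xffffdc90 frame seen at the raise breakpoint gives 16.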

In this case I also list __vfprintf's misalignment evidence for reference:
(break __vfprintf used.)

> (gdb) info frame
> Stack level 0, frame at 0xffffd57c:
> pc = 0x41930af8 in __vfprintf (/usr/src/lib/libc/stdio/vfprintf.c:452); saved pc = 0x41992e18
> called by frame at 0xffffd6fc
> source language c.
> Arglist at 0xffffd29c, args: fp=0xffffd5dc, locale=0x419c41e0 <__xlocale_global_locale>, fmt0=0x1800a1c "%d", ap=0xffffd6cc
> Locals at 0xffffd29c, Previous frame's sp is 0xffffd57c
> Saved registers:
> r30 at 0xffffd574, r31 at 0xffffd578, pc at 0xffffd580, lr at 0xffffd580


So misaligned (multiple of 4 but of no higher power of 2) for "frame at", "called by frame at", "Arglist", "Locals", and "Previous frame's sp" (which is listed as the same as "frame at").

Just to have one for reference, here is the "info frame" for the direct handler call --which gets a properly aligned frame/stack:

> (gdb) info frame
> Stack level 0, frame at 0xffffdcc0:
> pc = 0x18006e0 in handler; saved pc = 0x1800734
> called by frame at 0xffffdcd0
> Arglist at 0xffffdc80, args:
> Locals at 0xffffdc80, Previous frame's sp is 0xffffdcc0
> Saved registers:
> r31 at 0xffffdcbc, pc at 0xffffdcc4, lr at 0xffffdcc4

Only the signal delivery is creating non-aligned stack frames.


C1) clang380-a.out with signal delivery to its handler: "info frame" in this (C) context:

This *has* a misaligned signal delivery stack but there is no segmentation fault.

> (gdb) info frame
> Stack level 0, frame at 0xffffd70c:
> pc = 0x18006d0 in handler; saved pc = 0xffffe008
> called by frame at 0xffffd70c
> Arglist at 0xffffd6cc, args:
> Locals at 0xffffd6cc, Previous frame's sp is 0xffffd70c
> Saved registers:
> r31 at 0xffffd708, pc at 0xffffd710, lr at 0xffffd710

So misaligned (multiple of 4 but of no higher power of 2) for "frame at", "called by frame at", "Arglist", "Locals", and "Previous frame's sp" (which is listed as the same as "frame at").



For (B) "on a clang 3.8.0 buildworld and gcc 4.2.1 buildkernel mix". . .

B0) gcc421-a.out with signal delivery to its handler: "info frame" in this (B) context:

This *has* a misaligned signal delivery stack and there *is* a segmentation fault.

> Program received signal SIGINFO, Information request.
>
> Breakpoint 1, 0x018006e0 in handler ()
> (gdb) bt
> #0  0x018006e0 in handler ()
> #1  <signal handler called>
> #2  0x00000000 in ?? ()
> (gdb) info frame
> Stack level 0, frame at 0xffffd74c:
> pc = 0x18006e0 in handler; saved pc = 0xffffe008
> called by frame at 0xffffd74c
> Arglist at 0xffffd70c, args:
> Locals at 0xffffd70c, Previous frame's sp is 0xffffd74c
> Saved registers:
> r31 at 0xffffd748, pc at 0xffffd750, lr at 0xffffd750
> (gdb) cont
> Continuing.
>
> Program received signal SIGSEGV, Segmentation fault.
> 0x419a89c8 in memcpy (dst0=0xffffd714, src0=<optimized out>, length=<optimized out>) at /usr/src/lib/libc/string/bcopy.c:124
> warning: Source file is more recent than executable.
> 124				TLOOP1(*--dst = *--src);



B1) clang380-a.out with signal delivery to its handler: "info frame" in this (B) context:
  (i.e., what I originally reported on and submitted a Bug report for)

This *has* a misaligned signal delivery stack and there *is* a segmentation fault.

> Program received signal SIGINFO, Information request.
>
> Breakpoint 1, 0x018006d0 in handler ()
> (gdb) info frame
> Stack level 0, frame at 0xffffd71c:
> pc = 0x18006d0 in handler; saved pc = 0xffffe008
> called by frame at 0xffffd71c
> Arglist at 0xffffd6dc, args:
> Locals at 0xffffd6dc, Previous frame's sp is 0xffffd71c
> Saved registers:
> r31 at 0xffffd718, pc at 0xffffd720, lr at 0xffffd720
> (gdb) cont
> Continuing.
>
> Program received signal SIGSEGV, Segmentation fault.
> 0x419a89c8 in memcpy (dst0=0xffffd6f4, src0=<optimized out>, length=<optimized out>) at /usr/src/lib/libc/string/bcopy.c:124
> warning: Source file is more recent than executable.
> 124				TLOOP1(*--dst = *--src);

So misaligned (multiple of 4 but of no higher power of 2) for "frame at", "called by frame at" (which is listed as the same as "frame at"), "Arglist", "Locals", and "Previous frame's sp" (which is listed as the same as "frame at").



More context notes. . .

The "pure gcc 4.2.1 buildworld/buildkernel system" has:

# freebsd-version -ku; uname -aKU
11.0-CURRENT
11.0-CURRENT
FreeBSD FBSDG4C0 11.0-CURRENT FreeBSD 11.0-CURRENT #5 r294960M: Wed Jan 27 18:25:04 PST 2016     root@FBSDG4C0:/usr/obj/gcc421/powerpc.powerpc/usr/src/sys/GENERICvtsc-NODEBUG  powerpc 1100097 1100097


The "clang 3.8.0 buildworld and gcc 4.2.1 buildkernel mix" has:

# freebsd-version -ku; uname -aKU
11.0-CURRENT
11.0-CURRENT
FreeBSD FBSDG4C1 11.0-CURRENT FreeBSD 11.0-CURRENT #1 r294962M: Fri Jan 29 18:28:17 PST 2016     markmi@FreeBSDx64:/usr/obj/clang_gcc421/powerpc.powerpc/usr/src/sys/GENERICvtsc-NODEBUG  powerpc 1100097 1100097

(Same PowerMac, different SSD.)


[I have renamed a.out's to indicate compiler context as I've gone along.]
[I copied each a.out to the other SSD for use after compiling/linking.]
[I'm not generally showing the "direct call" properly aligned "info frame" texts.]
[handle SIGINFO nostop print pass; break handler used in gdb 7.10_5.]
[For gcc 4.2.1 I used: gcc -std=c99 -Wall sig_snprintf_use_test.c .]
[For clang 3.8.0 I used: clang -std=c11 -Wall -Wpedantic sig_snprintf_use_test.c .]

===
Mark Millard
markmi at dsl-only.net

On 2016-Jan-31, at 6:32 PM, Mark Millard <markmi at dsl-only.net> wrote:

> [I've never noticed gcc 4.2.1 generating code that was based on presuming the alignment was present. For example: it always seems to use addition to deal with address offsets, never masking. So I'd not expect to see segmentation faults for that context even when the stack is aligned modulo only 4. Separately checking the alignment is appropriate for me to do.]
>
> A) The reported context:
>
> The kernel context here is a gcc 4.2.1 based buildkernel then installkernel.
> The world context here is a clang 3.8.0 based buildworld then installworld.
> The program context here is a clang 3.8.0 based:
>
>> # clang -std=c11 -Wall -Wpedantic sig_snprintf_use_test.c
>> # /usr/local/bin/gdb a.out
>
>
> Using "break handler" in gdb (7.10_5) and using "info frame" when it stops for the "raise" shows the misalignment of the frame that the handler was given by the signal delivery.
>
> By contrast the earlier direct call of the handler gets an "info frame" result that shows the expected sort of alignment.
>
> I find no evidence of frame/stack misalignment via gdb except for the one that is created by the signal delivery.
>
>
> B) I'll look at trying one or more of gcc 4.2.1, gcc49, gcc5 for the program context, still based on a clang 3.8.0 buildworld and gcc 4.2.1 buildkernel based on projects/clang380-import (-r294962).
>
> C) I will look at trying the same program builds on a pure gcc 4.2.1 buildworld/buildkernel context. (Likely 11.0-CURRENT -r294960.)
>
>
> I'll send more results when I have them.
>
>




===
Mark Millard
markmi at dsl-only.net

On 2016-Jan-31, at 5:50 PM, Justin Hibbits <chmeeedalf at gmail.com> wrote:

Does this occur with gcc-built world and/or kernel?  You could put some printf()s in sendsig(), and there are KTR tracepoints already present.  The code assumes a fully aligned user stack, which should be correct, but may not be.
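
One way a delivery path can avoid depending on that assumption is to round the new stack pointer down after reserving room for the frame. The following is a minimal sketch of that idea only. It is not FreeBSD's sendsig() code, and the function name, the 32-bit address type, the 16-byte constant, and the frame size used in the example are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_ALIGN 16u  /* assumption: the 16-byte figure cited in this thread */

/* Hypothetical: pick the stack pointer for a signal frame of frame_size
 * bytes below the interrupted sp, rounding down so the result stays
 * aligned even if sp or frame_size is not a multiple of STACK_ALIGN. */
uint32_t place_signal_frame(uint32_t sp, uint32_t frame_size)
{
    assert(frame_size <= sp);
    return (sp - frame_size) & ~(STACK_ALIGN - 1u);
}
```

For instance, place_signal_frame(0xffffdc90, 1460) yields 0xffffd6d0, where 1460 is just the address span reported in this thread, used here as a stand-in frame size.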

- Justin
On Jan 31, 2016, at 6:41 PM, Mark Millard wrote:

> I have submitted Bug 206810 for this 11.0-CURRENT/clang380-import stack alignment problem for TARGET_ARCH=powerpc signal delivery.
>
> ===
> Mark Millard
> markmi at dsl-only.net
>
> On 2016-Jan-31, at 6:08 AM, Roman Divacky <rdivacky at vlakno.cz> wrote:
>
> Fwiw, LLVM expects a 16B aligned stack on PowerPC.
>
> On Sun, Jan 31, 2016 at 05:55:20AM -0800, Mark Millard wrote:
>> 3 quick FreeBSD for powerpc (32-bit) questions:
>>
>>
>> A) For PowerPC (32-bit) what is the stack alignment requirement by the ABI(s) that FreeBSD targets?
>>
>> B) Are signal handlers supposed to be given that alignment?
>>
>>
>> I ask because signal handlers are at times being given just 4-byte alignment but clang 3.8.0 powerpc's code generation can depend on the alignment being more than 4.
>>
>> clang 3.8.0 can calculate addresses by, for example, masking in a 0x4 relative to what would need to be an aligned address with alignment 8 or more instead of adding 0x4 to a more arbitrary address.
>>
>> So far I've only seen less than 8 byte stack alignment via signal handler activity.
>>
>>
>> C) Which should be blamed for problems here: clang's code generation, FreeBSD's stack alignment handling for signals, or both?
>>
>> ===
>> Mark Millard
>> markmi at dsl-only.net
>>
>> _______________________________________________
>> freebsd-toolchain@freebsd.org mailing list
>> https://lists.freebsd.org/mailman/listinfo/freebsd-toolchain
>> To unsubscribe, send any mail to "freebsd-toolchain-unsubscribe@freebsd.org"
>







