Skip site navigation (1)Skip section navigation (2)
Date:      Sun, 14 Feb 2016 14:34:30 -0800
From:      Mark Millard <markmi@dsl-only.net>
To:        Roman Divacky <rdivacky@vlakno.cz>
Cc:        FreeBSD PowerPC ML <freebsd-ppc@freebsd.org>, FreeBSD Toolchain <freebsd-toolchain@freebsd.org>
Subject:   Re: I've submitted 207175 for a clang 3.8.0 va_list handling problem for powerpc
Message-ID:  <70B405C4-E1AC-4F35-9786-051FDA2F8BE7@dsl-only.net>
In-Reply-To: <20160214192903.GA96697@vlakno.cz>
References:  <F6846682-10F7-4D0D-A691-ED8D4366805C@dsl-only.net> <20160214192903.GA96697@vlakno.cz>

next in thread | previous in thread | raw e-mail | index | archive | help
On 2016-Feb-14, at 11:29 AM, Roman Divacky <rdivacky@vlakno.cz> wrote:
>=20
> Fwiw, the code to handle the vaarg is in=20
> tools/clang/lib/CodeGen/TargetInfo.cpp:PPC32_SVR4_ABIInfo::EmitVAArg()
>=20
> You can take a look to see whats wrong.
>=20
> On Sat, Feb 13, 2016 at 07:03:29PM -0800, Mark Millard wrote:
>> I've isolated another clang 3.8.0 TARGET_ARCH=3Dpowerpc SEGV problem =
that shows up for using clang 3.8.0 to buildworld/installworld for =
powerpc.
>>=20
>>> ls -l -n /
>>=20
>> gets a SEGV. As listed in =
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=3D207175 ( and  =
https://llvm.org/bugs/show_bug.cgi?id=3D26605 ) the following simplified =
program also gets the SEGV on powerpc:
>>=20
>>> #include <stdarg.h> // for va_list, va_start, va_arg, va_end
>>> #include <stdint.h> // for intmax_t
>>>=20
>>> intmax_t
>>> va_test (char *s, ...)
>>> {
>>>    va_list vap;
>>>=20
>>>    va_start(vap, s);
>>>=20
>>>    char*        t0 =3D va_arg(vap, char*);
>>>    unsigned int o0 =3D va_arg(vap, unsigned int);
>>>    int          c0 =3D va_arg(vap, int);
>>>    unsigned int u0 =3D va_arg(vap, unsigned int);
>>>    int          c1 =3D va_arg(vap, int);
>>>    char *       t1 =3D va_arg(vap, char*);
>>>=20
>>>    intmax_t     j0 =3D va_arg(vap, intmax_t); // This spans into =
overflow_arg_area.
>>>=20
>>>    int          c2 =3D va_arg(vap, int);      // A copy was put in =
the=20
>>>                                             // overflow_arg_area =
because of the
>>>                                             // above.
>>>                                             // But this tries to =
extract from the
>>>                                             // last 4 bytes of the =
reg_save_area.
>>>                                             // It does not increment =
the
>>>                                             // overflow_arg_area =
position pointer
>>>                                             // past the copy that is =
there.
>>>=20
>>>    char *       t2 =3D va_arg(vap, char*);    // The lack of =
increment before makes
>>>                                             // this extraction off =
by 4 bytes.
>>>=20
>>>    char         t2fc =3D *t2;  // <<< This gets SEGV. t2 actually =
got what should be
>>>                              //     the c2 value.
>>>=20
>>>    intmax_t     j1 =3D va_arg(vap, intmax_t);
>>>=20
>>>    va_end(vap);
>>>=20
>>>    return (intmax_t) ((s-t2)+(t0-t1)+o0+u0+j0+j1+c0+c1+c2+t2fc);
>>>    // Avoid any optimize-away for lack of use.
>>> }
>>>=20
>>> int main(void)
>>> {
>>>    char         s[1025] =3D "test string for this";
>>>=20
>>>    char*        t0 =3D s + 5;
>>>    unsigned int o0 =3D 3;
>>>    int          c0 =3D 1;
>>>    unsigned int u0 =3D 1;
>>>    int          c1 =3D 3;
>>>    char *       t1 =3D s + 12;
>>>    intmax_t     j0 =3D 314159265358979323;
>>>    int          c2 =3D 4;
>>>    char *       t2 =3D s + 16;
>>>    intmax_t     j1 =3D ~314159265358979323;
>>>=20
>>>    intmax_t      result =3D =
va_test(s,t0,o0,c0,u0,c1,t1,j0,c1,t2,j1);
>>>=20
>>>    return (int) (result - (intmax_t) =
((s-t2)+(t0-t1)+o0+u0+j0+j1+c0+c1+c2+*t2));
>>>    // Avoid any optimize-away for lack of use.
>>> }
>>=20
>>=20
>>=20
>> =3D=3D=3D
>> Mark Millard
>> markmi at dsl-only.net
>>=20
>> _______________________________________________
>> freebsd-toolchain@freebsd.org mailing list
>> https://lists.freebsd.org/mailman/listinfo/freebsd-toolchain
>> To unsubscribe, send any mail to =
"freebsd-toolchain-unsubscribe@freebsd.org"

clang's code base is not familiar material for me nor do I have solid =
reference material for the FreeBSD TARGET_ARCH=3Dpowerpc ABI rules so =
the below has my guess work involved.

The following code appears to have hard wired a global, unvarying =
constant (8) into the test for picking UsingRegs vs. UsingOverflow.


>   llvm::Value *NumRegs =3D Builder.CreateLoad(NumRegsAddr, =
"numUsedRegs");
. . .
>   llvm::Value *CC =3D
>       Builder.CreateICmpULT(NumRegs, Builder.getInt8(8), "cond");
>=20
>   llvm::BasicBlock *UsingRegs =3D CGF.createBasicBlock("using_regs");
>   llvm::BasicBlock *UsingOverflow =3D =
CGF.createBasicBlock("using_overflow");
>   llvm::BasicBlock *Cont =3D CGF.createBasicBlock("cont");
>=20
>   Builder.CreateCondBr(CC, UsingRegs, UsingOverflow);
. . .
>   // Case 1: consume registers.
>   Address RegAddr =3D Address::invalid();
>   {
. . .
>     // Increase the used-register count.
>     NumRegs =3D
>       Builder.CreateAdd(NumRegs,
>                         Builder.getInt8((isI64 || (isF64 && =
IsSoftFloatABI)) ? 2 : 1));
>     Builder.CreateStore(NumRegs, NumRegsAddr);. . .
. . .
>   }
>=20
>   // Case 2: consume space in the overflow area.
>   Address MemAddr =3D Address::invalid();
>   {
. . . (no adjustments to NumRegs) . . .

If so the means of counting NumRegs (a.k.a. gpr) then needs to take into =
account an allocated but unused last UsingRegs "slot" sometimes. =
Imagine. . .

r3, r4, r5, r6, r7, r8, r9 in use already so r10 is the last possible =
"UsingRegs" context.
(0  1   2   3   4   5   6, leaving r10 as position 7, the last < 8 =
value)

Then the next two arguments are a 8 byte integer then a a 4 byte integer =
(in that order). That results in what should be:

r10 "UsingRegs" slot reserved and un-accessed
In other words: counted as allocated so that the rest goes in in the =
overflow area
(so no position 7 usage)

then

overflow with the 8 byte integer then the 4 byte integer.


And, in fact, the memory content reflects this in the overflow area.


But the va_arg access code does not count r10's slot as allocated in =
"Using Regs" after the 8 byte integer. So later it tries to use r10's =
slot for the 4 byte integer that is actually in the UsingOverflow area.

One fix of sorts is to have "Case 2: consume space in the overflow =
area." set NumRegs (a.k.a. gpr) to the bound from the =
Builder.CreateICmpULT (8 in this context). Then the first (or any/every) =
use of the UsingOverflow area forces no more use of the UsingRegs area =
(for the involved va_list).



=3D=3D=3D
Mark Millard
markmi at dsl-only.net




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?70B405C4-E1AC-4F35-9786-051FDA2F8BE7>