From: Marius Strobl
To: Peter Jeremy
Cc: freebsd-sparc64@freebsd.org
Date: Tue, 9 Mar 2010 21:50:48 +0100
Subject: Re: gcc code generation problems

On Tue, Mar 09, 2010 at 09:27:54PM +1100, Peter Jeremy wrote:
> On 2010-Mar-09 06:03:01 +1100, Peter Jeremy wrote:
> >code works.
> >The "UltraSPARC IIIi Processor User's Manual" indicates
> >that fxtod will trap for operands >= 2^51 (fdtox will trap for
> >operands >= 2^53 and is therefore executed in hardware) which seems to
> >point the finger at the FP emulation code in the userland trap handler.
>
> Attached is a block of assembler that loads all 32 double FP registers
> with known non-zero values, executes a fxtod and then prints the
> values of all FP registers. This shows that the register following
> the emulated fxtod target is incorrectly zeroed. In the attached, the
> target is %f10 but I've also tried %f8 - which zeroed %f10. The three
> columns are: FP register number, value before fxtod, value after fxtod.
>
> Typical output on my USIIIi:
>  0 0xfddb757dbd5b7ddf 0xfddb757dbd5b7ddf
>  2 0xfbb6eafb7ab6fbbf 0xfbb6eafb7ab6fbbf
>  4 0xf76dd5f6f56df77f 0xf76dd5f6f56df77f
>  6 0xeedbabedeadbeeff 0xeedbabedeadbeeff
>  8 0xddb757dbd5b7ddff 0xddb757dbd5b7ddff
> 10 0xbb6eafb7ab6fbbff 0x432ffffffffffffe
> 12 0x76dd5f6f56df77ff 0x0000000000000000
> 14 0x000fffffffffffff 0x000fffffffffffff
> 16 0xdb757dbd5b7ddffd 0xdb757dbd5b7ddffd
> 18 0xb6eafb7ab6fbbffb 0xb6eafb7ab6fbbffb
> 20 0x6dd5f6f56df77ff7 0x6dd5f6f56df77ff7
> 22 0xdbabedeadbeeffee 0xdbabedeadbeeffee
> 24 0xb757dbd5b7ddffdd 0xb757dbd5b7ddffdd
> 26 0x6eafb7ab6fbbffbb 0x6eafb7ab6fbbffbb
> 28 0xdd5f6f56df77ff76 0xdd5f6f56df77ff76
> 30 0xbabedeadbeeffeed 0xbabedeadbeeffeed
> 32 0x757dbd5b7ddffddb 0x757dbd5b7ddffddb
> 34 0xeafb7ab6fbbffbb6 0xeafb7ab6fbbffbb6
> 36 0xd5f6f56df77ff76d 0xd5f6f56df77ff76d
> 38 0xabedeadbeeffeedb 0xabedeadbeeffeedb
> 40 0x57dbd5b7ddffddb7 0x57dbd5b7ddffddb7
> 42 0xafb7ab6fbbffbb6e 0xafb7ab6fbbffbb6e
> 44 0x5f6f56df77ff76dd 0x5f6f56df77ff76dd
> 46 0xbedeadbeeffeedba 0xbedeadbeeffeedba
> 48 0x7dbd5b7ddffddb75 0x7dbd5b7ddffddb75
> 50 0xfb7ab6fbbffbb6ea 0xfb7ab6fbbffbb6ea
> 52 0xf6f56df77ff76dd5 0xf6f56df77ff76dd5
> 54 0xedeadbeeffeedbab 0xedeadbeeffeedbab
> 56 0xdbd5b7ddffddb757 0xdbd5b7ddffddb757
> 58 0xb7ab6fbbffbb6eaf 0xb7ab6fbbffbb6eaf
> 60 0x6f56df77ff76dd5f 0x6f56df77ff76dd5f
> 62 0xdeadbeeffeedbabe 0xdeadbeeffeedbabe
>
> >I am still looking into the emulation code.
>
> I haven't found the above bug yet but I have found two other bugs in
> the FP register decoding in __fpu_execute().
>
> Firstly, for fxto{s,d,q} decoding, rs2 is set generically using
> RN_DECODE() with 'type' == 0 (because the low 2 bits are always 0). In
> this case, RN_DECODE() will assume a 32-bit rs2, whereas the SPARC
> architecture manual specifies that fxto{s,d,q} has a 64-bit rs2.
> The effect is that using a source register in the upper half will
> alias to the lower half.
>
> The rd value used in f{s,d,q}tox suffers from the same problem.
>

Apparently you're right about these. What do you think about the
following patch?
http://people.freebsd.org/~marius/fpu.c.diff

Marius
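
For readers following the rs2/rd aliasing described above, here is a minimal sketch of how a 5-bit SPARC V9 FP register field maps to a register number. It is not the RN_DECODE() macro from fpu.c and not the linked patch; decode_fp_reg() and the fp_width enum are hypothetical names made up for illustration. It assumes the V9 encoding in which, for 64-bit operands, the lowest bit of the field carries bit 5 of the register number.

```c
#include <assert.h>

/* Hypothetical operand widths; names are illustrative only. */
enum fp_width { FP_SINGLE, FP_DOUBLE };

/*
 * Hypothetical helper, NOT the RN_DECODE() macro from fpu.c.
 * Maps a 5-bit instruction field to an FP register number.
 */
static unsigned
decode_fp_reg(unsigned field, enum fp_width width)
{
	assert(field < 32);	/* the instruction field is 5 bits wide */

	if (width == FP_SINGLE)
		return (field);	/* %f0..%f31, used as-is */

	/*
	 * 64-bit operand: bits 4..1 of the field are bits 4..1 of the
	 * register number, and bit 0 of the field is bit 5 of it.
	 */
	return ((field & 0x1e) | ((field & 1) << 5));
}
```

With this mapping, a field value of 3 names %f34 when read as a 64-bit operand, but %f3 when it is misread as a 32-bit one, which is the upper-half-to-lower-half aliasing Peter describes.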