Date: Mon, 10 May 2004 17:37:41 +0200
From: Stefan Farfeleder <stefanf@FreeBSD.org>
To: Bruce Evans <bde@zeta.org.au>
Cc: freebsd-standards@freebsd.org
Subject: Re: Fixing ilogb()
Message-ID: <20040510153731.GP29712@wombat.fafoe.narf.at>
In-Reply-To: <20040509201148.P8241@gamplex.bde.org>
References: <20040508194748.GN29712@wombat.fafoe.narf.at> <20040508225838.GA15663@VARK.homeunix.com> <20040509201148.P8241@gamplex.bde.org>

--8tUgZ4IE8L4vmMyh
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Sun, May 09, 2004 at 09:44:03PM +1000, Bruce Evans wrote:

> % Index: src/lib/msun/i387/s_ilogb.S
> % ===================================================================
> % RCS file: /usr/home/ncvs/src/lib/msun/i387/s_ilogb.S,v
> % retrieving revision 1.8
> % diff -u -r1.8 s_ilogb.S
> % --- src/lib/msun/i387/s_ilogb.S	6 Jun 2000 12:12:36 -0000	1.8
> % +++ src/lib/msun/i387/s_ilogb.S	8 May 2004 18:57:27 -0000
> % @@ -43,11 +43,27 @@
> %  	subl	$4,%esp
> %
> %  	fldl	8(%ebp)
> % +	fxam
> % +	fnstsw	%ax
> % +	sahf
>
> This is the main runtime overhead.  I think it can mostly be avoided by
> checking the return value.  ilogb() can only be INT_MIN after overflow
> or other conversion errors (check this).  There are 3 cases:
> - logb(0) = -Inf; fistpl(-Inf) = INT_MIN + IE
> - logb(Inf) = Inf; fistpl(Inf) = INT_MIN + IE
> - logb(NaN) = same NaN; fistpl(NaN) = INT_MIN + IE
> After finding one of these rare cases, the exact case still needs to be
> determined by looking at the original value or the result of fxtract.
> Then fucom with 0 should be a faster way to do the classification.  A
> full classification is not needed since denormals are not special here
> and unsupported formats are unsupported here too.

Thanks for your comments, Bruce.  A revised patch is attached; it adds
only two instructions in the common case.

<snip>

> % +	movl	$0x7fffffff,%eax	/* FP_ILOGBNAN, INT_MAX */
>
> Style bugs: some comments could be improved, but won't be needed when
> <machine/_limits.h> is used.

Unfortunately <math.h> cannot be included for FP_ILOGB{0,NAN}, so I've
duplicated those #defines for now.  Any suggestions?  Should they move
to another (or a new) header instead?

Cheers,
Stefan

--8tUgZ4IE8L4vmMyh
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="s_ilogb.S.diff"

Index: src/lib/msun/i387/s_ilogb.S
===================================================================
RCS file: /usr/home/ncvs/src/lib/msun/i387/s_ilogb.S,v
retrieving revision 1.8
diff -I.svn -u -r1.8 s_ilogb.S
--- src/lib/msun/i387/s_ilogb.S	6 Jun 2000 12:12:36 -0000	1.8
+++ src/lib/msun/i387/s_ilogb.S	10 May 2004 15:24:43 -0000
@@ -33,10 +33,15 @@
  * J.T. Conklin (jtc@wimsey.com), Winning Strategies, Inc.
  */
 
+#include <machine/_limits.h>
 #include <machine/asm.h>
 
 RCSID("$FreeBSD: src/lib/msun/i387/s_ilogb.S,v 1.8 2000/06/06 12:12:36 bde Exp $")
 
+#define	FP_ILOGB0	(-__INT_MAX)
+#define	FP_ILOGBNAN	__INT_MAX
+#define	FP_ILOGBINF	__INT_MAX
+
 ENTRY(ilogb)
 	pushl	%ebp
 	movl	%esp,%ebp
@@ -44,10 +49,35 @@
 
 	fldl	8(%ebp)
 	fxtract
-	fstp	%st
+	fstp	%st(0)
 
 	fistpl	-4(%ebp)
 	movl	-4(%ebp),%eax
 
+	/* fistpl yields __INT_MIN for NaN, Inf and 0. */
+	cmpl	$__INT_MIN,%eax
+	je	.L2
+
+.L1:
 	leave
 	ret
+
+.L2:
+	fldl	8(%ebp)
+	fldz
+	fucompp
+	fnstsw	%ax
+	sahf
+	jp	.L3
+	jz	.L4
+
+	movl	$FP_ILOGBINF,%eax
+	jmp	.L1
+
+.L3:
+	movl	$FP_ILOGBNAN,%eax
+	jmp	.L1
+
+.L4:
+	movl	$FP_ILOGB0,%eax
+	jmp	.L1

--8tUgZ4IE8L4vmMyh--
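
For anyone who finds the control flow easier to follow in C, the approach
taken by the patch can be modelled roughly as below.  This is only a
sketch, not the committed code.  The names sketch_ilogb(), raw_exponent()
and the SKETCH_* macros are made up for illustration.  raw_exponent()
stands in for the fxtract + fistpl pair and simply assumes, as stated in
the review quoted above, that the conversion yields INT_MIN for 0, Inf
and NaN.  The point it illustrates is that the common case pays only for
one compare and one untaken branch, and classification happens only on
the rare path.

```c
#include <assert.h>
#include <limits.h>
#include <math.h>

/* Hypothetical stand-ins for the FP_ILOGB* values defined in the patch. */
#define SKETCH_ILOGB0   (-INT_MAX)
#define SKETCH_ILOGBNAN INT_MAX
#define SKETCH_ILOGBINF INT_MAX

/*
 * Model of fxtract + fistpl.  Assumption: the x87 conversion stores
 * INT_MIN whenever the exponent is -Inf, +Inf or NaN, i.e. for an
 * input of 0, Inf or NaN.
 */
static int raw_exponent(double x)
{
	double m;
	int e = 0;

	/* NaN (x != x), zero, or infinity (Inf - Inf is NaN). */
	if (x != x || x == 0.0 || x - x != 0.0)
		return INT_MIN;

	m = x < 0.0 ? -x : x;
	/* Scaling by 2 is exact, so this normalises m into [1, 2). */
	while (m >= 2.0) {
		m /= 2.0;
		e++;
	}
	while (m < 1.0) {
		m *= 2.0;
		e--;
	}
	return e;
}

int sketch_ilogb(double x)
{
	int e = raw_exponent(x);

	if (e != INT_MIN)		/* common case: cmpl + untaken je */
		return e;
	if (x != x)			/* unordered compare with 0: jp */
		return SKETCH_ILOGBNAN;
	if (x == 0.0)			/* equal to 0: jz */
		return SKETCH_ILOGB0;
	return SKETCH_ILOGBINF;		/* only Inf is left */
}

/* Small usage example covering the three special cases. */
void sketch_ilogb_selfcheck(void)
{
	assert(sketch_ilogb(8.0) == 3);
	assert(sketch_ilogb(0.0) == SKETCH_ILOGB0);
	assert(sketch_ilogb(INFINITY) == SKETCH_ILOGBINF);
	assert(sketch_ilogb(NAN) == SKETCH_ILOGBNAN);
}
```

Testing NaN before zero mirrors the jp-before-jz order in the patch: an
unordered fucompp sets the parity and zero flags together, so the parity
check has to come first.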