Date: Thu, 7 Feb 2008 03:17:05 +0000 (UTC)
From: Bruce Evans <bde@FreeBSD.org>
To: src-committers@FreeBSD.org, cvs-src@FreeBSD.org, cvs-all@FreeBSD.org
Subject: cvs commit: src/lib/msun/ld128 s_exp2l.c src/lib/msun/ld80 s_exp2l.c src/lib/msun/src e_exp.c e_expf.c s_exp2.c s_exp2f.c
Message-ID: <200802070317.m173H5Ts079831@repoman.freebsd.org>
bde 2008-02-07 03:17:05 UTC
FreeBSD src repository
Modified files:
lib/msun/ld128 s_exp2l.c
lib/msun/ld80 s_exp2l.c
lib/msun/src e_exp.c e_expf.c s_exp2.c s_exp2f.c
Log:
Use a better method of scaling by 2**k. Instead of adding to the
exponent bits of the reduced result, construct 2**k (hopefully in
parallel with the construction of the reduced result) and multiply by
it. This tends to be much faster if the construction of 2**k is
actually in parallel, and might be faster even with no parallelism
since adjustment of the exponent requires a read-modify-write at an
unfortunate time for pipelines.
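
For illustration only, the two scaling methods contrasted above might be sketched as follows. This is not the committed code: the function names scale_by_exponent_add and scale_by_multiply are hypothetical, the sketch assumes IEEE 754 binary64 doubles, and it handles only normal inputs and results (no subnormals, overflow, infinities, or NaNs).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Exponent-adjustment style: add k to the exponent field of the reduced
 * result r. The value has to be moved to the integer side, adjusted, and
 * moved back, which corresponds to the read-modify-write in the log.
 * Hypothetical helper; assumes r is normal and the result stays normal.
 */
static double
scale_by_exponent_add(double r, int k)
{
	uint64_t bits;
	int e;

	memcpy(&bits, &r, sizeof(bits));
	e = (int)((bits >> 52) & 0x7ff);	/* biased exponent of r */
	assert(e != 0 && e != 0x7ff);		/* r is normal and finite */
	assert(e + k >= 1 && e + k <= 2046);	/* result stays normal */
	bits += (uint64_t)(int64_t)k << 52;	/* wraps correctly for k < 0 */
	memcpy(&r, &bits, sizeof(r));
	return (r);
}

/*
 * Multiplication style: build 2**k from k alone, then multiply. Building
 * twopk does not depend on r, so it can overlap with the computation of r.
 * Hypothetical helper; assumes 2**k is a normal double.
 */
static double
scale_by_multiply(double r, int k)
{
	uint64_t bits;
	double twopk;

	assert(k >= -1022 && k <= 1023);	/* 2**k must be normal */
	bits = (uint64_t)(0x3ff + k) << 52;	/* exponent field, zero mantissa */
	memcpy(&twopk, &bits, sizeof(twopk));
	return (r * twopk);
}
```

In the sketch, only the final multiplication in scale_by_multiply depends on r, whereas every step of scale_by_exponent_add has to wait for r to be complete.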
In some cases involving exp2* on amd64 (A64), this change saves about
40 cycles or 30%. I think it is inherently only about 12 cycles faster
in these cases and the rest of the speedup is from partly-accidentally
avoiding compiler pessimizations (the construction of 2**k is now
manually scheduled for good results, and -O2 doesn't always mess this
up). In most cases on amd64 (A64) and i386 (A64) the speedup is about
20 cycles. The worst case that I found is expf on ia64 where this
change is a pessimization of about 10 cycles or 5%. The manual
scheduling for plain exp[f] is harder and not as tuned.
The change to ld128/s_exp2l.c has not been tested.
Revision  Changes    Path
1.2       +15 -11    src/lib/msun/ld128/s_exp2l.c
1.2       +14 -11    src/lib/msun/ld80/s_exp2l.c
1.11      +8 -9      src/lib/msun/src/e_exp.c
1.13      +8 -9      src/lib/msun/src/e_expf.c
1.5       +9 -9      src/lib/msun/src/s_exp2.c
1.5       +3 -6      src/lib/msun/src/s_exp2f.c
