Date: Sun, 30 Apr 2017 05:09:26 +1000 (EST) From: Bruce Evans <brde@optusnet.com.au> To: Steve Kargl <sgk@troutmask.apl.washington.edu> Cc: freebsd-hackers@freebsd.org, freebsd-numerics@freebsd.org Subject: Re: Implementation of half-cycle trignometric functions Message-ID: <20170430042756.A862@besplex.bde.org> In-Reply-To: <20170429181022.GA41420@troutmask.apl.washington.edu> References: <20170427231411.GA11346@troutmask.apl.washington.edu> <20170428010122.GA12814@troutmask.apl.washington.edu> <20170428183733.V1497@besplex.bde.org> <20170428165658.GA17560@troutmask.apl.washington.edu> <20170429035131.E3406@besplex.bde.org> <20170428201522.GA32785@troutmask.apl.washington.edu> <20170429070036.A4005@besplex.bde.org> <20170428233552.GA34580@troutmask.apl.washington.edu> <20170429005924.GA37947@troutmask.apl.washington.edu> <20170429151457.F809@besplex.bde.org> <20170429181022.GA41420@troutmask.apl.washington.edu>
On Sat, 29 Apr 2017, Steve Kargl wrote:

> On Sat, Apr 29, 2017 at 05:54:21PM +1000, Bruce Evans wrote:
>> On Fri, 28 Apr 2017, Steve Kargl wrote:
> ...
>>> 	GET_FLOAT_WORD(ix, p);
>>> 	SET_FLOAT_WORD(phi, (ix >> 14) << 14);
>>>
>>> 	GET_FLOAT_WORD(ix, x2);
>>> 	SET_FLOAT_WORD(x2hi, (ix >> 14) << 14);
>>
>> I expect that these GET/SET's are the slowest part.  They are quite fast
>> in float prec, but in double prec on old i386 CPUs compilers generate bad
>> code which can have penalties of 20 cycles per GET/SET.
>>
>> Why the strange reduction?  The double shift is just a manual optimization
>> or pessimization (usually the latter) for clearing low bits.  Here it is
>> used to clear 14 low bits instead of the usual 12.  This is normally
>> written using just a mask of 0xffff0000, unless you want a different
>> number of bits in the hi terms for technical reasons.  Double precision
>> can benefit more from asymmetric splitting of terms since 53 is not
>> divisible by 2; 1 hi term must have less than 26.5 bits and the other
>> term can hold an extra bit.
>
> Because I didn't think about using a mask. :-)
>
> It's easy to change 14 to 13 or 11 or ..., while I would
> need to write out zeros and ones to come up with 0xffff8000,
> etc.

Here are some examples of more delicate splittings from the uncommitted
clog*().  They are usually faster than GET/SET, but slower than converting
to lower precision, as is often possible for double precision and ld128
only.  clog*() can't use the casting method since it needs to split in the
middle, and doesn't use GET/SET since it is slow.  It uses methods that
only work on args that are not too large or too small, and uses a GET
earlier to classify the arg size.

XX double prec:
XX 	double_t ax, axh, axl, t;
XX
XX 	t = (double)(ax * (0x1p27 + 1));
XX 	axh = (double)(ax - t) + t;
XX 	axl = ax - axh;

Splitting in the middle is required to calculate ax*ax exactly.
Since the middle bit is 26.5, splitting exactly in the middle is
impossible, but there are tricks to make splitting at bit 26 or 27 work
just as well.  Multiplying by 0x1p27 shifts to a place where the lower
bits are canceled by the subtraction.  Adding 1 is a trick to create an
extra (virtual) bit needed for the exact multiplication.  At least ax
needs to have the extra-precision type double_t so that casting to double
works.  Otherwise, this code would need STRICT_ASSIGN() to work around
compiler bugfeatures.  We assign back to double_t for efficiency.

XX float prec:
XX 	float_t ax, axh, axl, t;
XX
XX 	t = (float)(ax * (0x1p12F + 1));
XX 	axh = (float)(ax - t) + t;
XX 	axl = ax - axh;

This is simpler because there is a middle bit.  I think adding 1 is not
needed.

XX long double prec:
XX #if LDBL_MANT_DIG == 64
XX #define	MULT_REDUX	0x1p32		/* exponent MANT_DIG / 2 rounded up */
XX #elif LDBL_MANT_DIG == 113
XX #define	MULT_REDUX	0x1p57
XX #else
XX #error "Unsupported long double format"
XX #endif
XX
XX 	long double ax, axh, axl, t;
XX
XX 	/* Calculate ax*ax and ay*ay exactly using Dekker's algorithm. */
XX 	t = (long double)(ax * (MULT_REDUX + 1));
XX 	axh = (long double)(ax - t) + t;
XX 	axl = ax - axh;

This is less simple because the middle depends on a parameter and is
fractional in some cases.  If adding 1 is not needed in the
non-fractional case, then it should be added to MULT_REDUX in the
fractional case only.

Several functions including expl and e_rem_pio* use this REDUX method
with the parameter hard-coded in a different way, usually with a slower
STRICT_ASSIGN() and a special conversion to an integer or an integer
divided by a power of 2.  The above is supposed to be more careful with
extra precision so that STRICT_ASSIGN() is not needed.

The calculation of axh may have problems with double rounding, but I
think this doesn't matter here -- whatever axh is, the rest ends up in
axl and we just need to ensure that both have 27 lower bits zero.

Bruce