From owner-freebsd-numerics@FreeBSD.ORG Tue Aug 27 18:47:08 2013 Return-Path: Delivered-To: freebsd-numerics@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 83960B72 for ; Tue, 27 Aug 2013 18:47:08 +0000 (UTC) (envelope-from sgk@troutmask.apl.washington.edu) Received: from troutmask.apl.washington.edu (troutmask.apl.washington.edu [128.95.76.21]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 594D92B3E for ; Tue, 27 Aug 2013 18:47:08 +0000 (UTC) Received: from troutmask.apl.washington.edu (localhost.apl.washington.edu [127.0.0.1]) by troutmask.apl.washington.edu (8.14.7/8.14.7) with ESMTP id r7RIl2QI076154; Tue, 27 Aug 2013 11:47:02 -0700 (PDT) (envelope-from sgk@troutmask.apl.washington.edu) Received: (from sgk@localhost) by troutmask.apl.washington.edu (8.14.7/8.14.7/Submit) id r7RIl10g076153; Tue, 27 Aug 2013 11:47:01 -0700 (PDT) (envelope-from sgk) Date: Tue, 27 Aug 2013 11:47:01 -0700 From: Steve Kargl To: Bruce Evans Subject: Re: (2nd time) tweaks to erff() threshold values Message-ID: <20130827184701.GA76013@troutmask.apl.washington.edu> References: <20130822213315.GA6708@troutmask.apl.washington.edu> <20130823202257.Q1593@besplex.bde.org> <20130823165744.GA47369@troutmask.apl.washington.edu> <20130824202102.GA54981@troutmask.apl.washington.edu> <20130825023029.GA56772@troutmask.apl.washington.edu> <20130825171910.GA60396@troutmask.apl.washington.edu> <20130827003147.R2328@besplex.bde.org> <20130826174307.GA67511@troutmask.apl.washington.edu> <20130827202751.X1787@besplex.bde.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20130827202751.X1787@besplex.bde.org> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-numerics@FreeBSD.org X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 27 Aug 2013 18:47:08 -0000 On Tue, Aug 27, 2013 at 09:24:39PM +1000, Bruce Evans wrote: > On Mon, 26 Aug 2013, Steve Kargl wrote: > > > erfc (Look at the old and new ULP for 3rd and 4th interval. This is due > > to the change in mask from 0xfffff000 to 0xffffe000.) > > A very interesting fix. Well, I stumped across this fix. See below. > Apparently the splitting point 0.84375 is not very well chosen, since > we get largest errors just before it and much smaller errors above it > by using another algorithm. The only hint about the choice of 0.84375 is from the comment int s_erf.c. * is close to one. The interval is chosen because the fix * point of erf(x) is near 0.6174 (i.e., erf(x)=x when x is * near 0.6174), and by some experiment, 0.84375 is chosen to * guarantee the error is less than one ulp for erf. I haven't investigated whether 0.84375 is best choice. However, exhaustive testing in [0.6, 0.62] gives a max ulp of 0.666. > > /* > > - * Coefficients for approximation to erf in [0.84375,1.25] > > + * Domain [0, 0.84375], range ~[-5.4446e-10,5.5197e-10]: > > + * |(erf(x) - x)/x - p(x)/q(x)| < 2**-31. > > + */ > > Extra spaces. Fixed. > > ... > > /* > > - * Coefficients for approximation to erfc in [1.25,1/0.35] > > + * Domain [1.25,1/0.35], range [-7.043e-10,7.457e-10]: > > + * |log(x*erfc(x)) + x**2 + 0.5625 - r(x)/s(x)| < 2**-30 > > */ > > It is even less clear than for the others that the scaling of the error > is correct. The comment could make this clearer by giving the ratio of > the correct error scale with the one used in the approximation, at the > endpoints (max and min for this, usually at the endpoints). I'm not sure what you mean here. In looking at error curves, it turns out that one of the 2 limits in the range is taken from an endpoint. > > +sa4 = -5.77397496e-02F, /* -0x1.d90108p-5 */ > > > /* > > - * Coefficients for approximation to erfc in [1/.35,28] > > + * Domain [1.25,1/0.35], range [-3.753e-12,3.875e-12]: > > + * |log(x*erfc(x)) + x**2 + 0.5625 - r(x)/s(x)| < 2**-43 > > */ > > Domain needs more editing. Fixed. > > - if (ix < 0x41e00000) { /* |x|<28 */ > > + if (ix < 0x41200000) { /* |x|<10 */ > > 10 is too small. It gives spurious underflow to 0 and larger than necessary > errors, at least on i386. 11 works. I changed to 11. This required a new set of rb and sb coefficients as the old set was for [1/0.35, 10]. With the new set the max ulp for erfc goes up, but it is still much better that 63.something . ./testf -m 2.857143 -M 11 -E double Interval tested: [2.8571,11.0000] Max ULP: 2.92095 x Max ULP: 4.08856726e+00, 0x1.05ab16p+2 > > + R=rb0+s*(rb1+s*(rb2+s*(rb3+s*rb4))); > > + S=one+s*(sb1+s*(sb2+s*(sb3+s*sb4))); > > } > > GET_FLOAT_WORD(ix,x); > > This and the GET_FLOAT_WORD() for erff() are bogus. The word has already > been read into hx. Well spotted. Fixed. > > - SET_FLOAT_WORD(z,ix&0xfffff000); > > - r = __ieee754_expf(-z*z-(float)0.5625)* > > - __ieee754_expf((z-x)*(z+x)+R/S); > > + SET_FLOAT_WORD(z,ix&0xffffe000); > > Change further to use hx. Fixed. > > + r = expf(-z*z-0.5625F)*expf((z-x)*(z+x)+R/S); > > Good fix. Is the point that -z*z-0.5625F must be exact, but the old > splitting only made z*z exact? Is -z*z-0.5625F now always exact? > Anyway, the error for this case is now smaller than near 1.88, and > changing the expf()'s to exp()s doesn't reduce it much more, so the > old extra-precision methods can't be much better. Back to the choice of mask. I was trying to understand the origins of the constant 0.5625F. The asymptotic expansion of erfc(x) is (exp(-x**2) / (x*sqrt(pi))) * P(x) with P(x) being some polynomial. It turns out exp(-0.5625) = 0.56978 and 1/sqrt(pi) = 0.56419. So, the algorithm drops the 1/sqrt(pi) and approximates it by exp(-0.5625). Well, in reading and re-reading the comment in s_erf.c, this part caught my attention: * Note1: * To compute exp(-x*x-0.5625+R/S), let s be a single * precision number and s := x; then * -x*x = -s*s + (s-x)*(s+x) * exp(-x*x-0.5626+R/S) = * exp(-s*s-0.5625)*exp((s-x)*(s+x)+R/S); This applies to the double precision version, ie erf(), so I simply ask the question whether 0xfffff000 was adequate masking out the low order bits. Fortunately, I only needed to test 0xffff8000, 0xffffc0000, and 0xffffe000. -- Steve