From owner-freebsd-numerics@FreeBSD.ORG Wed Aug 28 11:31:30 2013 Return-Path: Delivered-To: freebsd-numerics@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 27EFBA88 for ; Wed, 28 Aug 2013 11:31:30 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail109.syd.optusnet.com.au (mail109.syd.optusnet.com.au [211.29.132.80]) by mx1.freebsd.org (Postfix) with ESMTP id A5A932FB1 for ; Wed, 28 Aug 2013 11:31:29 +0000 (UTC) Received: from c122-106-156-23.carlnfd1.nsw.optusnet.com.au (c122-106-156-23.carlnfd1.nsw.optusnet.com.au [122.106.156.23]) by mail109.syd.optusnet.com.au (Postfix) with ESMTPS id 00F10D631B8; Wed, 28 Aug 2013 21:31:22 +1000 (EST) Date: Wed, 28 Aug 2013 21:31:21 +1000 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Steve Kargl Subject: Re: (2nd time) tweaks to erff() threshold values In-Reply-To: <20130827184701.GA76013@troutmask.apl.washington.edu> Message-ID: <20130828205705.S1447@besplex.bde.org> References: <20130822213315.GA6708@troutmask.apl.washington.edu> <20130823202257.Q1593@besplex.bde.org> <20130823165744.GA47369@troutmask.apl.washington.edu> <20130824202102.GA54981@troutmask.apl.washington.edu> <20130825023029.GA56772@troutmask.apl.washington.edu> <20130825171910.GA60396@troutmask.apl.washington.edu> <20130827003147.R2328@besplex.bde.org> <20130826174307.GA67511@troutmask.apl.washington.edu> <20130827202751.X1787@besplex.bde.org> <20130827184701.GA76013@troutmask.apl.washington.edu> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Optus-CM-Score: 0 X-Optus-CM-Analysis: v=2.1 cv=DstvpgP+ c=1 sm=1 tr=0 a=ebeQFi2P/qHVC0Yw9JDJ4g==:117 a=PO7r1zJSAAAA:8 a=F6uwjYSw4mAA:10 a=kj9zAlcOel0A:10 a=JzwRw_2MAAAA:8 a=bRjWgPqadKYA:10 a=fiHkbJjbcZh7ihRK9ZwA:9 a=CjuIK1q_8ugA:10 Cc: freebsd-numerics@FreeBSD.org X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 28 Aug 2013 11:31:30 -0000 On Tue, 27 Aug 2013, Steve Kargl wrote: > On Tue, Aug 27, 2013 at 09:24:39PM +1000, Bruce Evans wrote: >> On Mon, 26 Aug 2013, Steve Kargl wrote: >>> ... >>> /* >>> - * Coefficients for approximation to erfc in [1.25,1/0.35] >>> + * Domain [1.25,1/0.35], range [-7.043e-10,7.457e-10]: >>> + * |log(x*erfc(x)) + x**2 + 0.5625 - r(x)/s(x)| < 2**-30 >>> */ >> >> It is even less clear than for the others that the scaling of the error >> is correct. The comment could make this clearer by giving the ratio of >> the correct error scale with the one used in the approximation, at the >> endpoints (max and min for this, usually at the endpoints). > > I'm not sure what you mean here. In looking at error curves, > it turns out that one of the 2 limits in the range is taken > from an endpoint. I mean that these are absolute errors for a function related to the result that is especially obscure in this case -- the function is log(x*result). The error that needs to be minimized is the relative error of the result, where "relative" is not quite the ratio of the error and the result -- it is the ratio of the error and [result rounded to a nearby power of 2; I'm not sure if the rounding is up or down]. For the absolute error to be close enough to the relative error, the relative error must not vary much across the interval, AND the scaling for the abslute error must be compatible. This is far from clear here. I think taking logs changes the scale for the absolute error to logarithmic, so this error is not really absolute. Hopefully the scale for the result is similarly logarithmic, so that the scales are compatible. There is also a factor of x before taking logs, so the scaling is not logarithmic either. >>> - if (ix < 0x41e00000) { /* |x|<28 */ >>> + if (ix < 0x41200000) { /* |x|<10 */ >> >> 10 is too small. It gives spurious underflow to 0 and larger than necessary >> errors, at least on i386. 11 works. > > I changed to 11. This required a new set of rb and sb coefficients > as the old set was for [1/0.35, 10]. With the new set the max ulp > for erfc goes up, but it is still much better that 63.something . Something wrong here. I got smaller errors by changing 10 to 11 without changing the coeffs. Now with updated coeffs, I get larger errors, at least on i386 (previously, there were no errors of more than 1.5 ulps above 1.88... Now there are some above 2.04). >>> + r = expf(-z*z-0.5625F)*expf((z-x)*(z+x)+R/S); >> >> Good fix. Is the point that -z*z-0.5625F must be exact, but the old >> splitting only made z*z exact? Is -z*z-0.5625F now always exact? >> Anyway, the error for this case is now smaller than near 1.88, and >> changing the expf()'s to exp()s doesn't reduce it much more, so the >> old extra-precision methods can't be much better. > > Back to the choice of mask. I was trying to understand > the origins of the constant 0.5625F. The asymptotic > expansion of erfc(x) is (exp(-x**2) / (x*sqrt(pi))) * P(x) > with P(x) being some polynomial. It turns out > exp(-0.5625) = 0.56978 and 1/sqrt(pi) = 0.56419. So, the > algorithm drops the 1/sqrt(pi) and approximates it by > exp(-0.5625). It is a hi+lo decomposition done multiplicatively and exponentially: 1/sqrt(pi) = exp(-0.5625) * exp(lo) where -0.5625 and lo are folded into other terms, and the hi part must be done exactly before exp() on it for enough accuracy. Taking exp() loses between 0.5 and 1 ulps of accuracy, and more is lost in the produce of the exp()'s. Hopefully the arg of the second exp() has a small error, so not much more accuracy is lost for it. > Well, in reading and re-reading the comment in s_erf.c, this > part caught my attention: > > * Note1: > * To compute exp(-x*x-0.5625+R/S), let s be a single > * precision number and s := x; then > * -x*x = -s*s + (s-x)*(s+x) > * exp(-x*x-0.5626+R/S) = The comment but not the code has this off-by-.0001 0.5626. > * exp(-s*s-0.5625)*exp((s-x)*(s+x)+R/S); > > This applies to the double precision version, ie erf(), so I > simply ask the question whether 0xfffff000 was adequate masking > out the low order bits. Fortunately, I only needed to test > 0xffff8000, 0xffffc0000, and 0xffffe000. I tried the others too, and they worked less well. But I haven't checked that -s*s-0.5625[F] is exact. In double precision, many more bits are stripped from x to create x, so exactness is less surprising. Bruce