From owner-freebsd-numerics@FreeBSD.ORG  Tue Aug 27 18:47:08 2013
Return-Path: <owner-freebsd-numerics@FreeBSD.ORG>
Delivered-To: freebsd-numerics@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTP id 83960B72
 for <freebsd-numerics@FreeBSD.org>; Tue, 27 Aug 2013 18:47:08 +0000 (UTC)
 (envelope-from sgk@troutmask.apl.washington.edu)
Received: from troutmask.apl.washington.edu (troutmask.apl.washington.edu
 [128.95.76.21])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by mx1.freebsd.org (Postfix) with ESMTPS id 594D92B3E
 for <freebsd-numerics@FreeBSD.org>; Tue, 27 Aug 2013 18:47:08 +0000 (UTC)
Received: from troutmask.apl.washington.edu (localhost.apl.washington.edu
 [127.0.0.1])
 by troutmask.apl.washington.edu (8.14.7/8.14.7) with ESMTP id r7RIl2QI076154; 
 Tue, 27 Aug 2013 11:47:02 -0700 (PDT)
 (envelope-from sgk@troutmask.apl.washington.edu)
Received: (from sgk@localhost)
 by troutmask.apl.washington.edu (8.14.7/8.14.7/Submit) id r7RIl10g076153;
 Tue, 27 Aug 2013 11:47:01 -0700 (PDT) (envelope-from sgk)
Date: Tue, 27 Aug 2013 11:47:01 -0700
From: Steve Kargl <sgk@troutmask.apl.washington.edu>
To: Bruce Evans <brde@optusnet.com.au>
Subject: Re: (2nd time) tweaks to erff() threshold values
Message-ID: <20130827184701.GA76013@troutmask.apl.washington.edu>
References: <20130822213315.GA6708@troutmask.apl.washington.edu>
 <20130823202257.Q1593@besplex.bde.org>
 <20130823165744.GA47369@troutmask.apl.washington.edu>
 <20130824202102.GA54981@troutmask.apl.washington.edu>
 <20130825023029.GA56772@troutmask.apl.washington.edu>
 <20130825171910.GA60396@troutmask.apl.washington.edu>
 <20130827003147.R2328@besplex.bde.org>
 <20130826174307.GA67511@troutmask.apl.washington.edu>
 <20130827202751.X1787@besplex.bde.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20130827202751.X1787@besplex.bde.org>
User-Agent: Mutt/1.5.21 (2010-09-15)
Cc: freebsd-numerics@FreeBSD.org
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
 <freebsd-numerics.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 27 Aug 2013 18:47:08 -0000

On Tue, Aug 27, 2013 at 09:24:39PM +1000, Bruce Evans wrote:
> On Mon, 26 Aug 2013, Steve Kargl wrote:
> 
> > erfc  (Look at the old and new ULP for 3rd and 4th interval.  This is due
> >       to the change in mask from 0xfffff000 to 0xffffe000.)
> 
> A very interesting fix.

Well, I stumped across this fix.  See below.

> Apparently the splitting point 0.84375 is not very well chosen, since
> we get largest errors just before it and much smaller errors above it
> by using another algorithm.

The only hint about the choice of 0.84375 is from the comment
int s_erf.c.

 *	   is close to one. The interval is chosen because the fix
 *	   point of erf(x) is near 0.6174 (i.e., erf(x)=x when x is
 *	   near 0.6174), and by some experiment, 0.84375 is chosen to
 * 	   guarantee the error is less than one ulp for erf.

I haven't investigated whether 0.84375 is best choice.  However,
exhaustive testing in [0.6, 0.62] gives a max ulp of 0.666.

> > /*
> > - * Coefficients for approximation to  erf  in [0.84375,1.25]
> > + *  Domain [0, 0.84375], range ~[-5.4446e-10,5.5197e-10]:
> > + *  |(erf(x) - x)/x - p(x)/q(x)| < 2**-31.
> > + */
> 
> Extra spaces.

Fixed.

> > ...
> > /*
> > - * Coefficients for approximation to  erfc in [1.25,1/0.35]
> > + *  Domain [1.25,1/0.35], range [-7.043e-10,7.457e-10]:
> > + *  |log(x*erfc(x)) + x**2 + 0.5625 - r(x)/s(x)| < 2**-30
> >  */
> 
> It is even less clear than for the others that the scaling of the error
> is correct.  The comment could make this clearer by giving the ratio of
> the correct error scale with the one used in the approximation, at the
> endpoints (max and min for this, usually at the endpoints).

I'm not sure what you mean here.  In looking at error curves,
it turns out that one of the 2 limits in the range is taken
from an endpoint.

> > +sa4  = -5.77397496e-02F, /* -0x1.d90108p-5 */
> 
> > /*
> > - * Coefficients for approximation to  erfc in [1/.35,28]
> > + *  Domain [1.25,1/0.35], range [-3.753e-12,3.875e-12]:
> > + *  |log(x*erfc(x)) + x**2 + 0.5625 - r(x)/s(x)| < 2**-43
> >  */
> 
> Domain needs more editing.

Fixed.

> > -	if (ix < 0x41e00000) {		/* |x|<28 */
> > +	if (ix < 0x41200000) {		/* |x|<10 */
> 
> 10 is too small.  It gives spurious underflow to 0 and larger than necessary
> errors, at least on i386.  11 works.

I changed to 11.  This required a new set of rb and sb coefficients
as the old set was for [1/0.35, 10].  With the new set the max ulp
for erfc goes up, but it is still much better that 63.something .

./testf -m 2.857143 -M 11 -E double
Interval tested: [2.8571,11.0000]
    Max ULP: 2.92095
  x Max ULP: 4.08856726e+00, 0x1.05ab16p+2

> > +	        R=rb0+s*(rb1+s*(rb2+s*(rb3+s*rb4)));
> > +		S=one+s*(sb1+s*(sb2+s*(sb3+s*sb4)));
> > 	    }
> > 	    GET_FLOAT_WORD(ix,x);
> 
> This and the GET_FLOAT_WORD() for erff() are bogus.  The word has already
> been read into hx.

Well spotted.  Fixed.

> > -	    SET_FLOAT_WORD(z,ix&0xfffff000);
> > -	    r  =  __ieee754_expf(-z*z-(float)0.5625)*
> > -			__ieee754_expf((z-x)*(z+x)+R/S);
> > +	    SET_FLOAT_WORD(z,ix&0xffffe000);
> 
> Change further to use hx.

Fixed.

> > +	    r  = expf(-z*z-0.5625F)*expf((z-x)*(z+x)+R/S);
> 
> Good fix.  Is the point that -z*z-0.5625F must be exact, but the old
> splitting only made z*z exact?  Is -z*z-0.5625F now always exact?
> Anyway, the error for this case is now smaller than near 1.88, and
> changing the expf()'s to exp()s doesn't reduce it much more, so the
> old extra-precision methods can't be much better.

Back to the choice of mask.  I was trying to understand
the origins of the constant 0.5625F.  The asymptotic 
expansion of erfc(x) is (exp(-x**2) / (x*sqrt(pi))) * P(x)
with P(x) being some polynomial.  It turns out 
exp(-0.5625) = 0.56978 and 1/sqrt(pi) = 0.56419.  So, the
algorithm drops the 1/sqrt(pi) and approximates it by
exp(-0.5625).

Well, in reading and re-reading the comment in s_erf.c, this 
part caught my attention:

 *     Note1:
 *	   To compute exp(-x*x-0.5625+R/S), let s be a single
 *	   precision number and s := x; then
 *		-x*x = -s*s + (s-x)*(s+x)
 *	        exp(-x*x-0.5626+R/S) =
 *			exp(-s*s-0.5625)*exp((s-x)*(s+x)+R/S);

This applies to the double precision version, ie erf(), so I 
simply ask the question whether 0xfffff000 was adequate masking
out the low order bits. Fortunately, I only needed to test
0xffff8000, 0xffffc0000, and 0xffffe000.

-- 
Steve