From owner-freebsd-numerics@freebsd.org  Thu Mar  9 15:23:09 2017
Return-Path: <owner-freebsd-numerics@freebsd.org>
Delivered-To: freebsd-numerics@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 89BD1D048BB
 for <freebsd-numerics@mailman.ysv.freebsd.org>;
 Thu,  9 Mar 2017 15:23:09 +0000 (UTC)
 (envelope-from sgk@troutmask.apl.washington.edu)
Received: from troutmask.apl.washington.edu (troutmask.apl.washington.edu
 [128.95.76.21])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client CN "troutmask", Issuer "troutmask" (not verified))
 by mx1.freebsd.org (Postfix) with ESMTPS id 653421532
 for <freebsd-numerics@freebsd.org>; Thu,  9 Mar 2017 15:23:09 +0000 (UTC)
 (envelope-from sgk@troutmask.apl.washington.edu)
Received: from troutmask.apl.washington.edu (localhost [127.0.0.1])
 by troutmask.apl.washington.edu (8.15.2/8.15.2) with ESMTPS id v29FN8AJ033321
 (version=TLSv1.2 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO);
 Thu, 9 Mar 2017 07:23:08 -0800 (PST)
 (envelope-from sgk@troutmask.apl.washington.edu)
Received: (from sgk@localhost)
 by troutmask.apl.washington.edu (8.15.2/8.15.2/Submit) id v29FN7MU033320;
 Thu, 9 Mar 2017 07:23:07 -0800 (PST) (envelope-from sgk)
Date: Thu, 9 Mar 2017 07:23:07 -0800
From: Steve Kargl <sgk@troutmask.apl.washington.edu>
To: Bruce Evans <brde@optusnet.com.au>
Cc: freebsd-numerics@freebsd.org
Subject: Re: Bit twiddling question
Message-ID: <20170309152307.GA32781@troutmask.apl.washington.edu>
Reply-To: sgk@troutmask.apl.washington.edu
References: <20170308202417.GA23103@troutmask.apl.washington.edu>
 <20170309173152.F1269@besplex.bde.org>
 <20170309075236.GA30199@troutmask.apl.washington.edu>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20170309075236.GA30199@troutmask.apl.washington.edu>
User-Agent: Mutt/1.7.2 (2016-11-26)
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
 <freebsd-numerics.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics/>
List-Post: <mailto:freebsd-numerics@freebsd.org>
List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, 
 <mailto:freebsd-numerics-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 09 Mar 2017 15:23:09 -0000

On Wed, Mar 08, 2017 at 11:52:36PM -0800, Steve Kargl wrote:
> To give a hint at what I have been working on, I have implementations
> for sinpif(x), sinpi(x), cospif(x), and cospi(x).  For sinpif(x) and
> cospif(x) I have kernels that give correctly rounded results for 
> FLT_MIN <= x < 0x1p23 (at least on x86_64).   I also have slightly
> faster kernels that give max ULPs of about 0.5008.  For sinpi(x) and
> cospi(x), my kernels currently give about 1.25 ULP for the worse case
> error if I accumulate the polynomials in double precision.  If I 
> accumulate the polynomials in long double precision, then I get 
> correctly rounded results.  To complete the set, I was hoping to 
> work out ld80 and ld128 versions.  `ld128 is going to be problematic
> due to the absense of int128_t.
> 
> I'll send you what I have in a few days.

It is sometime amazing what happens when I sleep.  For sinpi[fl](x)
and the method I came up with one needs to do argument reduction
where x = n + r.  Here n is an integer and 0 < r < 1.  If n is even
then one is in the positive half-cyle of sin(x) and if n is odd then
one has the negative half-cycle.  For sinpif(x) this looks like

        if (ix < 0x4b000000) {          /* 1 <= |x| < 0x1p23 */
/* 1 */         ix = (uint32_t)ax;
/* 2 */         ax = ax == ix ? zero : __compute_sinpif(ax - ix);
/* 3 */         if (ix & 1) ax = -ax;
/* 4 */         return ((hx & 0x80000000) ? -ax : ax);
        }

Line 1 determines n.  Line 2 computes either sinpi(n) exactly or
sinpi(r).  Line 3 uses n odd to set the sign for the half-cycle.
Line 4 is used to set the sign from sin(-x) = -sin(x).

For double and ld80 one can use int64_t instead of uint32_t.  Last
night I realized that I don't need to use int64_t, and more importantly
I don't need int128_t.  The argument reduction can use uint32_t
everywhere.  For double, I currently have

        if (ix < 0x43300000) {          /* 1 <= |x| < 0x1p52 */
                int64_t n;
                n = (int64_t)ax;
                ax = ax == n ? zero : __compute_sinpi(ax - n);
                if (n & 1) ax = -ax;
                return ((hx & 0x80000000) ? -ax : ax);
        }

which can be written (off the top-of-head and not tested)

        if (ix < 0x43300000) {          /* 1 <= |x| < 0x1p52 */
                double volatile vx;
                uint32_t n;
                vx = ax + 0x1p52;
                vx = vx - 0x1p52;
                if (vx == ax) return(0);
                if (vx > UINT32_MAX) {   /* ax = m + n + r with m + n */
                    vx -= UINT32_MAX;    /* m = UINT32_MAX.  n is in range */
                    ax -= UINT32_MAX;    
				}
                n = (uint32_t)vx;
                ax = __compute_sinpi(ax - n);
                if (n & 1) ax = -ax;
                return ((hx & 0x80000000) ? -ax : ax);
        }

Something similar can be applied to ld128, but reduction may take
two rounds (ie., a comparison with UINT64_MAX and then UINT32_MAX).

-- 
Steve
20161221 https://www.youtube.com/watch?v=IbCHE-hONow