From owner-freebsd-numerics@freebsd.org Sun Feb 3 19:34:50 2019 Return-Path: Delivered-To: freebsd-numerics@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 3B81C14B01CB for ; Sun, 3 Feb 2019 19:34:50 +0000 (UTC) (envelope-from sgk@troutmask.apl.washington.edu) Received: from troutmask.apl.washington.edu (troutmask.apl.washington.edu [128.95.76.21]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) server-signature RSA-PSS (4096 bits) client-signature RSA-PSS (2048 bits) client-digest SHA256) (Client CN "troutmask", Issuer "troutmask" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id E86448338C for ; Sun, 3 Feb 2019 19:34:48 +0000 (UTC) (envelope-from sgk@troutmask.apl.washington.edu) Received: from troutmask.apl.washington.edu (localhost [127.0.0.1]) by troutmask.apl.washington.edu (8.15.2/8.15.2) with ESMTPS id x13JYkKA043330 (version=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384 bits=256 verify=NO) for ; Sun, 3 Feb 2019 11:34:46 -0800 (PST) (envelope-from sgk@troutmask.apl.washington.edu) Received: (from sgk@localhost) by troutmask.apl.washington.edu (8.15.2/8.15.2/Submit) id x13JYkAs043329 for freebsd-numerics@freebsd.org; Sun, 3 Feb 2019 11:34:46 -0800 (PST) (envelope-from sgk) Date: Sun, 3 Feb 2019 11:34:46 -0800 From: Steve Kargl To: freebsd-numerics@freebsd.org Subject: optimization for Bessel function routines Message-ID: <20190203193446.GA43322@troutmask.apl.washington.edu> Reply-To: sgk@troutmask.apl.washington.edu MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.11.2 (2019-01-07) X-Rspamd-Queue-Id: E86448338C X-Spamd-Bar: +++ Authentication-Results: mx1.freebsd.org X-Spamd-Result: default: False [3.53 / 15.00]; ARC_NA(0.00)[]; HAS_REPLYTO(0.00)[sgk@troutmask.apl.washington.edu]; FROM_HAS_DN(0.00)[]; TO_MATCH_ENVRCPT_ALL(0.00)[]; NEURAL_SPAM_MEDIUM(0.88)[0.885,0]; MIME_GOOD(-0.10)[text/plain]; PREVIOUSLY_DELIVERED(0.00)[freebsd-numerics@freebsd.org]; TO_DN_NONE(0.00)[]; AUTH_NA(1.00)[]; RCPT_COUNT_ONE(0.00)[1]; RCVD_COUNT_THREE(0.00)[3]; RCVD_TLS_LAST(0.00)[]; NEURAL_SPAM_SHORT(0.93)[0.933,0]; RCVD_IN_DNSWL_MED(-0.20)[21.76.95.128.list.dnswl.org : 127.0.11.2]; MX_GOOD(-0.01)[cached: troutmask.apl.washington.edu]; DMARC_NA(0.00)[washington.edu]; NEURAL_SPAM_LONG(0.87)[0.875,0]; R_SPF_NA(0.00)[]; REPLYTO_ADDR_EQ_FROM(0.00)[]; FROM_EQ_ENVFROM(0.00)[]; R_DKIM_NA(0.00)[]; MIME_TRACE(0.00)[0:+]; ASN(0.00)[asn:73, ipnet:128.95.0.0/16, country:US]; MID_RHS_MATCH_FROM(0.00)[]; IP_SCORE(0.14)[ip: (0.33), ipnet: 128.95.0.0/16(0.37), asn: 73(0.09), country: US(-0.07)] X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 03 Feb 2019 19:34:50 -0000 * src/e_j0.c: . Replace calls to sin(x) and cos(x) with single call to sincos(). * src/e_j0f.c: . Replace calls to sinf(x) and cosf(x) with single call to sincosf(). * src/e_j1.c: . Replace calls to sin(x) and cos(x) with single call to sincos(). * src/e_j1f.c: . Replace calls to sinf(x) and cosf(x) with single call to sincosf(). * src/e_jn.c: . Replace calls to sin(x) and cos(x) with single call to sincos(). Index: src/e_j0.c =================================================================== --- src/e_j0.c (revision 343477) +++ src/e_j0.c (working copy) @@ -93,8 +93,7 @@ __ieee754_j0(double x) if(ix>=0x7ff00000) return one/(x*x); x = fabs(x); if(ix >= 0x40000000) { /* |x| >= 2.0 */ - s = sin(x); - c = cos(x); + sincos(x, &s, &c); ss = s-c; cc = s+c; if(ix<0x7fe00000) { /* Make sure x+x does not overflow. */ @@ -173,8 +172,7 @@ __ieee754_y0(double x) * sin(x) +- cos(x) = -cos(2x)/(sin(x) -+ cos(x)) * to compute the worse one. */ - s = sin(x); - c = cos(x); + sincos(x, &s, &c); ss = s-c; cc = s+c; /* Index: src/e_j0f.c =================================================================== --- src/e_j0f.c (revision 343477) +++ src/e_j0f.c (working copy) @@ -55,8 +55,7 @@ __ieee754_j0f(float x) if(ix>=0x7f800000) return one/(x*x); x = fabsf(x); if(ix >= 0x40000000) { /* |x| >= 2.0 */ - s = sinf(x); - c = cosf(x); + sincosf(x, &s, &c); ss = s-c; cc = s+c; if(ix<0x7f000000) { /* Make sure x+x does not overflow. */ @@ -128,8 +127,7 @@ __ieee754_y0f(float x) * sin(x) +- cos(x) = -cos(2x)/(sin(x) -+ cos(x)) * to compute the worse one. */ - s = sinf(x); - c = cosf(x); + sincosf(x, &s, &c); ss = s-c; cc = s+c; /* Index: src/e_j1.c =================================================================== --- src/e_j1.c (revision 343477) +++ src/e_j1.c (working copy) @@ -94,8 +94,7 @@ __ieee754_j1(double x) if(ix>=0x7ff00000) return one/x; y = fabs(x); if(ix >= 0x40000000) { /* |x| >= 2.0 */ - s = sin(y); - c = cos(y); + sincos(y, &s, &c); ss = -s-c; cc = s-c; if(ix<0x7fe00000) { /* make sure y+y not overflow */ @@ -159,8 +158,7 @@ __ieee754_y1(double x) /* y1(x<0) = NaN and raise invalid exception. */ if(hx<0) return vzero/vzero; if(ix >= 0x40000000) { /* |x| >= 2.0 */ - s = sin(x); - c = cos(x); + sincos(x, &s, &c); ss = -s-c; cc = s-c; if(ix<0x7fe00000) { /* make sure x+x not overflow */ Index: src/e_j1f.c =================================================================== --- src/e_j1f.c (revision 343477) +++ src/e_j1f.c (working copy) @@ -56,8 +56,7 @@ __ieee754_j1f(float x) if(ix>=0x7f800000) return one/x; y = fabsf(x); if(ix >= 0x40000000) { /* |x| >= 2.0 */ - s = sinf(y); - c = cosf(y); + sincosf(y, &s, &c); ss = -s-c; cc = s-c; if(ix<0x7f000000) { /* make sure y+y not overflow */ @@ -114,8 +113,7 @@ __ieee754_y1f(float x) if(ix==0) return -one/vzero; if(hx<0) return vzero/vzero; if(ix >= 0x40000000) { /* |x| >= 2.0 */ - s = sinf(x); - c = cosf(x); + sincosf(x, &s, &c); ss = -s-c; cc = s-c; if(ix<0x7f000000) { /* make sure x+x not overflow */ Index: src/e_jn.c =================================================================== --- src/e_jn.c (revision 343477) +++ src/e_jn.c (working copy) @@ -54,7 +54,7 @@ double __ieee754_jn(int n, double x) { int32_t i,hx,ix,lx, sgn; - double a, b, temp, di; + double a, b, c, s, temp, di; double z, w; /* J(-n,x) = (-1)^n * J(n, x), J(n, -x) = (-1)^n * J(n, x) @@ -91,11 +91,12 @@ __ieee754_jn(int n, double x) * 2 -s+c -c-s * 3 s+c c-s */ + sincos(x, &s, &c); switch(n&3) { - case 0: temp = cos(x)+sin(x); break; - case 1: temp = -cos(x)+sin(x); break; - case 2: temp = -cos(x)-sin(x); break; - case 3: temp = cos(x)-sin(x); break; + case 0: temp = c+s; break; + case 1: temp = -c+s; break; + case 2: temp = -c-s; break; + case 3: temp = c-s; break; } b = invsqrtpi*temp/sqrt(x); } else { @@ -216,7 +217,7 @@ __ieee754_yn(int n, double x) { int32_t i,hx,ix,lx; int32_t sign; - double a, b, temp; + double a, b, c, s, temp; EXTRACT_WORDS(hx,lx,x); ix = 0x7fffffff&hx; @@ -248,11 +249,12 @@ __ieee754_yn(int n, double x) * 2 -s+c -c-s * 3 s+c c-s */ + sincos(x, &s, &c); switch(n&3) { - case 0: temp = sin(x)-cos(x); break; - case 1: temp = -sin(x)-cos(x); break; - case 2: temp = -sin(x)+cos(x); break; - case 3: temp = sin(x)+cos(x); break; + case 0: temp = s-c; break; + case 1: temp = -s-c; break; + case 2: temp = -s+c; break; + case 3: temp = s+c; break; } b = invsqrtpi*temp/sqrt(x); } else { -- Steve