From owner-freebsd-numerics@freebsd.org Sun Feb 19 08:59:59 2017 Return-Path: Delivered-To: freebsd-numerics@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 95455CE521E for ; Sun, 19 Feb 2017 08:59:59 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail104.syd.optusnet.com.au (mail104.syd.optusnet.com.au [211.29.132.246]) by mx1.freebsd.org (Postfix) with ESMTP id 60981106E for ; Sun, 19 Feb 2017 08:59:58 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from besplex.bde.org (c122-106-153-191.carlnfd1.nsw.optusnet.com.au [122.106.153.191]) by mail104.syd.optusnet.com.au (Postfix) with ESMTPS id 14D2142A65C; Sun, 19 Feb 2017 19:59:48 +1100 (AEDT) Date: Sun, 19 Feb 2017 19:59:48 +1100 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Steve Kargl cc: freebsd-numerics@freebsd.org Subject: Re: [PATCH] avoid function call overhead in tgammaf In-Reply-To: <20170219024101.GA79177@troutmask.apl.washington.edu> Message-ID: <20170219190343.Y1567@besplex.bde.org> References: <20170219024101.GA79177@troutmask.apl.washington.edu> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Optus-CM-Score: 0 X-Optus-CM-Analysis: v=2.2 cv=Ca543Pjl c=1 sm=1 tr=0 a=Tj3pCpwHnMupdyZSltBt7Q==:117 a=Tj3pCpwHnMupdyZSltBt7Q==:17 a=kj9zAlcOel0A:10 a=RRXJm6L_Oh1pcGlzDb0A:9 a=CjuIK1q_8ugA:10 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 19 Feb 2017 08:59:59 -0000 On Sat, 18 Feb 2017, Steve Kargl wrote: > The following patch treats special values (i.e., +-inf, nan) > and values outside a limited domain to avoid the function > call overhead of tgamma. Anyone with a commit bit is more > than welcomed to commit the patch (after of course a review). Since it is claimed to be "essentially useless", optimizing it doesn't seem useful. > Index: src/s_tgammaf.c > =================================================================== > --- src/s_tgammaf.c (revision 1857) > +++ src/s_tgammaf.c (working copy) > ... > +static u_int32_t overflow = 0x420c290f; /* 35.0400981 */ > +static u_int32_t underflow = 0x421a67d8; /* 38.6014118 */ This is a confusing name. tgamma doesn't underflow like exp. It overflows near negative integers, but below a threshold overflow can't happen because it is impossible to get near enough to an integer. It also underflows for large negative args not near an integer. The threshold here must be the minimum of the 2 thresholds. Apparently, the one for underflow is always smaller (more negative) for float precision. It is easy to check all cases for float precision to find the minimum. Double precision uses a fuzzier threshold and spells it -190. It uses the less fuzzy threshold -177.69 in a comment, except it misspells this as [plus] 177.69. > +static volatile float huge = 1.e30, tiny = 1.e-30; > + > float > tgammaf(float x) > { > + u_int32_t hx; > + int32_t ix, sg; > + > + GET_FLOAT_WORD(hx, x); > + ix = hx & 0x7fffffff; > + sg = hx & 0x80000000; > + > + if (ix > overflow) { > + if (ix >= 0x7f800000) > + return (sg ? x / x : x + x); > + if (!sg) > + return (huge * huge); > + } > + > + if (ix == 0) > + return (1 / x); > + > + if (sg && ix > underflow) { > + /* > + * tgammaf(x) for integral x returns an NaN, so we implement > + * a poor man's rintf(). > + */ > + volatile float vz; > + float y,z; > + > + y = -x; > + > + vz = y + 0x1p23F; /* depend on 0 <= y < 0x1p23 */ > + z = vz - 0x1p23F; /* rintf(y) for the above range */ > + if (z == y) > + return ((x - x) / (x - x)); This has delicate optimizations which belong more in tgamma(). tgamma() just uses (x == ceil(x)) to classify negative integers. tgamma() also has fewer/different style bugs. > + > + return (tiny * tiny); The first error is immediately below the underflow threshold: X x = 0xc21a67d9 -38.6014137 X tgamma(x) = 0xb68ffffca9859b0c 0x80000000 -7.00648117e-46 X tgammaf(x) = 0 0 0 X err = -0x800000000ffffe54 17179869184.50000 This is just a sign error. tgamma(x) is the small negative value ~ 7e-46. This underflows to -0.0F, but tiny*tiny gives +0.0F. tgamma() does similar things plus extras to get the sign right. The sign is -1 if ceil(x) is even for x below the underflow threshold. It is not so easy to determine the sign from the raw bits. > + } > > return (tgamma(x)); > } Bruce