From owner-freebsd-numerics@freebsd.org  Sun Feb 19 02:41:01 2017
Date: Sat, 18 Feb 2017 18:41:01 -0800
From: Steve Kargl <sgk@troutmask.apl.washington.edu>
To: freebsd-numerics@freebsd.org
Subject: [PATCH] avoid function call overhead in tgammaf
Message-ID: <20170219024101.GA79177@troutmask.apl.washington.edu>
Reply-To: sgk@troutmask.apl.washington.edu
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.7.2 (2016-11-26)

The following patch handles special values (+-0, +-inf, NaN) and
arguments outside a limited domain directly in tgammaf, which
avoids the function call overhead of tgamma for those inputs.
Anyone with a commit bit is more than welcome to commit the
patch (after a review, of course).

Index: src/s_tgammaf.c
===================================================================
--- src/s_tgammaf.c	(revision 1857)
+++ src/s_tgammaf.c	(working copy)
@@ -27,17 +27,59 @@
 #include <sys/cdefs.h>
 __FBSDID("$FreeBSD: head/lib/msun/src/s_tgammaf.c 176388 2008-02-18 17:27:11Z das $");
 
-#include <math.h>
+#include "math.h"
+#include "math_private.h"
 
 /*
- * We simply call tgamma() rather than bloating the math library with
- * a float-optimized version of it. The reason is that tgammaf() is
- * essentially useless, since the function is superexponential and
- * floats have very limited range.
+ * The gamma function is superexponential, which means that tgammaf() has
+ * a very limited useful domain.  Rather than bloating the math library
+ * with a float-optimized implementation, we call tgamma() within the limited
+ * domain of [-underflow,overflow].  However, to avoid function call overhead,
+ * tgammaf() directly treats special values and values outside the limited
+ * domain.
  */
+
+static u_int32_t overflow = 0x420c290f;		/*  35.0400981 */
+static u_int32_t underflow = 0x421a67d8;	/*  38.6014118 */
+static volatile float huge = 1.e30, tiny = 1.e-30;
+
 float
 tgammaf(float x)
 {
+	u_int32_t hx;
+	int32_t ix, sg;
+
+	GET_FLOAT_WORD(hx, x);
+	ix = hx & 0x7fffffff;
+	sg = hx & 0x80000000;
+
+	if (ix > overflow) {
+		if (ix >= 0x7f800000)
+			return (sg ? x / x : x + x);
+		if (!sg)
+			return (huge * huge);
+	}
+
+	if (ix == 0)
+		return (1 / x);
+
+	if (sg && ix > underflow) {
+		/*
+		 * tgammaf(x) for negative integral x returns a NaN, so we
+		 * implement a poor man's rintf().
+		 */
+		volatile float vz;
+		float y, z;
+
+		y = -x;
+
+		vz = y + 0x1p23F;	/* exact rintf(y) only if y < 0x1p23 */
+		z = vz - 0x1p23F;	/* y >= 0x1p23 is always integral */
+		if (ix >= 0x4b000000 || z == y)
+			return ((x - x) / (x - x));
+
+		return (tiny * tiny);
+	}
 
 	return (tgamma(x));
 }


-- 
Steve
20161221 https://www.youtube.com/watch?v=IbCHE-hONow