Date: Wed, 8 Mar 2017 23:52:36 -0800
From: Steve Kargl
To: Bruce Evans
Cc: freebsd-numerics@freebsd.org
Subject: Re: Bit twiddling question
Message-ID: <20170309075236.GA30199@troutmask.apl.washington.edu>
Reply-To: sgk@troutmask.apl.washington.edu
References: <20170308202417.GA23103@troutmask.apl.washington.edu> <20170309173152.F1269@besplex.bde.org>
In-Reply-To: <20170309173152.F1269@besplex.bde.org>
List-Id: "Discussions of high quality implementation of libm functions."

On Thu, Mar 09, 2017 at 05:58:52PM +1100, Bruce Evans wrote:
> On Wed, 8 Mar 2017, Steve Kargl wrote:
>
> > Suppose I have a float 'x' that I know is in the
> > range 1 <= x <= 0x1p23 and I know that 'x' is
> > integral, e.g., x = 12.000.  If I use GET_FLOAT_WORD
> > from math_private.h, then x=12.000 maps to ix=0x41400000.
> > Is there a bit twiddling method that I can apply to ix to
> > unambiguously determine if x is even or odd?
>
> I don't know of any good method.

Spent a day or so searching; hence, my question.

> > Yes, I know I can do
> >
> >   float x;
> >   int32_t ix;
> >   ix = (int32_t)x;
> >
> > and then test (ix & 1).  But, this does not generalize to
>
> This isn't bit twiddling, and is also slow.

Yes, I know it isn't bit twiddling, but it achieves what I need.
I was hoping that if a bit twiddling algorithm was available
(and was faster), then I could change my algorithm.

> > the case of long double on a ld128 architecture.  That is,
> > if I have 1 <= x < 0x1p112, then I would need to have
> >
> >   long double x;
> >   int128_t ix;
> >   ix = (int128_t)x;
> >
> > and AFAICT sparc64 doesn't have an int128_t.
>
> If sparc64 has this, it would be even slower.  Sparc64 emulates
> all 128-bit FP.  This makes 128-bit sparc64 ~10-100 times slower
> than 64-bit sparc64 FP, and 100-1000 times slower than modern
> x86 64 and 8-bit FP.  FP to integer conversions tend to be slower
> than pure FP, and are especially tricky for integers with more
> bits than FP mantissas.
>
> Consider bit twiddling to classify oddness of 2.0F (0x40000000 in
> bits) and 3.0F (0x40400000 in bits).  The following method seems
> to be not so bad.  Calculate that the unbiased exponent for 3.0F
> is 1.  This means that the 1's bit is (1 << (23 - 1)) = 0x00400000
> where we see it for 3.0F but not for 2.0F.  All bits to the right
> of this must be 0 for the value to be an integer.
> I don't know
> how you classified integers efficiently without already determining
> the position of the "point" and looking at these bits.  There are
> complications for powers of 2 and the normalization bit being implicit.

Thanks.  I'll look into the above to see if I can come up with
something that does not require the cast.

To give a hint at what I have been working on, I have implementations
for sinpif(x), sinpi(x), cospif(x), and cospi(x).  For sinpif(x) and
cospif(x) I have kernels that give correctly rounded results for
FLT_MIN <= x < 0x1p23 (at least on x86_64).  I also have slightly
faster kernels that give max ULPs of about 0.5008.  For sinpi(x) and
cospi(x), my kernels currently give about 1.25 ULP for the worst-case
error if I accumulate the polynomials in double precision.  If I
accumulate the polynomials in long double precision, then I get
correctly rounded results.  To complete the set, I was hoping to
work out ld80 and ld128 versions.  ld128 is going to be problematic
due to the absence of int128_t.

I'll send you what I have in a few days.

-- 
Steve
20161221 https://www.youtube.com/watch?v=IbCHE-hONow
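
A minimal sketch of the exponent-based test described in the quoted text, for the float case only. The name float_int_is_odd is hypothetical and is not part of math_private.h; memcpy stands in for GET_FLOAT_WORD so the block compiles on its own. It assumes IEEE 754 binary32 and the stated precondition (x integral, 1 <= x <= 0x1p23), and it ORs in the implicit normalization bit so that x = 1 and powers of 2 need no special case.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: return 1 if x is odd, 0 if x is even.
 * Precondition: x is integral and 1 <= x <= 0x1p23.
 */
static int
float_int_is_odd(float x)
{
    uint32_t ix, m;
    int e;

    /* Same bit pattern that GET_FLOAT_WORD(ix, x) would produce. */
    memcpy(&ix, &x, sizeof(ix));

    /* Unbiased exponent; 0 <= e <= 23 under the precondition. */
    e = (int)((ix >> 23) & 0xff) - 127;
    assert(e >= 0 && e <= 23);

    /* 24-bit significand with the implicit leading bit made explicit. */
    m = (ix & 0x007fffffu) | 0x00800000u;

    /* Every bit to the right of the 1's bit must be 0 for an integer. */
    assert((m & ((1u << (23 - e)) - 1)) == 0);

    /* The 1's bit sits at position (23 - e) of the significand. */
    return (int)((m >> (23 - e)) & 1);
}
```

With this sketch, float_int_is_odd(3.0F) returns 1 and float_int_is_odd(12.0F) returns 0, matching the 0x40400000 and 0x41400000 bit patterns discussed above. The same idea should carry over to wider formats by first locating the word that holds the 1's bit, but that is not worked out here.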