From owner-freebsd-numerics@freebsd.org Sat May 13 02:02:32 2017 Return-Path: Delivered-To: freebsd-numerics@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id C3E49D6AD76 for ; Sat, 13 May 2017 02:02:32 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mailman.ysv.freebsd.org (unknown [127.0.1.3]) by mx1.freebsd.org (Postfix) with ESMTP id B1C091F27 for ; Sat, 13 May 2017 02:02:32 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: by mailman.ysv.freebsd.org (Postfix) id B1231D6AD75; Sat, 13 May 2017 02:02:32 +0000 (UTC) Delivered-To: numerics@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id AE746D6AD74; Sat, 13 May 2017 02:02:32 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail109.syd.optusnet.com.au (mail109.syd.optusnet.com.au [211.29.132.80]) by mx1.freebsd.org (Postfix) with ESMTP id 5D3A11F23; Sat, 13 May 2017 02:02:31 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from besplex.bde.org (c122-106-153-191.carlnfd1.nsw.optusnet.com.au [122.106.153.191]) by mail109.syd.optusnet.com.au (Postfix) with ESMTPS id 80712D654C9; Sat, 13 May 2017 11:35:54 +1000 (AEST) Date: Sat, 13 May 2017 11:35:49 +1000 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Steve Kargl cc: numerics@freebsd.org, freebsd-hackers@freebsd.org Subject: Re: catrig[fl].c and inexact In-Reply-To: <20170512215654.GA82545@troutmask.apl.washington.edu> Message-ID: <20170513103208.M845@besplex.bde.org> References: <20170512215654.GA82545@troutmask.apl.washington.edu> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Optus-CM-Score: 0 X-Optus-CM-Analysis: v=2.2 cv=WvBbCZXv c=1 sm=1 tr=0 a=Tj3pCpwHnMupdyZSltBt7Q==:117 a=Tj3pCpwHnMupdyZSltBt7Q==:17 a=kj9zAlcOel0A:10 a=0K0djoc-qRL17_fbz0IA:9 a=CjuIK1q_8ugA:10 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 13 May 2017 02:02:32 -0000 On Fri, 12 May 2017, Steve Kargl wrote: > So, I've been making improvements to my implementations of > the half-cycle trig functions. In doing so, I decide to > add WARNS=2 to msun/Makefile. clang 4.0.0 dies with an > error about an unused variable in raise_inexact() from > catrig[fl].c. > > /usr/home/kargl/trunk/math/libm/msun/src/catrigl.c:195:2: error: unused variable > 'junk' [-Werror,-Wunused-variable] > raise_inexact(); > ^ > /usr/home/kargl/trunk/math/libm/msun/src/catrigl.c:56:45: note: expanded from > macro 'raise_inexact' > #define raise_inexact() do { volatile float junk = 1 + tiny; } while(0) > ^ > Grepping catrig.o for the variable 'junk' suggests that 'junk' is > optimized out (with at least -O2). Just another bug in clang. Volatile variables cannot be optimized out (if they are accessed). > A quick and dirty patch to achieve the intent of the original > code follows. It would be nice if some would like to commit > the patch. Of course, you may want to wait for Bruce to > review the diff. > > Index: src/catrig.c > =================================================================== > --- src/catrig.c (revision 1935) > +++ src/catrig.c (working copy) > @@ -37,7 +37,7 @@ __FBSDID("$FreeBSD: head/lib/msun/src/catrig.c 313863 > #define isinf(x) (fabs(x) == INFINITY) > #undef isnan > #define isnan(x) ((x) != (x)) > -#define raise_inexact() do { volatile float junk = 1 + tiny; } while(0) > +#define raise_inexact(x) do { (x) = 1 + tiny; } while(0) > #undef signbit > #define signbit(x) (__builtin_signbit(x)) > > @@ -315,7 +315,7 @@ casinh(double complex z) > return (z); > > /* All remaining cases are inexact. */ > - raise_inexact(); > + raise_inexact(new_y); > > if (ax < SQRT_6_EPSILON / 4 && ay < SQRT_6_EPSILON / 4) > return (z); Now it doesn't take compiler bugs to optimize it out, since new_y is not volatile, and a good compiler would optimize it out in all cases. new_y is obviously unused before the early returns, so it doesn't need to be evalated before the returns as far as the compiler can see. Later, new_y is initialized indirectly, and the compiler can see that too (not so easily, so it can see that raise_inexact() has no effect except possibly for its side effect of raising inexact for 1 + tiny. The change might defeat the intent of the original code in another way. 'junk' is intentionally independent of other variables, so that there are no dependencies on it. If the compiler doesn't optimize away the assignment to new_y, then it is probably because it doesn't see that the assignment is dead, so there is a dependency. Actually, we want the variable 'junk' to be optimized away. We only want the side effect of evaluating 1 + tiny. Compilers have bugs evaluating expressions like 1 + tiny, tiny*tiny and huge*huge, and we use assignments of the result to volatile variables in tens if not hundreds of places to try to work around compiler bugs. If that doesn't work here, then all the other places are probably broken too. The other places mostly use a static volatile, while this uses an auto volatile. 'tiny' is also volatile, as required for the standard magic. I planned to fix all this magic using macros like raise_inexact(). Another subtety in the macro is that variable is float instead of double to possibly allow optimizations. Since the variable shouldn't be optimized away, it will waste sizeof(var) for each use of the macro. A file scope variable would work better here, but the macro is written to be self-contained to make it easier to use. The change also defeats that. Whether not evaluating 1 + tiny at compile time is a compiler bug is delicate. We don't have any C99 compilers yet, since gcc and clang don't support #pragma FENV_ACCESS ON/OFF. The pragma should be set to ON before the magic accesses, but we don't do that because it would be a lot of churn and we know that the pragma doesn't work. We more or less depend on the default state of the pragma being ON, but in gcc-4.2.1 it is documented as being OFF unless compiled with -frounding-math when it is documented as being ON, and for clang it is undocumented. -frounding-math is too inefficient to use by default, and another bug in clang is that it is not even supported. With macro or at least inline wrappers, the #pragma should only be needed in a few places. clang-3.9.0 seems to be only partly broken here. Volatile works correctly for v = huge*huge and also for v = 1+tiny provided v is static instead of auto. It also works to declare 'junk' as __unused. The following don't work with either clang-3.9.0 or gcc-4.2.1: - declaring 'junk' as __used (syntax error) - the expression 1+tiny not assigned to anything, or 1+tiny assigned to an __unused non-volatile variable. This gives the weird code of loading 'tiny' (because the compiler handles read accesses to volatile variables correctly), but not adding 1 (because the compiler doesn't know that adding 1 has a side effect, or is optimizing for FENV_ACCESS OFF). The following is documented to not work with gcc-4.2.1: - #pragma FENV_ACCESS ON. clang handles this correctly by warning that this is unsupported, but this makes it even more unusable. gcc-4.2.1 doesn't warn, so it is hard to tell if it worked. Bruce