Date:      Sun, 14 May 2017 02:19:24 +1000 (EST)
From:      Bruce Evans <brde@optusnet.com.au>
To:        Dimitry Andric <dimitry@andric.com>
Cc:        sgk@troutmask.apl.washington.edu, freebsd-hackers@freebsd.org,  numerics@freebsd.org
Subject:   Re: catrig[fl].c and inexact
Message-ID:  <20170514020559.F1038@besplex.bde.org>
In-Reply-To: <F5F8736B-D7E1-48AD-BC6C-8C74AF0A3272@andric.com>
References:  <20170512215654.GA82545@troutmask.apl.washington.edu> <20170513103208.M845@besplex.bde.org> <20170513060803.GA84399@troutmask.apl.washington.edu> <F5F8736B-D7E1-48AD-BC6C-8C74AF0A3272@andric.com>

On Sat, 13 May 2017, Dimitry Andric wrote:

> On 13 May 2017, at 08:08, Steve Kargl <sgk@troutmask.apl.washington.edu> wrote:
>>
>> On Sat, May 13, 2017 at 11:35:49AM +1000, Bruce Evans wrote:
>>> On Fri, 12 May 2017, Steve Kargl wrote:
> ...
>>> required for the standard magic.  I planned to fix all this magic using
>>> macros like raise_inexact().
>>
>> If you plan to fix the magic with raise_inexact, then please
>> test with a suite of compilers.  AFAICT, clang is optimizing
>> out the code.  I haven't written a testcase to demonstrate this
>> as I have other irons in the fire.
>
> Using the full catrig.c and -O3, I tried gcc 4.2.1, 4.7.4, 4.8.5, 4.9.4,
> 5.4.0, 6.3.0 and 7.0.1, in addition to clang 3.4.1, 3.8.0, 3.9.1, 4.0.0
> and 5.0.0.  All versions of gcc produced something similar to the
> following for i386:

Yes, all compilers I tried (only gcc-3.3.3, gcc-4.2.1 and clang-3.9.0)
generate the intended code, but clang-3.9.0 also generates a -Wunused
warning about the variable that it has just used to generate the intended
code!

> # /usr/src/lib/msun/src/catrig.c:318:   raise_inexact();
>        flds    tiny    # tiny
>        fadds   .LC2    #
>        fstps   120(%esp)       # junk

I don't know how to ask for the best code, which is more like

 	flds	tiny
 	fadds	one
 	ffree	%st(0)		# or fstp %st(0) -- MD optimization

but the best code runs insignificantly faster in practice.

> and for amd64:
> [...]
> .L34:
> .LBB33:
> # /usr/src/lib/msun/src/catrig.c:318:   raise_inexact();
>        movss   tiny(%rip), %xmm0       # tiny, tiny.0_28
>        addss   .LC13(%rip), %xmm0      #, _29
>        movss   %xmm0, 188(%rsp)        # _29, junk

Discarding the result is easier for amd64 (just omit the store).  The
volatile hack forces the store.

> E.g., these all look good, at least with regards to not optimizing out
> the desired addition.
>
> The only compiler I could find that does optimize everything away (at
> least in the simplified test case), is the Intel compiler:
>
> https://godbolt.org/g/g1UT2m

Urk.

Bruce
