FreeBSD Mail Archives

Date:      Fri, 09 Apr 2021 21:58:45 +0000
From:      bugzilla-noreply@freebsd.org
To:        bugs@FreeBSD.org
Subject:   [Bug 254911] lib/msun/ctrig_test fails if compiled with AVX (-mavx) or any CPUSET enabling AVX
Message-ID:  <bug-254911-227-9N5qNE5EtM@https.bugs.freebsd.org/bugzilla/>
In-Reply-To: <bug-254911-227@https.bugs.freebsd.org/bugzilla/>
References:  <bug-254911-227@https.bugs.freebsd.org/bugzilla/>

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=254911

--- Comment #3 from Dimitry Andric <dim@FreeBSD.org> ---
Hmm it seems that we have a case here that is similar to what is described
here:

https://stackoverflow.com/questions/63125919/how-to-avoid-floating-point-exceptions-in-unused-simd-lanes

The gist being that clang indeed uses the vdivps (Divide Packed
Single-Precision) instruction by default, so the two calculations,
(beta * rho * s) / denom and t / denom, are emitted as:

        #DEBUG_VALUE: ctanhf:denom <- $xmm2
        .loc    1 77 35 is_stmt 1               # src/lib/msun/src/s_ctanhf.c:77:35
        vmulss  %xmm1, %xmm3, %xmm1
        .loc    1 77 41 is_stmt 0               # src/lib/msun/src/s_ctanhf.c:77:41
        vmulss  %xmm1, %xmm0, %xmm0
        .loc    1 77 46                         # src/lib/msun/src/s_ctanhf.c:77:46
        vinsertps       $16, -80(%rbp), %xmm0, %xmm0 # 16-byte Folded Reload
                                        # xmm0 = xmm0[0],mem[0],xmm0[2,3]
        vmovsldup       %xmm2, %xmm1            # xmm1 = xmm2[0,0,2,2]
        vdivps  %xmm1, %xmm0, %xmm0

Now the problem with vdivps is apparently that the unused 'lanes' of the SIMD
registers can still result in floating point exception bits being set, such as
FE_INVALID (in this case probably because the unused lanes have zero in them,
giving 0/0).
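
To make the lane effect concrete, here is a minimal sketch (not taken from
lib/msun) that performs the same pair of divisions once with a packed divide
and once with scalar divides, and then checks the invalid-operation flag in
MXCSR. It assumes an x86 target with SSE and uses the <xmmintrin.h>
intrinsics; the helper names are made up for illustration. Built without
-mavx the intrinsics become divps/divss, with -mavx the VEX forms
vdivps/vdivss, and the exception behavior should be the same either way.

```c
#include <xmmintrin.h>

/* Clear the sticky SSE exception flags (low six bits of MXCSR). */
static void
clear_sse_exceptions(void)
{
	_mm_setcsr(_mm_getcsr() & ~(unsigned int)_MM_EXCEPT_MASK);
}

/* Return 1 if the invalid-operation flag is set in MXCSR, else 0. */
static int
invalid_flag_set(void)
{
	return ((_mm_getcsr() & _MM_EXCEPT_INVALID) != 0);
}

/*
 * Packed division: lanes 0 and 1 hold the two wanted quotients, while
 * lanes 2 and 3 are "unused" and hold zero, so they compute 0/0.
 * The volatile arrays keep the compiler from folding the division or
 * moving it across the MXCSR accesses.
 */
static int
packed_divide_raises_invalid(float num0, float num1, float denom,
    float quot[2])
{
	volatile float n[4] = { num0, num1, 0.0f, 0.0f };
	volatile float d[4] = { denom, denom, 0.0f, 0.0f };
	volatile float q[4];
	float tmp[4];
	int i;

	clear_sse_exceptions();
	_mm_storeu_ps(tmp, _mm_div_ps(
	    _mm_set_ps(n[3], n[2], n[1], n[0]),
	    _mm_set_ps(d[3], d[2], d[1], d[0])));
	for (i = 0; i < 4; i++)
		q[i] = tmp[i];
	quot[0] = q[0];
	quot[1] = q[1];
	return (invalid_flag_set());
}

/*
 * Scalar division: each divide only operates on lane 0, so the other
 * lanes cannot contribute any exception flags.
 */
static int
scalar_divide_raises_invalid(float num0, float num1, float denom,
    float quot[2])
{
	volatile float n[2] = { num0, num1 };
	volatile float d = denom;
	volatile float q[2];

	clear_sse_exceptions();
	q[0] = _mm_cvtss_f32(_mm_div_ss(_mm_set_ss(n[0]), _mm_set_ss(d)));
	q[1] = _mm_cvtss_f32(_mm_div_ss(_mm_set_ss(n[1]), _mm_set_ss(d)));
	quot[0] = q[0];
	quot[1] = q[1];
	return (invalid_flag_set());
}
```

Calling both helpers with the same finite inputs, for example 6.0f, 1.0f and
a denominator of 2.0f, should give identical quotients from both, with only
the packed variant reporting the invalid flag.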

That stackoverflow article suggests using clang's
-ffp-exception-behavior=maytrap option (documented at
<https://releases.llvm.org/11.0.1/tools/clang/docs/UsersManual.html#cmdoption-ffp-exception-behavior>),
meaning "The compiler avoids transformations that may raise exceptions that
would not have been raised by the original code". It is supported from clang 10
onwards.

In practice, this indeed avoids using vdivps, and uses vdivss (Divide Scalar
Single-Precision) instead, and the assembly for line 77 then looks like:

        #DEBUG_VALUE: ctanhf:denom <- $xmm1
        .loc    1 77 35 is_stmt 1               # src/lib/msun/src/s_ctanhf.c:77:35
        vmulss  %xmm2, %xmm4, %xmm2
        .loc    1 77 41 is_stmt 0               # src/lib/msun/src/s_ctanhf.c:77:41
        vmulss  %xmm0, %xmm2, %xmm0
        .loc    1 77 46                         # src/lib/msun/src/s_ctanhf.c:77:46
        vdivss  %xmm1, %xmm0, %xmm2
        vmovss  -80(%rbp), %xmm0                # 4-byte Reload
                                        # xmm0 = mem[0],zero,zero,zero
        #DEBUG_VALUE: ctanhf:t <- $xmm0
        .loc    1 77 57                         # src/lib/msun/src/s_ctanhf.c:77:57
        vdivss  %xmm1, %xmm0, %xmm0

And indeed, in this case the FE_INVALID is gone, and the tests succeed.

I guess it may be good to use this -ffp-exception-behavior=maytrap flag for the
whole of lib/msun, as many of these functions rely on this behavior. It does
not seem to be required for gcc.

-- 
You are receiving this mail because:
You are the assignee for the bug.

Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?bug-254911-227-9N5qNE5EtM>
