Date: Sun, 13 Mar 2016 13:10:04 -0700
From: Steve Kargl <sgk@troutmask.apl.washington.edu>
To: Dimitry Andric <dim@FreeBSD.org>
Cc: freebsd-toolchain@freebsd.org
Subject: Re: clang gets numerical underflow wrong, please fix.

On Sun, Mar 13, 2016 at 09:03:57PM +0100, Dimitry Andric wrote:
> On 13 Mar 2016, at 19:25, Steve Kargl <sgk@troutmask.apl.washington.edu> wrote:
> > 
> > Consider this small piece of code:
> > 
> > #include <fenv.h>
> > #include <stdio.h>
> > 
> > float
> > foo()
> > {
> > 	static const volatile float tiny = 1.e-30f;
> > 	return (tiny * tiny);
> > }
> > 
> > int
> > main(void)
> > {
> >   float x;
> >   feclearexcept(FE_ALL_EXCEPT);
> >   x = foo();
> >   if (fetestexcept(FE_UNDERFLOW)) printf("FE_UNDERFLOW: ");
> >   printf("x = %e\n", x);
> >   return 0;
> > }
> > 
> > clang seems to get the underflow condition wrong.
> > 
> > % cc -o z a.c -lm && ./z
> > FE_UNDERFLOW: x = 0.000000e+00
> > 
> > % cc -O -o z a.c -lm && ./z
> > x = 1.000000e-60             <--- This is not a possible value!
> > 
> > % gcc -o z a.c -lm && ./z
> > FE_UNDERFLOW: x = 0.000000e+00
> > 
> > % gcc -O -o z a.c -lm && ./z
> > FE_UNDERFLOW: x = 0.000000e+00
> 
> Hmm, this is an interesting one.  On amd64, it works as expected with
> clang, but there it always uses SSE, obviously:
> 
> $ ./underflow-amd64
> FE_UNDERFLOW: x = 0.000000e+00
> 
> The problem seems to be caused by the intermediate result being stored
> using fstpl instead of fstps, e.g. simplifying the sample program (to
> get rid of all the SSE stuff the fexxx() macros insert):
> 
> int main(void)
> {
>   float x;
>   __uint16_t status;
>   __fnclex();
>   x = foo();
>   __fnstsw(&status);
>   printf("status: %#x\n", (unsigned)status);
>   printf("x = %e\n", x);
>   return 0;
> }
> 
> With gcc, the assembly becomes:
> 
> foo:
>         flds    tiny.1853
>         flds    tiny.1853
>         fmulp   %st, %st(1)
>         ret
> [...]
> main:
> [...]
>         fnclex
>         call    foo
>         fstps   12(%esp)
>         fnstsw %ax
> 
> In this case, fmulp does not generate an underflow, but the fstps will.
> With clang, the assembly becomes:
> 
> foo:
>         flds    foo.tiny
>         fmuls   foo.tiny
>         retl
> [...]
> main:
>         subl    $24, %esp
>         fnclex
>         calll   foo
>         fstpl   12(%esp)                # 8-byte Folded Spill
>         fnstsw  22(%esp)
> 
> So it's storing the intermediate result in a double, for some reason.
> The fnstsw will then result in zero, since there was no underflow at
> that point.
> 
> I will submit a bug for this upstream, thanks for the report.
> 

Thanks for the quick reply.  However, clang must be using an 80-bit
extended double instead of a double for storage.  This variation

#include <fenv.h>
#include <stdio.h>

int
main(void)
{
   int i;
//   float x = 1.f;
   double x = 1.;
   i = 0;
   feclearexcept(FE_ALL_EXCEPT);
   /* Halve x until some operation raises FE_UNDERFLOW. */
   do {
      x /= 2;
      i++;
   } while(!fetestexcept(FE_UNDERFLOW));
   if (fetestexcept(FE_UNDERFLOW)) printf("FE_UNDERFLOW: ");
   printf("x = %e after %d iterations\n", x, i);

   return 0;
}

yields

% cc -O -o z b.c -lm && ./z
FE_UNDERFLOW: x = 0.000000e+00 after 16435 iterations

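It should be 1075 iterations: the smallest subnormal double is 2^-1074,
so the 1075th halving is the first one whose result is both tiny and
inexact.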

Note that there is a similar issue with FE_OVERFLOW.  The upshot is
that clang on -current is probably miscompiling libm.
-- 
Steve