Date: Sun, 13 Mar 2016 13:10:04 -0700
From: Steve Kargl
To: Dimitry Andric
Cc: freebsd-toolchain@freebsd.org
Subject: Re: clang gets numerical underflow wrong, please fix.

On Sun, Mar 13, 2016 at 09:03:57PM +0100, Dimitry Andric wrote:
> On 13 Mar 2016, at 19:25, Steve Kargl wrote:
> >
> > Consider this small piece of code:
> >
> > #include <fenv.h>
> > #include <stdio.h>
> >
> > float
> > foo()
> > {
> >    static const volatile float tiny = 1.e-30f;
> >    return (tiny * tiny);
> > }
> >
> > int
> > main(void)
> > {
> >    float x;
> >    feclearexcept(FE_ALL_EXCEPT);
> >    x = foo();
> >    if (fetestexcept(FE_UNDERFLOW)) printf("FE_UNDERFLOW: ");
> >    printf("x = %e\n", x);
> >    return 0;
> > }
> >
> > clang seems to get the underflow condition wrong.
> >
> > % cc -o z a.c -lm && ./z
> > FE_UNDERFLOW: x = 0.000000e+00
> >
> > % cc -O -o z a.c -lm && ./z
> > x = 1.000000e-60     <--- This is not a possible value!
> >
> > % gcc -o z a.c -lm && ./z
> > FE_UNDERFLOW: x = 0.000000e+00
> >
> > % gcc -O -o z a.c -lm && ./z
> > FE_UNDERFLOW: x = 0.000000e+00
>
> Hmm, this is an interesting one.  On amd64, it works as expected with
> clang, but there it always uses SSE, obviously:
>
> $ ./underflow-amd64
> FE_UNDERFLOW: x = 0.000000e+00
>
> The problem seems to be caused by the intermediate result being stored
> using fstpl instead of fstps, e.g. simplifying the sample program (to
> get rid of all the SSE stuff the fexxx() macros insert):
>
> int main(void)
> {
>   float x;
>   __uint16_t status;
>   __fnclex();
>   x = foo();
>   __fnstsw(&status);
>   printf("status: %#x\n", (unsigned)status);
>   printf("x = %e\n", x);
>   return 0;
> }
>
> With gcc, the assembly becomes:
>
> foo:
>         flds    tiny.1853
>         flds    tiny.1853
>         fmulp   %st, %st(1)
>         ret
> [...]
> main:
> [...]
>         fnclex
>         call    foo
>         fstps   12(%esp)
>         fnstsw  %ax
>
> In this case, fmulp does not generate an underflow, but the fstps will.
> With clang, the assembly becomes:
>
> foo:
>         flds    foo.tiny
>         fmuls   foo.tiny
>         retl
> [...]
> main:
>         subl    $24, %esp
>         fnclex
>         calll   foo
>         fstpl   12(%esp)        # 8-byte Folded Spill
>         fnstsw  22(%esp)
>
> So it's storing the intermediate result in a double, for some reason.
> The fnstsw will then result in zero, since there was no underflow at
> that point.
>
> I will submit a bug for this upstream, thanks for the report.
>

Thanks for the quick reply.  But it must be using an 80-bit
extended double, not a double, for storage.  This variation

#include <fenv.h>
#include <stdio.h>

int
main(void)
{
   int i;
// float x = 1.f;
   double x = 1.;
   i = 0;
   feclearexcept(FE_ALL_EXCEPT);
   do {
      x /= 2;
      i++;
   } while(!fetestexcept(FE_UNDERFLOW));
   if (fetestexcept(FE_UNDERFLOW)) printf("FE_UNDERFLOW: ");
   printf("x = %e after %d iterations\n", x, i);
   return 0;
}

yields

% cc -O -o z b.c -lm && ./z
FE_UNDERFLOW: x = 0.000000e+00 after 16435 iterations

It should be 1075 iterations: 2^-1074 is the smallest subnormal
double, so the 1075th halving is the first one that is inexact.

Note that there is a similar issue with overflow.  The upshot is
that clang on current is probably miscompiling libm.

-- 
Steve