Date: Fri, 7 Dec 2007 17:58:39 +1100 (EST)
From: Bruce Evans <brde@optusnet.com.au>
X-X-Sender: bde@delplex.bde.org
To: Steve Kargl <sgk@troutmask.apl.washington.edu>
In-Reply-To: <20071206231143.GA63969@troutmask.apl.washington.edu>
Message-ID: <20071207173222.D702@delplex.bde.org>
References: <20071012180959.GA36345@troutmask.apl.washington.edu>
	<20071206090833.GA95428@VARK.MIT.EDU>
	<20071206231143.GA63969@troutmask.apl.washington.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: freebsd-standards@freebsd.org
Subject: Re: [PATCH] hypotl, cabsl, and code removal in cabs

On Thu, 6 Dec 2007, Steve Kargl wrote:

> On Thu, Dec 06, 2007 at 04:08:33AM -0500, David Schultz wrote:
>> Also, umm, I've been busy and unable to pay attention for a while,
>> so forgive me if I'm missing something, but isn't it the case that
>> we don't have a sqrtl(), except for the gcc builtin on some
>> architectures?
>
> bde pointed me to the right file in src/libm/ieee that explains
> the rounding issues with hypotl.  I haven't had a chance to
> update my implementation to use extra care in the evaluation of
> a*a+b*b.

I fixed it in your mailbox for the float precision case.  (It is useful
to test algorithms for the float precision case, since only that case
can be tested reasonably exhaustively, though not actually exhaustively
for 2-arg functions like hypotf().  But after a lot of work, the debugged
version reduces to almost the fdlibm version except for different
style bugs.)

> As to the sqrtl question, I have an implementation that supposedly
> does correct rounding in all rounding modes.  It is restricted to
> 64-bit significand long doubles.  The code does not use bit twiddle;
> instead, it uses fenv.

This I haven't looked at closely.  I fear extreme slowness.  On
athlon-xp, fenv accesses take about 100 cycles each (129 for fldenv
and 89 for fstenv; thus > 200 for fldenv+fstenv in a C-level fenv
access), while bit twiddling instructions can be executed at up to 3
per cycle.  mxcsr accesses are much faster, but mxcsr just gives more
environment to handle for general C-level access functions, since the
i387 and the SSE environments must be maintained in parallel, even on
amd64 in case someone actually uses long doubles (SSE would suffice
without long doubles).

Anyway, the software version of sqrtl is irrelevant on
athlon-xp, since athlon-xp has sqrtl in hardware (takes 35 cycles).
Similarly for amd64, ia64 and possibly sparc64 (sparc64 has sqrt in
hardware so it hopefully has sqrtl in hardware).  arm and powerpc
apparently have long double == double, so the software version of sqrtl
is apparently only needed on ia64.

When gcc actually supports C99+IEC-mumble floating point,
rounding and setting exception flags will have to continue to be
handled using bit fiddling integer instructions or ordinary FP
instructions, possibly moved to the C fenv access functions, since
i387 fenv accesses are too slow to use for anything except
initialization.

Bruce