Date: Fri, 7 Dec 2007 17:58:39 +1100 (EST)
From: Bruce Evans
To: Steve Kargl
Cc: freebsd-standards@freebsd.org
Subject: Re: [PATCH] hypotl, cabsl, and code removal in cabs
Message-ID: <20071207173222.D702@delplex.bde.org>
In-Reply-To: <20071206231143.GA63969@troutmask.apl.washington.edu>

On Thu, 6 Dec 2007, Steve Kargl wrote:

> On Thu, Dec 06, 2007 at 04:08:33AM -0500, David Schultz wrote:
>> Also, umm, I've been busy and unable to pay attention for a while,
>> so forgive me if I'm missing something, but isn't it the case that
>> we don't have a sqrtl(), except for the gcc builtin on some
>> architectures?
>
> bde pointed me to the right file in src/libm/ieee that explains
> the rounding issues with hypotl.  I haven't had a chance to
> update my implementation to use extra care in the evaluation of
> a*a+b*b.

I fixed it in your mailbox for the float precision case.  (It is useful
to test algorithms for the float precision case, since only that case
can be tested reasonably exhaustively (not actually exhaustively for
2-arg functions like hypotf()).  But after a lot of work, the debugged
version reduces to almost the fdlibm version except for different style
bugs.)

> As to the sqrtl question, I have an implementation that supposedly
> does correct rounding in all rounding modes.  It is restricted to
> 64-bit significand long doubles.  The code does not use bit twiddle;
> instead, it uses fenv.

This I haven't looked at closely.  I fear extreme slowness.  On
athlon-xp, fenv accesses take about 100 cycles each (129 for fldenv and
89 for fstenv; thus > 200 for fldenv+fstenv in a C-level fenv access),
while bit twiddling instructions can be executed at up to 3 per cycle.
mxcsr accesses are much faster, but mxcsr just adds more environment to
handle for general C-level access functions, since the i387 and the SSE
environments must be maintained in parallel, even on amd64 in case
someone actually uses long doubles (SSE would suffice without long
doubles).

Anyway, the software version of sqrtl is irrelevant on athlon-xp, since
athlon-xp has sqrtl in hardware (takes 35 cycles).  Similarly for amd64,
ia64 and possibly sparc64 (sparc64 has sqrt in hardware so it hopefully
has sqrtl in hardware).  arm and powerpc apparently have long double ==
double, so the software version of sqrtl is apparently only needed on
ia64.
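
For readers unfamiliar with the "bit twiddling" being contrasted with
fenv accesses here, the following is a minimal sketch of the idea in the
float precision case.  It is not code from the patch or from fdlibm; the
helper names (float_to_bits, bits_to_float, next_up_nonneg,
float_exponent) are made up for illustration, and it assumes float is an
IEEE 754 binary32 value.  It only shows that stepping a value by one ulp
or reading its exponent can be done with plain integer instructions,
without touching the FP environment.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers; assume float is IEEE 754 binary32. */

/* Copy the bits of a float into an integer (memcpy avoids aliasing problems). */
static uint32_t float_to_bits(float x)
{
    uint32_t u;
    memcpy(&u, &x, sizeof u);
    return u;
}

/* Copy an integer bit pattern back into a float. */
static float bits_to_float(uint32_t u)
{
    float x;
    memcpy(&x, &u, sizeof x);
    return x;
}

/*
 * Next representable float above a non-negative finite x.
 * Adding 1 to the bit pattern steps by one ulp; a carry out of the
 * significand field moves into the exponent field, which is exactly
 * what crossing a power of two requires.
 */
static float next_up_nonneg(float x)
{
    uint32_t u = float_to_bits(x);
    assert(u < 0x7f800000u);    /* rejects negative values, Inf and NaN */
    return bits_to_float(u + 1);
}

/* Unbiased exponent of a normal float, read from bits 23..30. */
static int float_exponent(float x)
{
    return (int)((float_to_bits(x) >> 23) & 0xff) - 127;
}
```

For example, next_up_nonneg(1.0f) gives 1 + 2^-23, and applying it to the
largest float below 2 gives exactly 2.0f, using only an integer add.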

When gcc actually supports C99+IEC-mumble floating point, rounding and
setting exception flags will have to continue to be handled using bit
fiddling integer instructions or ordinary FP instructions, possibly
moved to the C fenv access functions, since i387 fenv accesses are too
slow to use for anything except initialization.

Bruce