Date: Tue, 23 Oct 2012 17:00:17 +1100 (EST)
From: Bruce Evans
To: Warner Losh
Cc: src-committers@freebsd.org, Steve Kargl, svn-src-all@freebsd.org,
    Bruce Evans, svn-src-head@freebsd.org, Warner Losh
Subject: Re: svn commit: r241755 - head/lib/msun/src
In-Reply-To: <45589524-E249-43E3-91B7-6A78068208AD@bsdimp.com>
Message-ID: <20121023160721.O1282@besplex.bde.org>
References: <201210192246.q9JMkm4R092929@svn.freebsd.org>
    <20121020150917.I1095@besplex.bde.org>
    <18177777-6EE0-4103-98B0-272EFF98FE96@bsdimp.com>
    <20121022040651.GA49632@troutmask.apl.washington.edu>
    <20121022134003.GA52156@troutmask.apl.washington.edu>
    <45589524-E249-43E3-91B7-6A78068208AD@bsdimp.com>

On Mon, 22 Oct 2012, Warner Losh wrote:

> On Oct 22, 2012, at 7:40 AM, Steve Kargl wrote:
>> ...
>> BTW, besides bde's technical points, your change made
>> our sources different from OpenBSD, NetBSD, and new
>> project openlibm.  Diffing against the other trees
>> would become cluttered.
>
> BDE's technical points vary in quality and are difficult to argue
> with since they are so nit-picky. :(  I'd be happy to work through
> them, but some of the issues I just fundamentally disagree with.
> Since I backed out the comments, I've decided not to spend the time
> arguing, but do think that documenting the differences between the
> precisions would be good.  I started down this path because I thought
> expf was broken because it didn't match exp exactly...
>
> However, since he's implementing a new one, wouldn't that also have
> diffability issues too?

Steve is implementing it :-).  It would be completely different.  It is
already implemented, not quite perfectly, for expl(), and diffs between
that version and the float and double versions are unreadable.  Changing
the float and double versions to be like it would make the diffs
readable again.

Imperfections in it include its documentation consisting mainly of a
pointer to a much further-away place than a nearby source file -- to a
paper by Tang which is still behind the ACM paywall AFAIK.  It is
slightly simpler and more general than the fdlibm version (no
transformation through the apparently-magic R(z)), and well documented
by the paper, so it is easier to understand iff you have seen the paper
or know the general details.

Starting from scratch, I wouldn't go this way.  Translating the fdlibm
exp() directly to expl() would have given a good enough version.
Similarly for all functions in fdlibm.  The double precision versions
aren't perfect, but they are mostly good or very good.
(It's interesting that they keep getting better with each generation of
x86, because each generation has better support for the type of bit
fiddling that the fdlibm functions like to do.  Better means often
taking 1/2 or 1/3 as many cycles relative to the 2002 generation of
x86's, mainly by reducing pipeline stalls in instructions that fdlibm
probably liked to use because they were fast on sparc in 1992, but were
slow on x86 in 1992 and became relatively slower with pipelines on x86.)

But Steve didn't understand the fdlibm version when he started, and
didn't like the looks of it, so he wrote a completely different version.
We now have a version that is so much better than the fdlibm version
that it is silly to keep using the fdlibm version.  It takes about the
same time for expl() as for expf(), to create about 3 times as many
accurate bits internally and deliver 64 of them (it would be useful to
deliver more, but the API for this isn't established and expf() doesn't
have any more to deliver).  (This is for ld80; ld128 on at least sparc64
is so slow that it is unusable for almost all purposes and especially
unusable for optimizing expf().)  Normally, using the same algorithm,
you have to work hard for long double precision to be less than 4 times
slower than float precision.

Note that i386 doesn't even use fdlibm for exp().  It uses the i387 for
"efficiency".  But with newer x86, even fdlibm's slow version is faster
than the i387.  We never used the i387 for expf() on i386 because
optimizing expf() wasn't considered important until after x86's became
new enough for their hardware expf to be slower than fdlibm software
expf, though we almost imported this from NetBSD.  The i387 is unusable
for expl() on i386 since it is barely accurate enough for exp().

Lots of other i387 "optimized" versions on i386 should be removed.
There are just a couple of them that are more efficient or more accurate
(usually not both) than the fdlibm versions on modern x86.
amd64 never used most of the ones that should be removed, though they
would have been relatively more accurate for amd64.  But it takes
courage to axe working versions :-).

Bruce