From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 18:05:56 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 91816106566B for ; Sun, 12 Aug 2012 18:05:56 +0000 (UTC) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by mx1.freebsd.org (Postfix) with ESMTP id 5350B8FC0A for ; Sun, 12 Aug 2012 18:05:56 +0000 (UTC) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q7CI5n7n002623 for ; Sun, 12 Aug 2012 13:05:49 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <5027F07E.9060409@missouri.edu> Date: Sun, 12 Aug 2012 13:05:50 -0500 From: Stephen Montgomery-Smith User-Agent: Mozilla/5.0 (X11; Linux i686; rv:14.0) Gecko/20120714 Thunderbird/14.0 MIME-Version: 1.0 To: freebsd-numerics@freebsd.org References: <5017111E.6060003@missouri.edu> <501C361D.4010807@missouri.edu> <20120804165555.X1231@besplex.bde.org> <501D51D7.1020101@missouri.edu> <20120805030609.R3101@besplex.bde.org> <501D9C36.2040207@missouri.edu> <20120805175106.X3574@besplex.bde.org> <501EC015.3000808@missouri.edu> <20120805191954.GA50379@troutmask.apl.washington.edu> <20120807205725.GA10572@server.rulingia.com> <20120809025220.N4114@besplex.bde.org> In-Reply-To: <20120809025220.N4114@besplex.bde.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: Complex arg-trig functions X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 12 Aug 2012 18:05:56 -0000

Having brooded over the code for too many weeks, I now think I have finished my complex arg-trig functions. I have also written versions for float and long double. So I am ready to have the code reviewed.

http://people.freebsd.org/~stephen/

The long double versions require a logl and a log1pl, which I faked using mpfr.

The float versions are more complicated, because FLT_EPSILON is too close to the 4th root of FLT_MIN. It is simpler to make the float versions wrappers for the double versions. But I wrote the float versions anyway, just in case some purist insists that the wrapper approach is morally wrong.

From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 21:56:58 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id F41671065675 for ; Sun, 12 Aug 2012 21:56:57 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id 5F17E8FC0C for ; Sun, 12 Aug 2012 21:56:56 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CLusO9075316 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 07:56:55 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CLum4O019986 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 07:56:48 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CLumUZ019985 for
freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 07:56:48 +1000 (EST) (envelope-from peter) Date: Mon, 13 Aug 2012 07:56:48 +1000 From: Peter Jeremy To: freebsd-numerics@freebsd.org Message-ID: <20120812215648.GA19810@server.rulingia.com> References: <20120717225328.GA86902@server.rulingia.com> <20120717232740.GA95026@troutmask.apl.washington.edu> <20120718001337.GA87817@server.rulingia.com> <20120718123627.D1575@besplex.bde.org> <20120722121219.GC73662@server.rulingia.com> <500DAD41.5030104@missouri.edu> <20120724113214.G934@besplex.bde.org> <501204AD.30605@missouri.edu> <20120727032611.GB25690@server.rulingia.com> <20120728125824.GA26553@server.rulingia.com> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="Q68bSM7Ycu6FN28Q" Content-Disposition: inline In-Reply-To: <20120728125824.GA26553@server.rulingia.com> X-PGP-Key: http://www.rulingia.com/keys/peter.pgp User-Agent: Mutt/1.5.21 (2010-09-15) Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 12 Aug 2012 21:56:58 -0000

--Q68bSM7Ycu6FN28Q
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On 2012-Jul-28 22:58:24 +1000, Peter Jeremy wrote:
>My test harness can be found at http://www.rulingia.com/~peter/ctest.c
>There are no special compilation options, it just needs to be linked
>with '-lm' (and '-ldl' on Linux). For normal use, just run the
>executable - it will report any failures. For "finite" arguments, it
>currently uses 3π/4 and 32769 other random numbers (the latter is
>S_COUNT+1).

stephen@ found that it wasn't checking long-double NAN results (it checked float results twice due to a typo).
I've fixed that and can't see any other similar errors. An updated version can be found at http://www.rulingia.com/~peter/ctest.c

Also, because it uses dlfunc() to locate functions for testing, all test functions have to be in shared libraries - it won't detect functions directly linked in. This means you need Makefile entries similar to:

ctest: ctest.c libfoo.so
	cc ${CFLAGS} -Wl,-rpath=. ctest.c -o ctest -L. -lfoo -lm

libfoo.so: foo.c
	cc ${CFLAGS} -fPIC -c foo.c
	cc ${CFLAGS} -shared -Wl,-soname,libfoo.so -o libfoo.so foo.o

And, since I don't think I mentioned it before, it does bitwise comparisons of finite results. Particularly in cases where the result is a transcendental number, this can result in false positives where the result is out by a few ULP. You will need to interpret the actual report to decide whether the result is acceptable.

-- 
Peter Jeremy

--Q68bSM7Ycu6FN28Q
Content-Type: application/pgp-signature

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (FreeBSD)

iEYEARECAAYFAlAoJqAACgkQ/opHv/APuIdqGACdE2bxIMKI1o1qIQKEcs3gaS1G
9qQAoKVbKzrHD2f9OKZ5C2HHMARh+u+0
=PGQt
-----END PGP SIGNATURE-----

--Q68bSM7Ycu6FN28Q--

From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 22:35:55 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 3AA75106566B for ; Sun, 12 Aug 2012 22:35:55 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id D51158FC08 for ; Sun, 12 Aug 2012 22:35:52 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CMZq7R075408 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 08:35:52 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham,
spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CMZhSR020538 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 08:35:43 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CMZgmr020537 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 08:35:42 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 08:35:42 +1000 Resent-Message-ID: <20120812223542.GA20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org From: Peter Jeremy To: Steve Kargl Message-ID: <20120713114100.GB83006@server.rulingia.com> References: <20120529045612.GB4445@server.rulingia.com> <20120708124047.GA44061@zim.MIT.EDU> <210816F0-7ED7-4481-ABFF-C94A700A3EA0@bsdimp.com> <20120708233624.GA53462@troutmask.apl.washington.edu> <4FFBF16D.2030007@gwdg.de> <2A1DE516-ABB4-49D7-8C3D-2C4DA2D9FCF5@bsdimp.com> <20120711212009.GA15542@night.db.net> <20120711214346.GA9877@troutmask.apl.washington.edu> <20120711215414.GA16350@night.db.net> <20120711223247.GA9964@troutmask.apl.washington.edu> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="6sX45UoQRIJXqkqR" Content-Disposition: inline In-Reply-To: <20120711223247.GA9964@troutmask.apl.washington.edu> X-PGP-Key: http://www.rulingia.com/keys/peter.pgp User-Agent: Mutt/1.5.21 (2010-09-15) Cc: Diane Bruce , Rainer Hurling , David Schultz , Warner Losh , freebsd-current@freebsd.org Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 22:35:55 -0000 X-Original-Date: Fri, 13 Jul 2012 21:41:00 +1000 X-List-Received-Date: Sun, 12 Aug 2012 22:35:55 -0000

--6sX45UoQRIJXqkqR
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On 2012-Jul-11 15:32:47 -0700, Steve Kargl wrote:
>I know an approach to implementing many of the missing
>functions.

Are you willing to share this insight so someone else could do the work?

> When I do find
>some free time, I look at what is missing and start to
>put together a new function. At the moment, it seems
>that it takes 3+ years to get a new function written,
>tested, and committed.

And, from what I can see, much of this is done quietly - which opens up the possibility that two people might both implement the same code or that people will avoid the area in fear of treading on someone else's toes. As I said previously, I believe the existing wiki page could be improved to form a central co-ordinating point to show what activity is (or isn't) occurring.

>but most people seem to push the "easy button" and want
>to grab either cephes or netlib's libm. There are
>technical issues with this approach that I won't
>rehash again.

Doing it properly requires significant effort by people with fairly specialised skills. Whilst the project has several people with the skills, it appears that none of them currently have the time. In the meantime, FreeBSD is taking free kicks from other FOSS groups that have gone down the quick-and-dirty path.

AFAIK, none of the relevant standards (POSIX, IEEE754) have any precision requirements for functions other than +-*/ and sqrt() - all of which we have correctly implemented.
I therefore believe that, for the remaining missing functions, the Project would be best served by committing the best code that is currently available under a suitable license and cleaning it up over time (as was done for the current libm). --=20 Peter Jeremy --6sX45UoQRIJXqkqR Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (FreeBSD) iEYEARECAAYFAlAACUwACgkQ/opHv/APuIcwqgCgwLaUHwzv44xZgBxteeiYX9U/ uTgAnj55TtruaclDQ+wAXqqWQOqwcY1a =wEeu -----END PGP SIGNATURE----- --6sX45UoQRIJXqkqR-- From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 22:36:42 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D0EE71065674 for ; Sun, 12 Aug 2012 22:36:42 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id 7358F8FC0C for ; Sun, 12 Aug 2012 22:36:42 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CMag4Q075413 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 08:36:42 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CMaZrT020558 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 08:36:35 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CMaZZg020557 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 08:36:35 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 08:36:35 +1000 Resent-Message-ID: 
<20120812223635.GB20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org From: Peter Jeremy To: David Schultz Message-ID: <20120716232519.GA66913@server.rulingia.com> References: <20120529045612.GB4445@server.rulingia.com> <20120711223247.GA9964@troutmask.apl.washington.edu> <20120713114100.GB83006@server.rulingia.com> <201207130818.38535.jhb@freebsd.org> <9EB2DA4F-19D7-4BA5-8811-D9451CB1D907@theravensnest.org> <20120713155805.GC81965@zim.MIT.EDU> <20120714120432.GA70706@server.rulingia.com> <20120717084457.U3890@besplex.bde.org> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="MfFXiAuoTsnnDAfZ" Content-Disposition: inline In-Reply-To: <20120717084457.U3890@besplex.bde.org> X-PGP-Key: http://www.rulingia.com/keys/peter.pgp User-Agent: Mutt/1.5.21 (2010-09-15) X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: Diane Bruce , Bruce Evans , John Baldwin , David Chisnall , Stephen Montgomery-Smith , Bruce Evans , Steve Kargl , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 22:36:42 -0000 X-Original-Date: Tue, 17 Jul 2012 09:25:19 +1000 X-List-Received-Date: Sun, 12 Aug 2012 22:36:42 -0000 --MfFXiAuoTsnnDAfZ Content-Type: multipart/mixed; boundary="W/nzBZO5zC0uMSeA" Content-Disposition: inline --W/nzBZO5zC0uMSeA Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On 2012-Jul-17 09:08:56 +1000, Bruce Evans wrote: >Somone offered to look at inverse trig/hyperbolic (?) complex functions. >Apparently they got scared. They are not nearly a simple matter of >using the formulas. 
At least not if you want decent accuracy and no surprises across the function domains. WG14/N1256 lists all the special cases, which helps. >On Sat, 14 Jul 2012, Peter Jeremy wrote: >> Attached is my suggestion for adding to the upcoming FreeBSD Status >> Report. Feel free to change the contact to myself or make other >> changes as you see fit and forward it. > >Nothing was attached :-). Sorry. I keep doing that. --=20 Peter Jeremy --W/nzBZO5zC0uMSeA-- --MfFXiAuoTsnnDAfZ Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (FreeBSD) iEYEARECAAYFAlAEot8ACgkQ/opHv/APuIdgVgCeIXcUx7FvjcAg90CiZRRspgwp 6+8An04Ds6fhZ5/ovl7yqayTHkyQ/P0/ =tLIs -----END PGP SIGNATURE----- --MfFXiAuoTsnnDAfZ-- From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 22:37:22 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 62CFD1065675 for ; Sun, 12 Aug 2012 22:37:22 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id E2D178FC16 for ; Sun, 12 Aug 2012 22:37:21 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CMbLIJ075419 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 08:37:21 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CMbFRx020577 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 08:37:15 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CMbFba020576 for 
freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 08:37:15 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 08:37:15 +1000 Resent-Message-ID: <20120812223715.GC20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6H41Oc9069952 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Tue, 17 Jul 2012 14:01:24 +1000 (EST) (envelope-from sgk@troutmask.apl.washington.edu) Received: from troutmask.apl.washington.edu (troutmask.apl.washington.edu [128.95.76.21]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q6H41L5d065863 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Tue, 17 Jul 2012 14:01:23 +1000 (EST) (envelope-from sgk@troutmask.apl.washington.edu) Received: from troutmask.apl.washington.edu (localhost.apl.washington.edu [127.0.0.1]) by troutmask.apl.washington.edu (8.14.5/8.14.5) with ESMTP id q6H41Jl5086881; Mon, 16 Jul 2012 21:01:19 -0700 (PDT) (envelope-from sgk@troutmask.apl.washington.edu) Received: (from sgk@localhost) by troutmask.apl.washington.edu (8.14.5/8.14.5/Submit) id q6H41Ipk086880; Mon, 16 Jul 2012 21:01:18 -0700 (PDT) (envelope-from sgk) From: Steve Kargl To: Stephen Montgomery-Smith Message-ID: <20120717040118.GA86840@troutmask.apl.washington.edu> References: <20120711223247.GA9964@troutmask.apl.washington.edu> <20120713114100.GB83006@server.rulingia.com> <201207130818.38535.jhb@freebsd.org> <9EB2DA4F-19D7-4BA5-8811-D9451CB1D907@theravensnest.org> <20120713155805.GC81965@zim.MIT.EDU> <20120714120432.GA70706@server.rulingia.com> <20120717084457.U3890@besplex.bde.org> <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <5004DEA9.1050001@missouri.edu> User-Agent: Mutt/1.4.2.3i Cc: Diane Bruce , John Baldwin 
, David Chisnall , Bruce Evans , Bruce Evans , David Schultz , Peter Jeremy , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 22:37:22 -0000 X-Original-Date: Mon, 16 Jul 2012 21:01:18 -0700 X-List-Received-Date: Sun, 12 Aug 2012 22:37:22 -0000

On Mon, Jul 16, 2012 at 10:40:25PM -0500, Stephen Montgomery-Smith wrote:
>
> I came up with pseudo code that looks a bit like this.
>
> complex casinh(complex z) {
>     double x = z.re, y = z.im;
>
>     if (y==0)
>         return asinh(x);
>     if (x==0) {
>         if (fabs(y)<=1) return I*asin(y);
>         else return signum(y)* (
>             log(fabs(y)+sqrt(y*y-1))
>             + I*PI/2);

Stop. Please see msun/src/math_private.h. You cannot use I in any expression. gcc in base and clang do not do the arithmetic correctly. See msun/src/s_ccosh.c for how one might approach writing these functions. Also, consult n1256.pdf for x,y = +-0, +-inf, nan. There are specific requirements that must be met.
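
A minimal sketch of how a complex value can be assembled without any arithmetic on I, in the spirit of the objection above. It relies on the C99 guarantee that a double complex has the same representation as an array of two doubles, real part first. The names pack_complex and inf_minus_zero are hypothetical and chosen for illustration only; this is loosely modeled on the kind of helper math_private.h provides, not copied from it.

```c
#include <assert.h>
#include <complex.h>
#include <math.h>

/*
 * Hypothetical helper: build x + iy by storing the two parts directly.
 * No multiplication by I takes place, so an infinite or signed-zero
 * part is stored exactly as given.
 */
static double complex
pack_complex(double x, double y)
{
	union {
		double complex z;
		double part[2];	/* part[0] = real, part[1] = imaginary */
	} u;

	/* Layout assumption: a double complex is two adjacent doubles. */
	assert(sizeof(double complex) == 2 * sizeof(double));
	u.part[0] = x;
	u.part[1] = y;
	return (u.z);
}

/* Example: +Inf + i*(-0), a value that is awkward to spell using I. */
static double complex
inf_minus_zero(void)
{
	return (pack_complex(INFINITY, -0.0));
}
```

With a helper of this shape, a term such as I*asin(y) in the pseudo code would be written along the lines of pack_complex(0.0, asin(y)), subject to whatever sign the real zero is required to have.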
-- Steve From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 22:37:39 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 4F5D7106566B for ; Sun, 12 Aug 2012 22:37:39 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id CDE2A8FC08 for ; Sun, 12 Aug 2012 22:37:38 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CMbcuM075423 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 08:37:38 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CMbVlX020584 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 08:37:31 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CMbV2c020583 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 08:37:31 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 08:37:31 +1000 Resent-Message-ID: <20120812223731.GD20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6H4CXQW070041 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Tue, 17 Jul 2012 14:12:33 +1000 (EST) (envelope-from imp@bsdimp.com) Received: from mail-gg0-f177.google.com (mail-gg0-f177.google.com [209.85.161.177]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q6H4CUhu065892 (version=TLSv1/SSLv3 
cipher=RC4-SHA bits=128 verify=OK) for ; Tue, 17 Jul 2012 14:12:33 +1000 (EST) (envelope-from imp@bsdimp.com) Received: by ggcs5 with SMTP id s5so6243697ggc.36 for ; Mon, 16 Jul 2012 21:12:24 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=sender:subject:mime-version:content-type:from:in-reply-to:date:cc :content-transfer-encoding:message-id:references:to:x-mailer :x-gm-message-state; bh=kINev5HP/nExsHpfB9JvNN+ucoEDgSZ4ULGnyl+tuSk=; b=mCZdIPUcoCX+xklQV1FoKB8lIDnXqs8H1UM3OPPufSr7Q2LvSOXX649QcapRFk7QTk PfjVj6EtPKlfZHhmWPNfTMjSaMgFDyjI9vq5xcirH4RTW4WN0WSBF4DLepfG0D1IJ9MK QpZBkLqTScvFsl/Xz04vwdnMPrXbiidHlJ3c88ByXglXtJriu6EON/uBjCVdUMDjHQ+r O1AQJaem/7o0ksLgOuvznkfh/Z4Er94CnEPy1FjdJBc/YcLaXpntm0+VKdypaIzpCsgu ggY5oYG9DA7277bIWrDzEzzGwN+MjnyQdIUUXfllO3XQ5llhuMD/WTUwZZ3vDkZjjrPa YJUw== Received: by 10.66.76.231 with SMTP id n7mr1970636paw.68.1342498343866; Mon, 16 Jul 2012 21:12:23 -0700 (PDT) Received: from [172.16.1.217] ([4.53.41.66]) by mx.google.com with ESMTPS id ot4sm13181996pbb.65.2012.07.16.21.12.22 (version=TLSv1/SSLv3 cipher=OTHER); Mon, 16 Jul 2012 21:12:23 -0700 (PDT) Sender: Warner Losh Mime-Version: 1.0 (Apple Message framework v1084) Content-Type: text/plain; charset=us-ascii From: Warner Losh In-Reply-To: <20120717040118.GA86840@troutmask.apl.washington.edu> Message-Id: <6F750F84-34FF-4961-A2EA-F3E67A6872AE@bsdimp.com> References: <20120711223247.GA9964@troutmask.apl.washington.edu> <20120713114100.GB83006@server.rulingia.com> <201207130818.38535.jhb@freebsd.org> <9EB2DA4F-19D7-4BA5-8811-D9451CB1D907@theravensnest.org> <20120713155805.GC81965@zim.MIT.EDU> <20120714120432.GA70706@server.rulingia.com> <20120717084457.U3890@besplex.bde.org> <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717040118.GA86840@troutmask.apl.washington.edu> To: Steve Kargl X-Mailer: Apple Mail (2.1084) X-Gm-Message-State: ALoCoQkGeHx4TgLmtguEy3nTXwlADeCEh75sl1lGJfgjth/HEmzfarhzj0KBE9kd8wm6EjB5ADvp 
Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by server.rulingia.com id q6H4CXQW070041 Cc: Diane Bruce , John Baldwin , David Chisnall , Stephen Montgomery-Smith , Bruce Evans , Bruce Evans , David Schultz , Peter Jeremy Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 22:37:39 -0000 X-Original-Date: Mon, 16 Jul 2012 22:12:21 -0600 X-List-Received-Date: Sun, 12 Aug 2012 22:37:39 -0000

On Jul 16, 2012, at 10:01 PM, Steve Kargl wrote:
> On Mon, Jul 16, 2012 at 10:40:25PM -0500, Stephen Montgomery-Smith wrote:
>>
>> I came up with pseudo code that looks a bit like this.
>>
>> complex casinh(complex z) {
>>     double x = z.re, y = z.im;
>>
>>     if (y==0)
>>         return asinh(x);
>>     if (x==0) {
>>         if (fabs(y)<=1) return I*asin(y);
>>         else return signum(y)* (
>>             log(fabs(y)+sqrt(y*y-1))
>>             + I*PI/2);
>
> Stop. Please see msun/src/math_private.h. You cannot
> use I in any expression. gcc in base and clang do not
> do the arithmetic correctly. See msun/src/s_ccosh.c
> for how one might approach writing these functions.
> Also, consult n1256.pdf for x,y = +-0, +-inf, nan.
> There are specific requirements that must be met.

Yes. Pseudo code is OK for following the flow, but look at the exp code for why that's not entirely sufficient. You have to be extremely fussy about all kinds of things. Then again, exp is a lot more important to get right than the complex trig functions...

The pseudo code is a good place to start, but it is just the barest start in the integration process...
Warner From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 22:42:41 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 2734E106566B for ; Sun, 12 Aug 2012 22:42:41 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id A40CC8FC12 for ; Sun, 12 Aug 2012 22:42:40 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CMgeSx075438 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 08:42:40 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CMgXsh020658 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 08:42:33 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CMgXQM020657 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 08:42:33 +1000 (EST) (envelope-from peter) Date: Mon, 13 Aug 2012 08:42:33 +1000 From: Peter Jeremy To: freebsd-numerics@freebsd.org Message-ID: <20120812224233.GA20516@server.rulingia.com> References: <20120529045612.GB4445@server.rulingia.com> <20120711223247.GA9964@troutmask.apl.washington.edu> <20120713114100.GB83006@server.rulingia.com> <201207130818.38535.jhb@freebsd.org> <9EB2DA4F-19D7-4BA5-8811-D9451CB1D907@theravensnest.org> <20120713155805.GC81965@zim.MIT.EDU> <20120714120432.GA70706@server.rulingia.com> <20120717084457.U3890@besplex.bde.org> <20120716232519.GA66913@server.rulingia.com> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; 
protocol="application/pgp-signature"; boundary="3MwIy2ne0vdjdPXF" Content-Disposition: inline In-Reply-To: <20120716232519.GA66913@server.rulingia.com> X-PGP-Key: http://www.rulingia.com/keys/peter.pgp User-Agent: Mutt/1.5.21 (2010-09-15) Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 12 Aug 2012 22:42:41 -0000

--3MwIy2ne0vdjdPXF
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

Following agreement by participants, I am pushing the contents of an off-list discussion to the list to ensure that it is available in the archives. Apologies to the participants who will have already seen this.

-- 
Peter Jeremy

--3MwIy2ne0vdjdPXF
Content-Type: application/pgp-signature

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (FreeBSD)

iEYEARECAAYFAlAoMVkACgkQ/opHv/APuIemVACfe78wJIQNr/cHN2MogAMOBgvP
SdEAoJ+otgOz7rsX9BZjLLvlntoBspYg
=Svqq
-----END PGP SIGNATURE-----

--3MwIy2ne0vdjdPXF--

From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 22:54:38 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id BD171106564A for ; Sun, 12 Aug 2012 22:54:38 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id 248B48FC08 for ; Sun, 12 Aug 2012 22:54:36 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CMsaZw075467 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 08:54:36
+1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CMsSGr020792 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 08:54:29 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CMsS3H020791 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 08:54:28 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 08:54:28 +1000 Resent-Message-ID: <20120812225428.GE20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org From: Peter Jeremy Mail-Followup-To: freebsd-numerics@freebsd.org To: Steve Kargl Message-ID: <20120717042125.GF66913@server.rulingia.com> References: <20120713114100.GB83006@server.rulingia.com> <201207130818.38535.jhb@freebsd.org> <9EB2DA4F-19D7-4BA5-8811-D9451CB1D907@theravensnest.org> <20120713155805.GC81965@zim.MIT.EDU> <20120714120432.GA70706@server.rulingia.com> <20120717084457.U3890@besplex.bde.org> <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717040118.GA86840@troutmask.apl.washington.edu> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="WlEyl6ow+jlIgNUh" Content-Disposition: inline In-Reply-To: <20120717040118.GA86840@troutmask.apl.washington.edu> X-PGP-Key: http://www.rulingia.com/keys/peter.pgp User-Agent: Mutt/1.5.21 (2010-09-15) Cc: Diane Bruce , John Baldwin , David Chisnall , Stephen Montgomery-Smith , Bruce Evans , Bruce Evans , David Schultz , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 22:54:38 -0000 X-Original-Date: Tue, 17 Jul 2012 14:21:25 +1000 X-List-Received-Date: Sun, 12 Aug 2012 22:54:38 -0000 --WlEyl6ow+jlIgNUh Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On 2012-Jul-16 21:01:18 -0700, Steve Kargl wrote: >On Mon, Jul 16, 2012 at 10:40:25PM -0500, Stephen Montgomery-Smith wrote: >> >> I came up with pseudo code that looks a bit like this. ... >Stop. Please see msun/src/math_private.h. You cannot >use I in any expression. Note the "pseudo code" reference. I agree that the actual code has to jump through hoops to avoid compiler issues with complex arithmetic but doing that for pseudocode just obfuscates it. >Also, consult n1256.pdf for x,y = +-0, +-inf, nan. >There are specific requirements that must be met. Again, handling the special cases listed in G.6 is all just boilerplate code that we can take as assumed for pseudocode. IMO, it would be nice if we could come up with a formal, compact (one/line per rule) representation of G.6 that could be pre-processed into wrapper code that handles all the 0/Inf/NaN special-cases and then calls FOO_finite() which has the hand-written code to handle "normal" cases.
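The "one line per rule" idea above can be sketched for catanh() as a lookup table indexed by the class of each part. This is only an illustration, not code from this thread: the names cparts, mk_cplx, catanh_rules and catanh_special are all hypothetical, the table is a reading of WG14/N1256 G.6.2.3 that should be checked against the standard, and floating-point exception flags are ignored entirely. The catanh(+1 + i0) rule depends on a value rather than a class, so it is left to the finite-case code.

```c
#include <complex.h>
#include <math.h>

/* Hypothetical helpers: pack/unpack parts without complex arithmetic. */
typedef union {
	double complex z;
	double p[2];		/* p[0] = real part, p[1] = imaginary part */
} cparts;

static double complex
mk_cplx(double re, double im)
{
	cparts u;

	u.p[0] = re;
	u.p[1] = im;
	return (u.z);
}

enum cls { C_ZERO, C_FIN, C_INF, C_NAN };	/* C_FIN: nonzero finite */
enum val { V_NONE, V_ZERO, V_PIO2, V_NAN };	/* V_NONE: not special */

struct rule {
	enum val re, im;
};

/* Rows: class of x.  Columns: class of y (zero, finite, Inf, NaN). */
static const struct rule catanh_rules[4][4] = {
	{ {V_ZERO, V_ZERO}, {V_NONE, V_NONE}, {V_ZERO, V_PIO2}, {V_ZERO, V_NAN} },
	{ {V_NONE, V_NONE}, {V_NONE, V_NONE}, {V_ZERO, V_PIO2}, {V_NAN, V_NAN} },
	{ {V_ZERO, V_PIO2}, {V_ZERO, V_PIO2}, {V_ZERO, V_PIO2}, {V_ZERO, V_NAN} },
	{ {V_NAN, V_NAN}, {V_NAN, V_NAN}, {V_ZERO, V_PIO2}, {V_NAN, V_NAN} },
};

static enum cls
classify(double v)
{
	if (isnan(v))
		return (C_NAN);
	if (isinf(v))
		return (C_INF);
	return (v == 0 ? C_ZERO : C_FIN);
}

static double
materialise(enum val v, double sign_from)
{
	double r;

	r = (v == V_PIO2) ? 1.5707963267948966 : (v == V_NAN) ? NAN : 0.0;
	/* catanh is odd and conj-symmetric: each part takes its argument's sign. */
	return (signbit(sign_from) ? -r : r);
}

/* Returns 1 and stores the result if z is a special case, else 0. */
static int
catanh_special(double complex z, double complex *res)
{
	cparts u;
	const struct rule *r;

	u.z = z;
	r = &catanh_rules[classify(u.p[0])][classify(u.p[1])];
	if (r->re == V_NONE)
		return (0);
	*res = mk_cplx(materialise(r->re, u.p[0]), materialise(r->im, u.p[1]));
	return (1);
}
```

A wrapper would call catanh_special() first and fall through to a hand-written finite routine when it returns 0. Whether such a table can also carry the exception requirements is exactly the kind of detail the table alone does not settle.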
-- Peter Jeremy --WlEyl6ow+jlIgNUh Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (FreeBSD) iEYEARECAAYFAlAE6EUACgkQ/opHv/APuIf6qQCfW6VSxZMiXmQUuo9thoIW+con uW4An2GQ54eREWJTsW7C/nF46zrSgWaV =lDPy -----END PGP SIGNATURE----- --WlEyl6ow+jlIgNUh-- From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 22:56:06 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id C52AB1065673 for ; Sun, 12 Aug 2012 22:56:06 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id 5D8A18FC0A for ; Sun, 12 Aug 2012 22:56:05 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CMu518075488 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 08:56:05 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CMtx2J020844 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 08:55:59 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CMtx2Z020843 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 08:55:59 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 08:55:59 +1000 Resent-Message-ID: <20120812225559.GF20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6H4crIo070263 (version=TLSv1/SSLv3
cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Tue, 17 Jul 2012 14:38:54 +1000 (EST) (envelope-from sgk@troutmask.apl.washington.edu) Received: from troutmask.apl.washington.edu (troutmask.apl.washington.edu [128.95.76.21]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q6H4cos0065948 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Tue, 17 Jul 2012 14:38:52 +1000 (EST) (envelope-from sgk@troutmask.apl.washington.edu) Received: from troutmask.apl.washington.edu (localhost.apl.washington.edu [127.0.0.1]) by troutmask.apl.washington.edu (8.14.5/8.14.5) with ESMTP id q6H4cm3s087072; Mon, 16 Jul 2012 21:38:48 -0700 (PDT) (envelope-from sgk@troutmask.apl.washington.edu) Received: (from sgk@localhost) by troutmask.apl.washington.edu (8.14.5/8.14.5/Submit) id q6H4cmYs087071; Mon, 16 Jul 2012 21:38:48 -0700 (PDT) (envelope-from sgk) From: Steve Kargl Mail-Followup-To: freebsd-numerics@freebsd.org To: Peter Jeremy Message-ID: <20120717043848.GB87001@troutmask.apl.washington.edu> References: <201207130818.38535.jhb@freebsd.org> <9EB2DA4F-19D7-4BA5-8811-D9451CB1D907@theravensnest.org> <20120713155805.GC81965@zim.MIT.EDU> <20120714120432.GA70706@server.rulingia.com> <20120717084457.U3890@besplex.bde.org> <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717040118.GA86840@troutmask.apl.washington.edu> <20120717042125.GF66913@server.rulingia.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20120717042125.GF66913@server.rulingia.com> User-Agent: Mutt/1.4.2.3i Cc: Diane Bruce , John Baldwin , David Chisnall , Stephen Montgomery-Smith , Bruce Evans , Bruce Evans , David Schultz , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 22:56:06 -0000 X-Original-Date: Mon, 16 Jul 2012 21:38:48 -0700 X-List-Received-Date: Sun, 12 Aug 2012 22:56:06 -0000 On Tue, Jul 17, 2012 at 02:21:25PM +1000, Peter Jeremy wrote: > On 2012-Jul-16 21:01:18 -0700, Steve Kargl wrote: > >On Mon, Jul 16, 2012 at 10:40:25PM -0500, Stephen Montgomery-Smith wrote: > >> > >> I came up with pseudo code that looks a bit like this. > ... > >Stop. Please see msun/src/math_private.h. You cannot > >use I in any expression. > > Note the "pseudo code" reference. I agree that the actual code has to > jump through hoops to avoid compiler issues with complex arithmetic > but doing that for pseudocode just obfuscates it. > > >Also, consult n1256.pdf for x,y = +-0, +-inf, nan. > >There are specific requirements that must be met. > > Again, handling the special cases listed in G.6 is all just > boilerplate code that we can take as assumed for pseudocode. IMO, it > would be nice if we could come up with a formal, compact (one/line per > rule) representation of G.6 that could be pre-processed into wrapper > code that handles all the 0/Inf/NaN special-cases and then calls > FOO_finite() which has the hand-written code to handle "normal" cases. > As someone who spent 10+ years getting sqrtl(), cbrtl(), ccosh(), sinl(), cosl(), tanl(), etc into FreeBSD, I respectfully disagree with your take that it is just boilerplate. Getting this stuff right is much harder than I think some people understand. Oh well, I'll back to lurking and working on things I need. 
-- Steve From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 22:57:40 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 54F7F106564A for ; Sun, 12 Aug 2012 22:57:40 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id CADF68FC18 for ; Sun, 12 Aug 2012 22:57:39 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CMvc99075500 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 08:57:38 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CMvWKd020879 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 08:57:32 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CMvW9S020878 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 08:57:32 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 08:57:32 +1000 Resent-Message-ID: <20120812225732.GG20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org From: Peter Jeremy Mail-Followup-To: freebsd-numerics@freebsd.org To: Steve Kargl Message-ID: <20120717225328.GA86902@server.rulingia.com> References: <9EB2DA4F-19D7-4BA5-8811-D9451CB1D907@theravensnest.org> <20120713155805.GC81965@zim.MIT.EDU> <20120714120432.GA70706@server.rulingia.com> <20120717084457.U3890@besplex.bde.org> <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717040118.GA86840@troutmask.apl.washington.edu> 
<20120717042125.GF66913@server.rulingia.com> <20120717043848.GB87001@troutmask.apl.washington.edu> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="JP+T4n/bALQSJXh8" Content-Disposition: inline In-Reply-To: <20120717043848.GB87001@troutmask.apl.washington.edu> X-PGP-Key: http://www.rulingia.com/keys/peter.pgp User-Agent: Mutt/1.5.21 (2010-09-15) X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: Diane Bruce , John Baldwin , David Chisnall , Stephen Montgomery-Smith , Bruce Evans , Bruce Evans , David Schultz , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 22:57:40 -0000 X-Original-Date: Wed, 18 Jul 2012 08:53:28 +1000 X-List-Received-Date: Sun, 12 Aug 2012 22:57:40 -0000 --JP+T4n/bALQSJXh8 Content-Type: multipart/mixed; boundary="0OAP2g/MAC+5xKAE" Content-Disposition: inline --0OAP2g/MAC+5xKAE Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On 2012-Jul-16 21:38:48 -0700, Steve Kargl wrote: >On Tue, Jul 17, 2012 at 02:21:25PM +1000, Peter Jeremy wrote: >> Again, handling the special cases listed in G.6 is all just >> boilerplate code that we can take as assumed for pseudocode. IMO, it > >As someone who spent 10+ years getting sqrtl(), cbrtl(), ccosh(), sinl(), >cosl(), tanl(), etc into FreeBSD, I respectfully disagree with your >take that it is just boilerplate. Getting this stuff right is much harder >than I think some people understand. Oh well, I'll back to lurking and >working on things I need. OK. I'll admit that I haven't tried this before but attached is my first try at catan[h](). 
It should cover all the special casing according to WG14/N1256 and compiles if you add the relevant declarations to complex.h. I'd appreciate feedback. Notes on it: - The actual code at the end of catanh() is just one possible algorithm. It's not intended as final code and the final code is likely to need additional special case handling to minimise precision loss and prevent unwanted exceptions. - cpack(-cimag(r), -creal(r)) gives better code than -cpack(cimag(r), creal(r)) on i386 and identical code on amd64. - The fpclassify() macros are bitmasks on FreeBSD. Assuming this would allow (ci == FP_ZERO || ci == FP_NAN) to be simplified but the standard only requires that they have distinct values and (eg) Solaris implements them as a series so the current code is more portable. -- Peter Jeremy --0OAP2g/MAC+5xKAE-- --JP+T4n/bALQSJXh8 Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (FreeBSD) iEYEARECAAYFAlAF7OgACgkQ/opHv/APuIfXQACgiDp/QmvWDA0PLVQTSKJ2EwO+ XE0An1fLjJbEROoRhdKA94UapHEg4rn6 =L/Yi -----END PGP SIGNATURE----- --JP+T4n/bALQSJXh8-- From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 22:57:45 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 8DCAA1065675 for ; Sun, 12 Aug 2012 22:57:45 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id EEDBD8FC1D for ; Sun, 12 Aug 2012 22:57:44 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CMviXl075503 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 08:57:44 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com
(localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CMvc3F020885 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 08:57:38 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CMvcqD020884 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 08:57:38 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 08:57:38 +1000 Resent-Message-ID: <20120812225738.GH20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6HNRjuA088033 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Wed, 18 Jul 2012 09:27:46 +1000 (EST) (envelope-from sgk@troutmask.apl.washington.edu) Received: from troutmask.apl.washington.edu (troutmask.apl.washington.edu [128.95.76.21]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q6HNRh0G070533 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Wed, 18 Jul 2012 09:27:45 +1000 (EST) (envelope-from sgk@troutmask.apl.washington.edu) Received: from troutmask.apl.washington.edu (localhost.apl.washington.edu [127.0.0.1]) by troutmask.apl.washington.edu (8.14.5/8.14.5) with ESMTP id q6HNRfrV095081; Tue, 17 Jul 2012 16:27:41 -0700 (PDT) (envelope-from sgk@troutmask.apl.washington.edu) Received: (from sgk@localhost) by troutmask.apl.washington.edu (8.14.5/8.14.5/Submit) id q6HNReZG095080; Tue, 17 Jul 2012 16:27:40 -0700 (PDT) (envelope-from sgk) From: Steve Kargl Mail-Followup-To: freebsd-numerics@freebsd.org To: Peter Jeremy Message-ID: <20120717232740.GA95026@troutmask.apl.washington.edu> References: <20120713155805.GC81965@zim.MIT.EDU> <20120714120432.GA70706@server.rulingia.com> <20120717084457.U3890@besplex.bde.org> <5004A5C7.1040405@missouri.edu> 
<5004DEA9.1050001@missouri.edu> <20120717040118.GA86840@troutmask.apl.washington.edu> <20120717042125.GF66913@server.rulingia.com> <20120717043848.GB87001@troutmask.apl.washington.edu> <20120717225328.GA86902@server.rulingia.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20120717225328.GA86902@server.rulingia.com> User-Agent: Mutt/1.4.2.3i Cc: Diane Bruce , John Baldwin , David Chisnall , Stephen Montgomery-Smith , Bruce Evans , Bruce Evans , David Schultz , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 22:57:45 -0000 X-Original-Date: Tue, 17 Jul 2012 16:27:40 -0700 X-List-Received-Date: Sun, 12 Aug 2012 22:57:45 -0000 On Wed, Jul 18, 2012 at 08:53:28AM +1000, Peter Jeremy wrote: > On 2012-Jul-16 21:38:48 -0700, Steve Kargl wrote: > >On Tue, Jul 17, 2012 at 02:21:25PM +1000, Peter Jeremy wrote: > >> Again, handling the special cases listed in G.6 is all just > >> boilerplate code that we can take as assumed for pseudocode. IMO, it > > > >As someone who spent 10+ years getting sqrtl(), cbrtl(), ccosh(), sinl(), > >cosl(), tanl(), etc into FreeBSD, I respectfully disagree with your > >take that it is just boilerplate. Getting this stuff right is much harder > >than I think some people understand. Oh well, I'll back to lurking and > >working on things I need. > > OK. I'll admit that I haven't tried this before but attached is my > first try at catan[h](). It should cover all the special casing > according to WG14/N1256 and compiles if you add the relevant > declarations to complex.h. I'd appreciate feedback. > > Notes on it: > - The actual code at the end of catanh() is just one possible > algorithm. 
It's not intended as final code and the final code is > likely to need additional special case handling to minimise > precision loss and prevent unwanted exceptions. > - cpack(-cimag(r), -creal(r)) gives better code than > -cpack(cimag(r), creal(r)) on i386 and identical code on amd64. > - The fpclassify() macros are bitmasks on FreeBSD. Assuming this > would allow (ci == FP_ZERO || ci == FP_NAN) to be simplified but the > standard only requires that they have distinct values and (eg) > Solaris implements them as a series so the current code is more > portable. > I won't have time to go over the code in detail until this weekend, but a quick peek showed some issues. The first is style. Although fdlibm has a rather interesting coding style, new code should use KNF. /* * Calculate complex arc tangent using the identity: * catan(z) = -i catanh(iz) */ double complex catan(double complex z) { complex double r; r = catanh(cpack(cimag(z), creal(z))); I think you're missing a sign. Let z = x + i*y. Then, i*z = i*x+i*i*y = -y + i*x, yielding r = catanh(cpack(-cimag(z), creal(z))); return (cpack(-cimag(r), -creal(r))); Again, it seems a sign error has occurred. Let catanh(i*z) = u + i*v. Then, you have -i*catanh(i*z) = -i*u-i*i*v = v-i*u, yielding return (cpack(cimag(r), -creal(r))); } Of course, I could be wrong.
-- Steve From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 22:57:51 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 3780D1065675 for ; Sun, 12 Aug 2012 22:57:51 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id 978E98FC25 for ; Sun, 12 Aug 2012 22:57:50 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CMvobC075506 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 08:57:50 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CMvij0020892 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 08:57:44 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CMviJ0020891 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 08:57:44 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 08:57:44 +1000 Resent-Message-ID: <20120812225744.GI20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org From: Peter Jeremy Mail-Followup-To: freebsd-numerics@freebsd.org To: Steve Kargl Message-ID: <20120718001337.GA87817@server.rulingia.com> References: <20120713155805.GC81965@zim.MIT.EDU> <20120714120432.GA70706@server.rulingia.com> <20120717084457.U3890@besplex.bde.org> <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717040118.GA86840@troutmask.apl.washington.edu> <20120717042125.GF66913@server.rulingia.com> 
<20120717043848.GB87001@troutmask.apl.washington.edu> <20120717225328.GA86902@server.rulingia.com> <20120717232740.GA95026@troutmask.apl.washington.edu> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="oC1+HKm2/end4ao3" Content-Disposition: inline In-Reply-To: <20120717232740.GA95026@troutmask.apl.washington.edu> X-PGP-Key: http://www.rulingia.com/keys/peter.pgp User-Agent: Mutt/1.5.21 (2010-09-15) X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: Diane Bruce , John Baldwin , David Chisnall , Stephen Montgomery-Smith , Bruce Evans , Bruce Evans , David Schultz , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 22:57:51 -0000 X-Original-Date: Wed, 18 Jul 2012 10:13:37 +1000 X-List-Received-Date: Sun, 12 Aug 2012 22:57:51 -0000 --oC1+HKm2/end4ao3 Content-Type: multipart/mixed; boundary="TB36FDmn/VVEgNH/" Content-Disposition: inline --TB36FDmn/VVEgNH/ Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On 2012-Jul-17 16:27:40 -0700, Steve Kargl wrote: >I won't have time to go over the code in detail until >this weekend, but a quick peek showed some issues. The >first is style. Although fdlibm has a rather interesting >coding style, new code should use KNF. I hope that was only the function declaration lines. I think the rest is KNF. >I think you're missing a sign. Let z = x + i*y. >Then, i*z = i*x+i*i*y = -y + i*x, yielding Yes, I think you're right in both cases. I wasn't thinking clearly. (Something to automatically generate this sort of code would mean you only need to write it once). Try the attached (also at http://www.rulingia.com/~peter/catan.c ).
And I'm aware it has UTF-8 in the comments. I can't quickly find anything on character sets in style(9) and would appreciate a ruling on it. -- Peter Jeremy --TB36FDmn/VVEgNH/-- --oC1+HKm2/end4ao3 Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (FreeBSD) iEYEARECAAYFAlAF/7EACgkQ/opHv/APuIdEVACfdN6+ABauEDyKCXwxmnVKy0IK SN8An0vC0SpUmF59NCBlUlInGFLEPzB4 =pTWJ -----END PGP SIGNATURE----- --oC1+HKm2/end4ao3-- From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 22:57:56 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 9F2751065674 for ; Sun, 12 Aug 2012 22:57:56 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id 382948FC0C for ; Sun, 12 Aug 2012 22:57:56 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CMvuMB075510 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 08:57:56 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CMvn5v020899 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 08:57:49 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CMvnlu020898 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 08:57:49 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 08:57:49 +1000 Resent-Message-ID: <20120812225749.GJ20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com
(host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6I1gtMU089283 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Wed, 18 Jul 2012 11:42:56 +1000 (EST) (envelope-from sgk@troutmask.apl.washington.edu) Received: from troutmask.apl.washington.edu (troutmask.apl.washington.edu [128.95.76.21]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q6I1gra8070895 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Wed, 18 Jul 2012 11:42:54 +1000 (EST) (envelope-from sgk@troutmask.apl.washington.edu) Received: from troutmask.apl.washington.edu (localhost.apl.washington.edu [127.0.0.1]) by troutmask.apl.washington.edu (8.14.5/8.14.5) with ESMTP id q6I1gpWl095622; Tue, 17 Jul 2012 18:42:51 -0700 (PDT) (envelope-from sgk@troutmask.apl.washington.edu) Received: (from sgk@localhost) by troutmask.apl.washington.edu (8.14.5/8.14.5/Submit) id q6I1goO4095621; Tue, 17 Jul 2012 18:42:50 -0700 (PDT) (envelope-from sgk) From: Steve Kargl Mail-Followup-To: freebsd-numerics@freebsd.org To: Peter Jeremy Message-ID: <20120718014250.GA95603@troutmask.apl.washington.edu> References: <20120714120432.GA70706@server.rulingia.com> <20120717084457.U3890@besplex.bde.org> <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717040118.GA86840@troutmask.apl.washington.edu> <20120717042125.GF66913@server.rulingia.com> <20120717043848.GB87001@troutmask.apl.washington.edu> <20120717225328.GA86902@server.rulingia.com> <20120717232740.GA95026@troutmask.apl.washington.edu> <20120718001337.GA87817@server.rulingia.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20120718001337.GA87817@server.rulingia.com> User-Agent: Mutt/1.4.2.3i Cc: Diane Bruce , John Baldwin , David Chisnall , Stephen Montgomery-Smith , Bruce Evans , Bruce Evans , David Schultz , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 
X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 22:57:56 -0000 X-Original-Date: Tue, 17 Jul 2012 18:42:50 -0700 X-List-Received-Date: Sun, 12 Aug 2012 22:57:56 -0000 On Wed, Jul 18, 2012 at 10:13:37AM +1000, Peter Jeremy wrote: > On 2012-Jul-17 16:27:40 -0700, Steve Kargl wrote: > >I won't have time to go over the code in detail until > >this weekend, but a quick peek showed some issues. The > >first is style. Although fdlibm has a rather interesting > >coding style, new code should use KNF. > > I hope that was only the function declaration lines. I think > the rest is KNF. Comments were wrong, too. > >I think you're missing a sign. Let z = x + i*y. > >Then, i*z = i*x+i*i*y = -y + i*x, yielding > > Yes, I think you're right in both cases. I wasn't thinking clearly. > (Something to automatically generate this sort of code would mean you > only need to write it once). > > Try the attached (also at http://www.rulingia.com/~peter/catan.c ). > And I'm aware it has UTF-8 in the comments. I can't quickly find > anything on character sets in style(9) and would appreciate a > ruling on it. I doubt style(9) says anything about UTF-8 as style(9) has been around a long time. Can you inline the code after your sig? It is easier to review and comment on; otherwise, I have to save the attachment, open it in a window, and copy-n-paste. It may also be prudent to start a new thread as this subthread may get lost. The question is where. I'm not subscribed to hackers and Bruce is not subscribed to current.
-- Steve From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 22:58:14 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id E74891065675 for ; Sun, 12 Aug 2012 22:58:14 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id 6BD708FC1A for ; Sun, 12 Aug 2012 22:58:14 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CMwEMN075523 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 08:58:14 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CMw7s9020926 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 08:58:07 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CMw7WE020925 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 08:58:07 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 08:58:07 +1000 Resent-Message-ID: <20120812225807.GL20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org From: Peter Jeremy Mail-Followup-To: freebsd-numerics@freebsd.org To: Bruce Evans Message-ID: <20120719213944.GA21199@server.rulingia.com> References: <20120717084457.U3890@besplex.bde.org> <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717040118.GA86840@troutmask.apl.washington.edu> <20120717042125.GF66913@server.rulingia.com> <20120717043848.GB87001@troutmask.apl.washington.edu> <20120717225328.GA86902@server.rulingia.com> 
<20120717232740.GA95026@troutmask.apl.washington.edu> <20120718001337.GA87817@server.rulingia.com> <20120718123627.D1575@besplex.bde.org> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="tThc/1wpZn/ma/RB" Content-Disposition: inline In-Reply-To: <20120718123627.D1575@besplex.bde.org> X-PGP-Key: http://www.rulingia.com/keys/peter.pgp User-Agent: Mutt/1.5.21 (2010-09-15) Cc: Diane Bruce , John Baldwin , David Chisnall , Stephen Montgomery-Smith , Bruce Evans , Steve Kargl , David Schultz , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 22:58:15 -0000 X-Original-Date: Fri, 20 Jul 2012 07:39:44 +1000 X-List-Received-Date: Sun, 12 Aug 2012 22:58:15 -0000 --tThc/1wpZn/ma/RB Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On 2012-Jul-18 14:01:42 +1000, Bruce Evans wrote: >Another style point visible in this comment is how to write 'i' and >multiplication. Multiplication by juxtaposition (iz) doesn't work >near C code with long identifiers which might be named iz. It probably >requires all variable names to be 1 letter in a special font for this >use. I know phk@ believes we (programmers in general) should take advantage of not being tied to KSR33's any longer and use more descriptive symbols. 
Unicode offers: U+2111 "ℑ" imaginary part U+2148 "ⅈ" DOUBLE-STRUCK ITALIC SMALL I * sometimes used for the imaginary unit U+2149 "ⅉ" DOUBLE-STRUCK ITALIC SMALL J * sometimes used for the imaginary unit but all of these look fairly ugly in the "fixed" font I'm using, as does U+221E "∞" INFINITY and I can't find anything that would represent NaN. > I tried to use "I z" consistently in comments in c_ccosh*.c and >to get everyone to follow this convention, but there are already some >inconsistencies, and I now wonder if "z I" is better. The pari >presentation uses "*" and puts "I" last, and uses spaces for "+" but >not for "*" (e.g., "1 + 2*I"). And, thinking about it, I tend to say/think/write "i" as a suffix for literal numeric constants ("three i") but as a prefix for named constants ("i pi"), functions ("i sin theta") etc. Note that the latter is fairly unambiguous written or spoken whereas "sin theta i" needs explicit parentheses and/or operators to disambiguate it. Overall, my preference would be "x + I y" or "x + I*y". >The standard classification macros are good for developing things, but >they are very slow. All (?) committed complex functions use hard-coded >bit test. I notice that the functions are full of hard-coded magic constants. Would these be better as macros to: 1) Provide a description as to their purpose; and 2) Reduce differences between the float/long/double function bodies? > These are almost as easy to write as the classification macros. But not quite as clear to follow. I've also noticed that ccosh() puts the (hopefully more common) case where the argument is finite first - which should make it faster for typical use, whereas I wrote catanh() by peeling off all the exceptional conditions first so the code falls through to the "normal" case at the end. The approach I used is easier to write (and probably visually verify) but typically slower so I'll rearrange things along the lines of ccosh().
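To make the "named macros instead of magic constants" and "finite case first" points concrete, here is a minimal sketch of a hard-coded bit test with the constants given names. It assumes IEEE 754 binary64 doubles, and every identifier in it (DBL_HI_ABS_MASK, DBL_HI_EXP_MAX, high_word, is_inf_or_nan, both_finite) is made up for illustration rather than taken from msun.

```c
#include <math.h>	/* only so callers can use INFINITY and NAN */
#include <stdint.h>
#include <string.h>

/* Assumed names for the usual magic constants (IEEE 754 binary64). */
#define	DBL_HI_ABS_MASK	0x7fffffffU	/* clears the sign bit of the high word */
#define	DBL_HI_EXP_MAX	0x7ff00000U	/* exponent all ones: Inf or NaN */

/* Top 32 bits of the representation: sign, exponent, high mantissa bits. */
static uint32_t
high_word(double x)
{
	uint64_t bits;

	memcpy(&bits, &x, sizeof(bits));
	return ((uint32_t)(bits >> 32));
}

static int
is_inf_or_nan(double x)
{
	return ((high_word(x) & DBL_HI_ABS_MASK) >= DBL_HI_EXP_MAX);
}

/* Fast path test: the common all-finite case costs two integer compares. */
static int
both_finite(double x, double y)
{
	return (!is_inf_or_nan(x) && !is_inf_or_nan(y));
}
```

A function laid out along the lines described for ccosh() would test both_finite() first and return from the ordinary computation, leaving the Inf/NaN handling for the tail. Float and long double variants would need different masks and a different way of extracting the exponent, which is where named macros could narrow the differences between the function bodies.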
>% 	/*
>% 	 * catanh(+0 + i0) returns +0 + i0.
>% 	 * catanh(+0 + iNaN) returns +0 + iNaN.
>% 	 */
>
>This looks like the description in C99.  ccosh.c uses something like:

Most of the comments were cut from n1256.pdf and cleaned up a bit.

>This is arcane and I probably got it wrong in many cases.  My hope was
>that someday all of these comments could be turned into meta-info that
>is used to generate test vectors and assertions and maybe man pages.
>They don't belong in the code.  But to generate test vectors and
>assertions, they need to be very formal and correct.  For man pages,
>I think I prefer to hard-code the documentation but test that it agrees
>with the meta-info.

OTOH, I was hoping to write the descriptions formally and generate
the code from them but that is also a difficult task.

>% 	if (cr == FP_NAN) {
>% 		/*
>% 		 * catanh(NaN + iInf) returns ±0 + iπ/2
>% 		 * the sign of the real part of the result is not
>% 		 * specified by the standard so return +0.
>% 		 */
>
>The UTF is similar to in C99 where it is used for the "+-" and "infinity"
>symbols.  It messes up n869.txt too (C and POSIX working group translations
>to text are poor.  IIRC, "+-" gets mangled to "+", and "infinity" gets
>mangled to "0").
>
>Why Inf for the arg and not for the result?

I don't understand this comment.  According to n1256.pdf, there's only
one case where an infinity is returned (1 + I*0) and I do have "Inf"
in that comment.

>Here are my current fixes for committed versions of complex functions.

I'll take these comments into account where applicable to catanh() but
committing (or PR'ing) them is a separate issue that I'll leave for
now.
--=20 Peter Jeremy --tThc/1wpZn/ma/RB Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (FreeBSD) iEYEARECAAYFAlAIfqAACgkQ/opHv/APuIc9lACeNcJDUHATpKBpRviMHhGX5t5s WJ8AoIUZsst5GfrZKcg79S5BA2nQ/LbC =keU6 -----END PGP SIGNATURE----- --tThc/1wpZn/ma/RB-- From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 22:58:22 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 31B1E106564A for ; Sun, 12 Aug 2012 22:58:22 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id BE8F98FC14 for ; Sun, 12 Aug 2012 22:58:21 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CMwLUw075527 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 08:58:21 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CMwFMx020935 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 08:58:15 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CMwF1S020934 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 08:58:15 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 08:58:15 +1000 Resent-Message-ID: <20120812225815.GM20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6JNiUOk022480 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA 
bits=256 verify=OK) for ; Fri, 20 Jul 2012 09:44:31 +1000 (EST) (envelope-from sgk@troutmask.apl.washington.edu) Received: from troutmask.apl.washington.edu (troutmask.apl.washington.edu [128.95.76.21]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q6JNiSej084612 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Fri, 20 Jul 2012 09:44:30 +1000 (EST) (envelope-from sgk@troutmask.apl.washington.edu) Received: from troutmask.apl.washington.edu (localhost.apl.washington.edu [127.0.0.1]) by troutmask.apl.washington.edu (8.14.5/8.14.5) with ESMTP id q6JNiQIL006309; Thu, 19 Jul 2012 16:44:26 -0700 (PDT) (envelope-from sgk@troutmask.apl.washington.edu) Received: (from sgk@localhost) by troutmask.apl.washington.edu (8.14.5/8.14.5/Submit) id q6JNiPHO006308; Thu, 19 Jul 2012 16:44:25 -0700 (PDT) (envelope-from sgk) From: Steve Kargl Mail-Followup-To: freebsd-numerics@freebsd.org To: Peter Jeremy Message-ID: <20120719234425.GA6280@troutmask.apl.washington.edu> References: <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717040118.GA86840@troutmask.apl.washington.edu> <20120717042125.GF66913@server.rulingia.com> <20120717043848.GB87001@troutmask.apl.washington.edu> <20120717225328.GA86902@server.rulingia.com> <20120717232740.GA95026@troutmask.apl.washington.edu> <20120718001337.GA87817@server.rulingia.com> <20120718123627.D1575@besplex.bde.org> <20120719213944.GA21199@server.rulingia.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20120719213944.GA21199@server.rulingia.com> User-Agent: Mutt/1.4.2.3i Cc: Diane Bruce , John Baldwin , David Chisnall , Stephen Montgomery-Smith , Bruce Evans , Bruce Evans , David Schultz , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 22:58:22 -0000 X-Original-Date: Thu, 19 Jul 2012 16:44:25 -0700 X-List-Received-Date: Sun, 12 Aug 2012 22:58:22 -0000

On Fri, Jul 20, 2012 at 07:39:44AM +1000, Peter Jeremy wrote:
> On 2012-Jul-18 14:01:42 +1000, Bruce Evans wrote:
>
> >The standard classification macros are good for developing things, but
> >they are very slow.  All (?) committed complex functions use hard-coded
> >bit test.
>
> I notice that the functions are full of hard-coded magic constants.
> Would these be better as macros to:
> 1) Provide a description as to their purpose; and
> 2) Reduce differences between the float/long/double function bodies?
>

I collected some of the float and double into a cheat sheet.

Idioms used in libm with float type:

	int32_t xsb;
	u_int32_t hx;

	GET_FLOAT_WORD(hx, x);

	/* Get the sign bit of x */
	xsb = (hx >> 31) & 1;

	/* high word of |x| */
	hx &= 0x7fffffff;

	/* NaN */
	if (hx > 0x7f800000)
		return (x + x);

	/* exp(+-inf) = {inf, 0} */
	if (hx == 0x7f800000)
		return (xsb == 0) ? x : 0.0;

	/* subnormal */
	if (hx < 0x00800000)

Idioms used in libm with double type:

	u_int32_t hx, lx, xsb;

	EXTRACT_WORDS(hx, lx, x);

	/* sign bit of x */
	xsb = (hx >> 31) & 1;

	/* high word of |x| */
	hx &= 0x7fffffff;

	/* subnormal */
	if (hx < 0x00100000)

	/* Test for NaN and +-Inf. */
	if (hx >= 0x7ff00000) {
		/* Is it a NaN? */
		if (((hx & 0xfffff) | lx) != 0)
			return (x + x);
		/* It's an +-Inf. */
		return ((xsb == 0) ? x : 0.0);	/* exp(+-inf)={inf,0} */
	}

-- 
Steve

From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 22:58:35 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 09727106564A for ; Sun, 12 Aug 2012 22:58:35 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id 96B518FC14 for ; Sun, 12 Aug 2012 22:58:34 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CMwY8W075534 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 08:58:34 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CMwSDk020953 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 08:58:28 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CMwS4j020952 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 08:58:28 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 08:58:28 +1000 Resent-Message-ID: <20120812225828.GO20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6K4U9Tk025066 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Fri, 20 Jul 2012 14:30:10 +1000 (EST) (envelope-from sgk@troutmask.apl.washington.edu) Received: from troutmask.apl.washington.edu (troutmask.apl.washington.edu [128.95.76.21]) by vps.rulingia.com
(8.14.5/8.14.5) with ESMTP id q6K4U6Lg085380 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Fri, 20 Jul 2012 14:30:08 +1000 (EST) (envelope-from sgk@troutmask.apl.washington.edu) Received: from troutmask.apl.washington.edu (localhost.apl.washington.edu [127.0.0.1]) by troutmask.apl.washington.edu (8.14.5/8.14.5) with ESMTP id q6K4U5gK007441; Thu, 19 Jul 2012 21:30:05 -0700 (PDT) (envelope-from sgk@troutmask.apl.washington.edu) Received: (from sgk@localhost) by troutmask.apl.washington.edu (8.14.5/8.14.5/Submit) id q6K4U4B9007440; Thu, 19 Jul 2012 21:30:04 -0700 (PDT) (envelope-from sgk) From: Steve Kargl Mail-Followup-To: freebsd-numerics@freebsd.org To: Bruce Evans Message-ID: <20120720043004.GA7404@troutmask.apl.washington.edu> References: <20120717040118.GA86840@troutmask.apl.washington.edu> <20120717042125.GF66913@server.rulingia.com> <20120717043848.GB87001@troutmask.apl.washington.edu> <20120717225328.GA86902@server.rulingia.com> <20120717232740.GA95026@troutmask.apl.washington.edu> <20120718001337.GA87817@server.rulingia.com> <20120718123627.D1575@besplex.bde.org> <20120719213944.GA21199@server.rulingia.com> <20120719234425.GA6280@troutmask.apl.washington.edu> <20120720130309.P814@besplex.bde.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20120720130309.P814@besplex.bde.org> User-Agent: Mutt/1.4.2.3i Cc: Diane Bruce , John Baldwin , David Chisnall , Stephen Montgomery-Smith , Bruce Evans , David Schultz , Peter Jeremy , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 22:58:35 -0000 X-Original-Date: Thu, 19 Jul 2012 21:30:04 -0700 X-List-Received-Date: Sun, 12 Aug 2012 22:58:35 -0000

On Fri, Jul 20, 2012 at 02:12:20PM +1000, Bruce Evans wrote:
> On Thu, 19 Jul 2012, Steve Kargl wrote:
>
> >I collected some of the float and double into a cheat sheet.
> >
> >Idioms used in libm with float type:
> >
> >  int32_t xsb;
> >  u_int32_t hx;
> >
> >  GET_FLOAT_WORD(hx, x);
> >
> >  /* Get the sign bit of x */
> >  xsb = (hx >> 31) & 1;
>
> Getting it without shifting it (hx & 0x80000000) is more efficient and common.
> You don't need to shift it in this example.

I collected these from msun/src, when I was trying to understand
the magic numbers.  I suppose someone should audit the code for
consistency. :-)

laptop:kargl[219] grep " (hx>>31)&1" *c
e_exp.c:	xsb = (hx>>31)&1;		/* sign bit of x */
e_expf.c:	xsb = (hx>>31)&1;		/* sign bit of x */

-- 
Steve

From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:00:18 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 181EB106566B for ; Sun, 12 Aug 2012 23:00:18 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id B25DE8FC18 for ; Sun, 12 Aug 2012 23:00:17 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN0HiY075565 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:00:17 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN0BCn021024 (version=TLSv1/SSLv3
cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:00:11 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CN0BIM021023 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:00:11 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:00:11 +1000 Resent-Message-ID: <20120812230011.GR20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org From: Peter Jeremy Mail-Followup-To: freebsd-numerics@freebsd.org To: Bruce Evans Message-ID: <20120721003103.GA73662@server.rulingia.com> References: <20120717084457.U3890@besplex.bde.org> <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717040118.GA86840@troutmask.apl.washington.edu> <20120717042125.GF66913@server.rulingia.com> <20120717043848.GB87001@troutmask.apl.washington.edu> <20120717225328.GA86902@server.rulingia.com> <20120717232740.GA95026@troutmask.apl.washington.edu> <20120718001337.GA87817@server.rulingia.com> <20120718123627.D1575@besplex.bde.org> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="3V7upXqbjpZ4EhLz" Content-Disposition: inline In-Reply-To: <20120718123627.D1575@besplex.bde.org> X-PGP-Key: http://www.rulingia.com/keys/peter.pgp User-Agent: Mutt/1.5.21 (2010-09-15) Cc: Diane Bruce , John Baldwin , David Chisnall , Stephen Montgomery-Smith , Bruce Evans , Steve Kargl , David Schultz , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:00:18 -0000 X-Original-Date: Sat, 21 Jul 2012 10:31:03 +1000 X-List-Received-Date: Sun, 12 Aug 2012 23:00:18 -0000 --3V7upXqbjpZ4EhLz Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable

Hi Bruce or das@ or Steve,

I have a question on the following code from s_ccosh.c:

% 	/*
% 	 * cosh(NaN + I NaN)  = d(NaN) + I d(NaN).
% 	 *
% 	 * cosh(NaN +- I Inf) = d(NaN) + I d(NaN).
% 	 * Optionally raises the invalid floating-point exception.
% 	 * Choice = raise.
% 	 *
% 	 * cosh(NaN + I y)    = d(NaN) + I d(NaN).
% 	 * Optionally raises the invalid floating-point exception for finite
% 	 * nonzero y.  Choice = don't raise (except for signaling NaNs).
% 	 */
% 	return (cpack((x * x) * (y - y), (x + x) * (y - y)));

x is always NaN so the real part presumably just needs to be quietened
before returning - ie (x + x) would seem to be sufficient.  Why does
the code use ((x * x) * (y - y))?

y has no restriction on its value so an arithmetic operation with x is
a good way to convert it to a NaN.  Wouldn't (y + x) be sufficient?

My understanding is that:
- Addition is generally faster than multiplication
- Signs are irrelevant for NaN so merging the sign of x doesn't matter.
- NaN + NaN returns the (quietened?) left-hand NaN
- Inf + NaN returns the (quietened?) right-hand NaN
- finite + NaN returns the (quietened?) right-hand NaN

Also, whilst things like ((x + x) * (y - y)) are reasonably efficient
on x86, few (if any) RISC architectures support exceptional conditions
in hardware.  My understanding is that SPARC would trap back into the
userland handler (lib/libc/sparc64/fpu) on each operation unless both
arguments and the result are normalised numbers.  Explicitly fiddling
with the FPU state would seem faster than multiple traps.
--=20 Peter Jeremy --3V7upXqbjpZ4EhLz Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (FreeBSD) iEYEARECAAYFAlAJ+EcACgkQ/opHv/APuIdyMwCfZAIdShOhLfYvF85audLwPDXc qv4An0zDk2V7gAxY9LrM1MHccIcgzeo9 =OkzB -----END PGP SIGNATURE----- --3V7upXqbjpZ4EhLz-- From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:00:40 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id DBC6E1065677 for ; Sun, 12 Aug 2012 23:00:40 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id 45EAD8FC12 for ; Sun, 12 Aug 2012 23:00:40 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN0evV075571 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:00:40 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN0XdK021040 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:00:33 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CN0XNd021039 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:00:33 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:00:33 +1000 Resent-Message-ID: <20120812230033.GT20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org From: Peter Jeremy Mail-Followup-To: freebsd-numerics@freebsd.org To: Bruce Evans Message-ID: <20120722121219.GC73662@server.rulingia.com> References: <20120717084457.U3890@besplex.bde.org> 
<5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717040118.GA86840@troutmask.apl.washington.edu> <20120717042125.GF66913@server.rulingia.com> <20120717043848.GB87001@troutmask.apl.washington.edu> <20120717225328.GA86902@server.rulingia.com> <20120717232740.GA95026@troutmask.apl.washington.edu> <20120718001337.GA87817@server.rulingia.com> <20120718123627.D1575@besplex.bde.org> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="E13BgyNx05feLLmH" Content-Disposition: inline In-Reply-To: <20120718123627.D1575@besplex.bde.org> X-PGP-Key: http://www.rulingia.com/keys/peter.pgp User-Agent: Mutt/1.5.21 (2010-09-15) Cc: Diane Bruce , John Baldwin , David Chisnall , Stephen Montgomery-Smith , Bruce Evans , Steve Kargl , David Schultz , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:00:41 -0000 X-Original-Date: Sun, 22 Jul 2012 22:12:19 +1000 X-List-Received-Date: Sun, 12 Aug 2012 23:00:41 -0000 --E13BgyNx05feLLmH Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: quoted-printable

OK, here's my next try at the exception handling for catanh().  As
before, the real code needs expansion to prevent precision loss.

I have simplified the default (NaN + I*NaN) return from catanh() to
the minimum needed to ensure that both real & imaginary parts return as NaN.
I've been doing some experiments on mixing NaNs using x87, SSE, SPARC64
and ARM (last on Linux) and have come to the conclusion that there is
no standard behaviour:  Given x & y as NaNs, (x+y) can return either
x or y, possibly with the sign bit from the other operand, depending
on the FPU.

Inline, as requested by Steve:
--------------------
/*Copyright...*/
#include <sys/cdefs.h>
__FBSDID("$FreeBSD$");

#include <complex.h>
#include <math.h>

#include "math_private.h"

/*
 * Hyperbolic arc-tangent of a complex argument z = x + I*y.
 *
 * Exceptional values are noted in the comments within the source code.
 * These values and the associated return value were taken from WG14/N1256.
 * It (and the code comments) generally only refers to behaviour in the
 * quadrant where both input signs are positive.  This is extended to the
 * remaining quadrants by noting:
 * a) catanh(conj(z)) == conj(catanh(z))
 * b) catanh() is odd
 * therefore the result in each quadrant is the same with the signs of
 * each part copied from the input to the output.
 *
 * Special cases for catanh() based on WG14/N1256:
 *
 * y\x   Inf        0          1          x          NaN
 * Inf   0+I*π/2    0+I*π/2    0+I*π/2    0+I*π/2    0+I*π/2
 * NaN   0+I*NaN    0+I*NaN    NaN+I*NaN  NaN+I*NaN  NaN+I*NaN
 * 0     0+I*π/2    0+I*0      Inf+I*0    atanh(x)   NaN+I*NaN
 * y     0+I*π/2    I*atan(y)  [1]        [1]        NaN+I*NaN
 *
 * [1] clog((1+z)/(1-z))/2 or equivalent.
 */
double complex
catanh(double complex z)
{
	double x, y;		/* Real & imaginary parts of argument */
	int32_t hx, hy;		/* MSW of binary real & imaginary parts */
	int32_t ix, iy;		/* hx & hy without sign bits */
	int32_t lx, ly;		/* LSW of binary real & imaginary parts */

	x = creal(z);
	y = cimag(z);
	EXTRACT_WORDS(hx, lx, x);
	EXTRACT_WORDS(hy, ly, y);
	ix = 0x7fffffff & hx;
	iy = 0x7fffffff & hy;

	/* Handle pure real & imaginary cases */
	if ((iy | ly) == 0) {	/* Imaginary part 0 */
		/* z is real - return atanh(x) */
		return (cpack(__ieee754_atanh(x), y));
	}
	if ((ix | lx) == 0) {	/* Real part 0 */
		/* z is imaginary - return I*atan(y) */
		return (cpack(x, atan(y)));
	}

	/* Handle the mostly-non-exceptional cases where x and y are finite. */
	if (ix < 0x7ff00000 && iy < 0x7ff00000) {
		/* Following is possible algorithm, not final implementation */
		return ((clog(1.0 + z) - clog(1.0 - z)) / 2.0);
	}

	/* x and/or y are not finite */
	if (((iy - 0x7ff00000) | ly) == 0) {	/* y is Inf */
		/*
		 * catanh(NaN + I*Inf) = 0 + I*π/2.
		 * The sign of the real part of the result is not
		 * specified by the standard so always return same as x.
		 * catanh(Inf + I*Inf) = 0 + I*π/2.
		 * catanh(finite + I*Inf) = 0 + I*π/2.
		 */
		return (cpack(copysign(0.0, x), copysign(M_PI_2, y)));
	}

	if (((ix - 0x7ff00000) | lx) == 0) {	/* x is Inf */
		/* catanh(Inf + I*finite) = 0 + I*π/2 */
		if (iy < 0x7ff00000)	/* finite */
			return (cpack(copysign(0.0, x), copysign(M_PI_2, y)));
		/* catanh(Inf + I*NaN) = +0 + I*NaN */
		return (cpack(copysign(0.0, x), y+y));
	}

	/*
	 * At this point x and/or y are NaN and these all return NaN + I*NaN
	 *
	 * catanh(NaN + I*finite) = d(NaN) + I*dNaN
	 * catanh(NaN + I*NaN) = d(NaN) + I*d(NaN)
	 * catanh(finite + I*NaN) = dNaN + I*d(NaN)
	 *
	 * Raising "invalid" exception is optional.  Choice = don't
	 * raise, except for signalling NaNs.
	 */
	return(cpack(x + y, y + x));
}

/*
 * Arc-tangent of a complex argument z = x + I*y.
 *
 * catan(z) = -I * catanh(I * z)
 *
 */
double complex
catan(double complex z)
{
	double complex r;

	/* Manually multiply by I to avoid compiler deficiencies. */
	r = catanh(cpack(-cimag(z), creal(z)));
	return (cpack(cimag(r), -creal(r)));
}

-- 
Peter Jeremy

--E13BgyNx05feLLmH Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (FreeBSD) iEYEARECAAYFAlAL7iMACgkQ/opHv/APuIeydwCgr6tuTbwOgBsWgBvK+3bEa/AZ ovsAnjqXmugJGD+ByyBsIHPtRG1zL3pG =kKBH -----END PGP SIGNATURE----- --E13BgyNx05feLLmH-- From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:01:01 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id B06A0106566B for ; Sun, 12 Aug 2012 23:01:01 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id 442178FC15 for ; Sun, 12 Aug 2012 23:01:01 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN118J075578 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:01:01 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN0sjW021056 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:00:54 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CN0swK021055 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:00:54 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:00:54 +1000 Resent-Message-ID: <20120812230054.GU20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org From: Peter Jeremy Mail-Followup-To: freebsd-numerics@freebsd.org To: Bruce Evans Message-ID:
<20120722220031.GA7791@server.rulingia.com> References: <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717040118.GA86840@troutmask.apl.washington.edu> <20120717042125.GF66913@server.rulingia.com> <20120717043848.GB87001@troutmask.apl.washington.edu> <20120717225328.GA86902@server.rulingia.com> <20120717232740.GA95026@troutmask.apl.washington.edu> <20120718001337.GA87817@server.rulingia.com> <20120718123627.D1575@besplex.bde.org> <20120722121219.GC73662@server.rulingia.com> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="TB36FDmn/VVEgNH/" Content-Disposition: inline In-Reply-To: <20120722121219.GC73662@server.rulingia.com> X-PGP-Key: http://www.rulingia.com/keys/peter.pgp User-Agent: Mutt/1.5.21 (2010-09-15) Cc: Diane Bruce , John Baldwin , David Chisnall , Stephen Montgomery-Smith , Bruce Evans , Steve Kargl , David Schultz , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:01:01 -0000 X-Original-Date: Mon, 23 Jul 2012 08:00:31 +1000 X-List-Received-Date: Sun, 12 Aug 2012 23:01:01 -0000 --TB36FDmn/VVEgNH/ Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On 2012-Jul-22 22:12:19 +1000, Peter Jeremy wro= te: >A have simplified the default (NaN + I*NaN) return from catanh() to >the minimun to ensure that both real & imaginary parts return as NaN. >I've been doing some experiments on mixing NaNs using x87, SSE, SPARC64 >and ARM (last on Linux) and have come to the conclusion that there is >no standard behaviour: Given x & y as NaNs, (x+y) can return either >x or y, possibly with the sign bit from the other operand. 
depending >on the FPU. I've tried running my exception test program on Solaris/SPARC using SunStudio and it gives different results to FreeBSD/sparc64 in some cases so it looks like the FreeBSD/sparc64 exception handling code is also buggy. And, when the base gcc tries to shortcut floating point expressions and execute them at compile time, it also gets exception handling wrong in several cases (it'll correctly detect that a constant expression evaluates to Inf or NaN but, in many cases, the NaN it calculates is different to the x87 or SSE evaluation of the same arguments). --=20 Peter Jeremy --TB36FDmn/VVEgNH/ Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (FreeBSD) iEYEARECAAYFAlAMd/8ACgkQ/opHv/APuIcS0gCgsrcNiDSYJUv7BJ1suNBzj19v YzUAoJwG8BJIx0RIXszST3Xr3/dsczIf =FwOq -----END PGP SIGNATURE----- --TB36FDmn/VVEgNH/-- From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:01:32 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id CBEBA1065673 for ; Sun, 12 Aug 2012 23:01:32 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id 5F7928FC08 for ; Sun, 12 Aug 2012 23:01:31 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN1VUg075583 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:01:31 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN1PVh021068 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:01:25 +1000 (EST) (envelope-from 
peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CN1P8E021067 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:01:25 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:01:25 +1000 Resent-Message-ID: <20120812230125.GV20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org From: Peter Jeremy Mail-Followup-To: freebsd-numerics@freebsd.org To: Steve Kargl Message-ID: <20120722233134.GB8033@server.rulingia.com> References: <20120717042125.GF66913@server.rulingia.com> <20120717043848.GB87001@troutmask.apl.washington.edu> <20120717225328.GA86902@server.rulingia.com> <20120717232740.GA95026@troutmask.apl.washington.edu> <20120718001337.GA87817@server.rulingia.com> <20120718123627.D1575@besplex.bde.org> <20120722121219.GC73662@server.rulingia.com> <20120722220031.GA7791@server.rulingia.com> <20120722221614.GB53450@zim.MIT.EDU> <20120722231056.GA84338@troutmask.apl.washington.edu> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="DBIVS5p969aUjpLe" Content-Disposition: inline In-Reply-To: <20120722231056.GA84338@troutmask.apl.washington.edu> X-PGP-Key: http://www.rulingia.com/keys/peter.pgp User-Agent: Mutt/1.5.21 (2010-09-15) Cc: Diane Bruce , John Baldwin , David Chisnall , Stephen Montgomery-Smith , Bruce Evans , Bruce Evans , David Schultz , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:01:33 -0000 X-Original-Date: Mon, 23 Jul 2012 09:31:34 +1000 X-List-Received-Date: Sun, 12 Aug 2012 23:01:33 -0000 --DBIVS5p969aUjpLe Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable

On 2012-Jul-22 16:10:56 -0700, Steve Kargl wrote:
>The above isn't necessarily true.  The Fortran standards from
>2003 and 2008, very care about NaN.  Under certain conditions,
>if one has something like
>
>   x = sin(NaN)
>
>in Fortran, then the returned NaN must be the one in the function
>call.

Even if it was a SNaN?  My understanding is that SNaN should be
quietened if they are used in any further floating point operations.

> Having libm, do
>
>   if (x == NaN)
>      return (x + x);
>
>does/may not return the correct NaN.

I presume you mean
	if (isnan(x))
		return (x + x);

Do you have a test case that shows that?  As far as I can tell, all
the FPUs I have access to will return a quietened variant of the input
NaN in this case (ie, only the signalling bit is altered).
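For anyone wanting to try this on other hardware, a minimal sketch of
such a test case follows.  It assumes IEEE 754 binary64 doubles with the
quiet bit being the most significant mantissa bit (as on x86, ARM and
SPARC); make_nan(), bits_of() and doubling_keeps_payload() are
hypothetical helper names, not libm functions.  What the last helper
returns is exactly the FPU-dependent behaviour in question, so no result
is claimed here.

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

#define	NAN_EXP_BITS	0x7ff0000000000000ULL
#define	NAN_QUIET_BIT	0x0008000000000000ULL
#define	NAN_PAYLOAD	0x0007ffffffffffffULL
#define	NAN_SIGN_BIT	0x8000000000000000ULL

/* Return the raw bit pattern of d. */
static uint64_t
bits_of(double d)
{
	uint64_t bits;

	memcpy(&bits, &d, sizeof(bits));
	return (bits);
}

/*
 * Build a NaN with the given sign, quiet flag and payload.  A signalling
 * NaN (quiet == 0) needs a non-zero payload, otherwise the bits encode Inf.
 */
static double
make_nan(int negative, int quiet, uint64_t payload)
{
	uint64_t bits = NAN_EXP_BITS | (payload & NAN_PAYLOAD);
	double d;

	if (quiet)
		bits |= NAN_QUIET_BIT;
	if (negative)
		bits |= NAN_SIGN_BIT;
	memcpy(&d, &bits, sizeof(d));
	return (d);
}

/*
 * Compute x + x at run time and report whether the result is the input
 * NaN with at most the quiet bit changed.
 */
static int
doubling_keeps_payload(double x)
{
	volatile double v = x;	/* volatile defeats constant folding */
	double r = v + v;

	return (isnan(r) &&
	    (bits_of(r) | NAN_QUIET_BIT) == (bits_of(x) | NAN_QUIET_BIT));
}
```

Calling doubling_keeps_payload(make_nan(0, 0, 0x123)) and the same with
quiet set to 1, on each FPU of interest, would show whether only the
signalling bit is altered there.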
--=20 Peter Jeremy --DBIVS5p969aUjpLe Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (FreeBSD) iEYEARECAAYFAlAMjVYACgkQ/opHv/APuIcdBQCeIi9S9Yq+faBLz7W9dLuYbAVD O3cAn1psBIXlikmAG3Au8jyyhxwoGbEZ =D0AN -----END PGP SIGNATURE----- --DBIVS5p969aUjpLe-- From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:02:04 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 4B6F41065673 for ; Sun, 12 Aug 2012 23:02:04 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id C2B4E8FC0C for ; Sun, 12 Aug 2012 23:02:03 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN23Za075595 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:02:03 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN1vSY021093 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:01:57 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CN1vFj021092 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:01:57 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:01:57 +1000 Resent-Message-ID: <20120812230157.GY20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6N4v7VG011505 (version=TLSv1/SSLv3 
cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 23 Jul 2012 14:57:07 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q6N4v4d8010973 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 23 Jul 2012 14:57:07 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q6N4ujY2008938; Sun, 22 Jul 2012 23:56:46 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <500CD98E.9080103@missouri.edu> From: Stephen Montgomery-Smith Mail-Followup-To: freebsd-numerics@freebsd.org User-Agent: Mozilla/5.0 (X11; Linux i686; rv:14.0) Gecko/20120714 Thunderbird/14.0 MIME-Version: 1.0 To: Bruce Evans References: <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717040118.GA86840@troutmask.apl.washington.edu> <20120717042125.GF66913@server.rulingia.com> <20120717043848.GB87001@troutmask.apl.washington.edu> <20120717225328.GA86902@server.rulingia.com> <20120717232740.GA95026@troutmask.apl.washington.edu> <20120718001337.GA87817@server.rulingia.com> <20120718123627.D1575@besplex.bde.org> <20120722121219.GC73662@server.rulingia.com> <20120722220031.GA7791@server.rulingia.com> <20120723141319.P1189@besplex.bde.org> In-Reply-To: <20120723141319.P1189@besplex.bde.org> Content-Type: multipart/mixed; boundary="------------050007040009010508090702" X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: Diane Bruce , John Baldwin , David Chisnall , Bruce Evans , Steve Kargl , David Schultz , Peter Jeremy , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:02:04 -0000 X-Original-Date: Sun, 22 Jul 2012 23:56:46 -0500 X-List-Received-Date: Sun, 12 Aug 2012 23:02:04 -0000 This is a multi-part message in MIME format. --------------050007040009010508090702 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit This is the work I have done to produce casinh, casin, cacos and cacosh. The latter two took me a lot more time than I expected. It took me a lot of time to try to find the correct branches. --------------050007040009010508090702-- From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:02:09 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 2D56F1065674 for ; Sun, 12 Aug 2012 23:02:09 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id B52EE8FC16 for ; Sun, 12 Aug 2012 23:02:08 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN28fY075596 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:02:08 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN22CC021104 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:02:02 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CN22uN021103 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:02:02 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy 
Resent-Date: Mon, 13 Aug 2012 09:02:02 +1000 Resent-Message-ID: <20120812230202.GZ20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6NEbFXr030706 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Tue, 24 Jul 2012 00:37:16 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q6NEbCIP012562 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Tue, 24 Jul 2012 00:37:14 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q6NEaoEF046369; Mon, 23 Jul 2012 09:36:51 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <500D6182.8010003@missouri.edu> From: Stephen Montgomery-Smith Mail-Followup-To: freebsd-numerics@freebsd.org User-Agent: Mozilla/5.0 (X11; Linux i686; rv:14.0) Gecko/20120714 Thunderbird/14.0 MIME-Version: 1.0 To: Bruce Evans References: <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717040118.GA86840@troutmask.apl.washington.edu> <20120717042125.GF66913@server.rulingia.com> <20120717043848.GB87001@troutmask.apl.washington.edu> <20120717225328.GA86902@server.rulingia.com> <20120717232740.GA95026@troutmask.apl.washington.edu> <20120718001337.GA87817@server.rulingia.com> <20120718123627.D1575@besplex.bde.org> <20120722121219.GC73662@server.rulingia.com> <20120722220031.GA7791@server.rulingia.com> <20120723141319.P1189@besplex.bde.org> <500CD98E.9080103@missouri.edu> In-Reply-To: <500CD98E.9080103@missouri.edu> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: Diane Bruce , John Baldwin , David Chisnall , Bruce Evans , 
Steve Kargl , David Schultz , Peter Jeremy , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:02:09 -0000 X-Original-Date: Mon, 23 Jul 2012 09:36:50 -0500 X-List-Received-Date: Sun, 12 Aug 2012 23:02:09 -0000

On 07/22/2012 11:56 PM, Stephen Montgomery-Smith wrote:
> This is the work I have done to produce casinh, casin, cacos and cacosh.
> The latter two took me a lot more time than I expected. It took me a
> lot of time to try to find the correct branches.
>
>

Once I have cacos, cacosh turns out to be much easier than I thought:

double complex
cacosh(double complex z)
{
	complex double w;

	w = cacos(z);
	if (signbit(cimag(w)) == 0)
		return cpack(cimag(w),-creal(w));
	else
		return cpack(-cimag(w),creal(w));
}

From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:02:26 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 84CDB1065676 for ; Sun, 12 Aug 2012 23:02:26 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id 1D8BE8FC0A for ; Sun, 12 Aug 2012 23:02:25 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN2PDT075606 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:02:26 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN2JI8021121
(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:02:19 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CN2JhM021120 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:02:19 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:02:19 +1000 Resent-Message-ID: <20120812230219.GB20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6O2f28V064779 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Tue, 24 Jul 2012 12:41:03 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q6O2exHq017785 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Tue, 24 Jul 2012 12:41:02 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q6O2ZgQU092684; Mon, 23 Jul 2012 21:35:42 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <500E09FE.8020607@missouri.edu> From: Stephen Montgomery-Smith Mail-Followup-To: freebsd-numerics@freebsd.org User-Agent: Mozilla/5.0 (X11; Linux i686; rv:14.0) Gecko/20120714 Thunderbird/14.0 MIME-Version: 1.0 To: Bruce Evans References: <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717040118.GA86840@troutmask.apl.washington.edu> <20120717042125.GF66913@server.rulingia.com> <20120717043848.GB87001@troutmask.apl.washington.edu> <20120717225328.GA86902@server.rulingia.com> <20120717232740.GA95026@troutmask.apl.washington.edu> <20120718001337.GA87817@server.rulingia.com> 
<20120718123627.D1575@besplex.bde.org> <20120722121219.GC73662@server.rulingia.com> <20120722220031.GA7791@server.rulingia.com> <20120723141319.P1189@besplex.bde.org> <500CD98E.9080103@missouri.edu> <500D6182.8010003@missouri.edu> <20120724100014.I934@besplex.bde.org> In-Reply-To: <20120724100014.I934@besplex.bde.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: Diane Bruce , John Baldwin , David Chisnall , Bruce Evans , Steve Kargl , David Schultz , Peter Jeremy , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:02:26 -0000 X-Original-Date: Mon, 23 Jul 2012 21:35:42 -0500 X-List-Received-Date: Sun, 12 Aug 2012 23:02:26 -0000 I am committing the software - as is - to you guys. I feel that I worked extremely hard to get an error less than about 3.8 ULP (from 10,000,000 test cases) for the complex arc-trig and arc-hyperbolic functions. My skill is writing numerical software that works, with an awareness of where various numerical issues might arise. I am very proud of the work I have done in the last few days. Now this software requires a different skill set, getting it to conform to various styles, looking for smart efficiencies to overcome defects in the compiler, and checking that NaN's work properly. This is a skill set which you guys have in far greater amounts than I have. And it is something that doesn't engage my interest. If you find problems, like unwanted overflows/underflows or numerical errors occurring in edge cases, then by all means get back to me, and I will attempt to fix it. 
Thus far, the only fault I have found is in catanh, where I forgot to set the "inexact flag" when computing "hp" and "hm" when |y| is very small.

I will continue working hard on getting the numerics in clog to work correctly.

Stephen

From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:02:44 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 698F9106564A for ; Sun, 12 Aug 2012 23:02:44 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id A28518FC0C for ; Sun, 12 Aug 2012 23:02:43 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN2gHJ075613 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:02:43 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN2Z4e021136 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:02:36 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CN2ZKX021135 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:02:35 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:02:35 +1000 Resent-Message-ID: <20120812230235.GC20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6NK0RvM035117 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Tue, 24 Jul 2012
06:00:28 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q6NK0Pia016761 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Tue, 24 Jul 2012 06:00:27 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from [128.206.184.213] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q6NK01os067386; Mon, 23 Jul 2012 15:00:02 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <500DAD41.5030104@missouri.edu> From: Stephen Montgomery-Smith Mail-Followup-To: freebsd-numerics@freebsd.org User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:13.0) Gecko/20120628 Thunderbird/13.0.1 MIME-Version: 1.0 To: Peter Jeremy References: <20120717084457.U3890@besplex.bde.org> <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717040118.GA86840@troutmask.apl.washington.edu> <20120717042125.GF66913@server.rulingia.com> <20120717043848.GB87001@troutmask.apl.washington.edu> <20120717225328.GA86902@server.rulingia.com> <20120717232740.GA95026@troutmask.apl.washington.edu> <20120718001337.GA87817@server.rulingia.com> <20120718123627.D1575@besplex.bde.org> <20120722121219.GC73662@server.rulingia.com> In-Reply-To: <20120722121219.GC73662@server.rulingia.com> Content-Type: multipart/mixed; boundary="------------060606060301090803040905" Cc: Diane Bruce , Bruce Evans , John Baldwin , David Chisnall , Bruce Evans , Steve Kargl , David Schultz , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:02:44 -0000 X-Original-Date: Mon, 23 Jul 2012 15:00:01 -0500 X-List-Received-Date: Sun, 12 Aug 2012 23:02:44 -0000

This is a multi-part message in MIME format.
--------------060606060301090803040905
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit

I just realized that catan(z) = reverse( catanh(reverse (z))), just like
casin relates to casinh (remember reverse(x+I*y) = y+I*x). This is a
consequence of catan and catanh being odd functions, as well as the
standard relation catan(z) = -I*catanh(I*z).

So I would modify Peter's code by taking out the minus signs. Maybe it
would make a difference if the answers involved -0.

On 07/22/12 07:12, Peter Jeremy wrote:
> /*
>  * Arc-tangent of a complex argument z = x + I*y.
>  *
>  * catan(z) = reverse( catanh(reverse (z)))
>  *
>  */
> double complex
> catan(double complex z)
> {
> 	double complex r;
>
> 	r = catanh(cpack(cimag(z), creal(z)));
> 	return (cpack(cimag(r), creal(r)));
> }
>

I am attaching code that computes all six arc-trig-hyp functions.

Peter made a remark that catanh(z) would be hard to compute when |z|=1.
Fortunately that is not the case, because when |z|=1, the imaginary part
of catanh(z) is plus or minus PI/4. So we won't face the same problems
that clog(z) has.

I feel that I am done with these functions for now. I tried to change my
comments to conform to the style given to me by Bruce. However spacing
inside mathematical expressions is something where I am inconsistent.

The functions still need a lot of work to handle -0, infs and NaNs
correctly. I will leave that to you guys, because you seem so much
better at it than me. I still don't understand why the proper test is
"if (x!=x) return(x+x)" rather than "if (isnan(x)) return(NAN)".
However, I think you might find that a lot of the handling when the
input or output is infinity works without any changes.
For example, catanh(1) seems to produce the correct answer, and maybe
even catanh(1-0*I) will work better than expected.

I am also attaching the test code. I run it like this, so that only
results with an error greater than 3 ULP appear:

./test3 | perl -lne '@a=split " ",$_;print if $a[2]>3 || $a[3]>3'

and get outputs like this:

21987 atanh 3.13643 0.299452 0.443282 0.0665108 0.473296 0.0824781
67013 atan 0.377411 3.03922 -0.0170315 -0.442191 -0.0211662 -0.474753
70044 acosh 0.883474 3.23108 1.06353 0.107343 0.433696 0.242278
70044 acos 3.23108 0.883474 1.06353 0.107343 0.242278 -0.433696
96124 atan 0.509279 3.21631 -0.0346851 -0.461121 -0.0440054 -0.497841

The first entry is the running count of the test case, and the second is
the function being tested. The third and fourth entries are the errors,
in ULPs, of the real and imaginary parts of the answer. The 5th and 6th
entries are the real and imaginary parts of the input, and the 7th and
8th entries are the real and imaginary parts of the answer.

I use the unuran port to generate data where the x and y values of the
inputs are normally distributed N(0,1). As you can see from the parts I
commented out, I tried many variations (mostly to check edge cases:
close to zero, very large, or close to branch cuts). In particular, if
you want random data on the unit circle (|z|=1), use

h=hypot(x,y); x/=h; y/=h;

My next project will be to get clog(z) to work well when |z|=1.
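
A minimal sketch of how the ULP columns described above might be computed, assuming a C99 libm. The function name ulp_error is hypothetical, and a long double stands in for the high-precision MPFR reference value used by the attached test program (on platforms where long double is no wider than double, the reference is of course no more accurate than the value being tested). The scaling follows what the attachment appears to do: the absolute error is multiplied by 2^(DBL_MANT_DIG - e), where e is the binary exponent of the reference value.

```c
#include <float.h>
#include <math.h>

/*
 * Error of `computed` relative to `reference`, measured in units of the
 * last place of a double at the magnitude of the reference value.
 */
static double
ulp_error(double computed, long double reference)
{
	long double err;
	int e;

	/* A zero reference has no meaningful exponent to scale by. */
	if (reference == 0)
		return (computed == 0 ? 0.0 : INFINITY);

	/* frexpl() yields e such that |reference| lies in [2^(e-1), 2^e). */
	(void)frexpl(reference, &e);
	err = fabsl((long double)computed - reference);

	/* One ULP at this magnitude is 2^(e - DBL_MANT_DIG). */
	return ((double)ldexpl(err, DBL_MANT_DIG - e));
}
```

With this definition, ulp_error(1.0 + DBL_EPSILON, 1.0L) is exactly 1, since adjacent doubles just above 1 differ by DBL_EPSILON.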
Stephen --------------060606060301090803040905 Content-Type: text/plain; charset=us-ascii; name="catrig.c" Content-Transfer-Encoding: base64 Content-Disposition: attachment; filename="catrig.c" I2luY2x1ZGUgPGNvbXBsZXguaD4KI2luY2x1ZGUgPGZsb2F0Lmg+CiNpbmNsdWRlIDxtYXRo Lmg+CgojaW5jbHVkZSAibWF0aF9wcml2YXRlLmgiCgovKgogKiBnY2MgZG9lc24ndCBpbXBs ZW1lbnQgY29tcGxleCBtdWx0aXBsaWNhdGlvbiBvciBkaXZpc2lvbiBjb3JyZWN0bHksCiAq IHNvIHdlIG5lZWQgdG8gaGFuZGxlIGluZmluaXRpZXMgc3BlY2lhbGx5LiBXZSB0dXJuIG9u IHRoaXMgcHJhZ21hIHRvCiAqIG5vdGlmeSBjb25mb3JtaW5nIGM5OSBjb21waWxlcnMgdGhh dCB0aGUgZmFzdC1idXQtaW5jb3JyZWN0IGNvZGUgdGhhdAogKiBnY2MgZ2VuZXJhdGVzIGlz IGFjY2VwdGFibGUsIHNpbmNlIHRoZSBzcGVjaWFsIGNhc2VzIGhhdmUgYWxyZWFkeSBiZWVu CiAqIGhhbmRsZWQuCiAqLwojcHJhZ21hCVNUREMgQ1hfTElNSVRFRF9SQU5HRQlPTgoKY29t cGxleCBkb3VibGUgY2xvZyhjb21wbGV4IGRvdWJsZSB6KTsKCnN0YXRpYyBjb25zdCBkb3Vi bGUKb25lID0gIDEuMDAwMDAwMDAwMDAwMDAwMDAwMDBlKzAwLCAvKiAweDNGRjAwMDAwLCAw eDAwMDAwMDAwICovCmh1Z2U9ICAxLjAwMDAwMDAwMDAwMDAwMDAwMDAwZSszMDA7CgovKgog KiBUZXN0aW5nIGluZGljYXRlcyB0aGF0IGFsbCB0aGVzZSBmdW5jdGlvbnMgYXJlIGFjY3Vy YXRlIHVwIHRvIDQgVUxQLgogKi8KCi8qCiAqIFRoZSBhbGdvcml0aG0gaXMgdmVyeSBjbG9z ZSB0byB0aGF0IGluICJJbXBsZW1lbnRpbmcgdGhlIGNvbXBsZXggYXJjc2luZQogKiBhbmQg YXJjY29zaW5lIGZ1bmN0aW9ucyB1c2luZyBleGNlcHRpb24gaGFuZGxpbmciIGJ5IFQuIEUu IEh1bGwsCiAqIFRob21hcyBGLiBGYWlyZ3JpZXZlLCBhbmQgUGluZyBUYWsgUGV0ZXIgVGFu ZywgcHVibGlzaGVkIGluIEFDTQogKiBUcmFuc2FjdGlvbnMgb24gTWF0aGVtYXRpY2FsIFNv ZnR3YXJlLCBWb2x1bWUgMjMgSXNzdWUgMywgMTk5NywgUGFnZXMKICogMjk5LTMzNSwgaHR0 cDovL2RsLmFjbS5vcmcvY2l0YXRpb24uY2ZtP2lkPTI3NTMyNAogKgogKiBjYXNpbmgoeCtp eSkgPSBzaWduKHgpKmxvZyhBK3NxcnQoQSpBLTEpKSArIHNpZ24oeSkqSSphc2luKEIpCiAq IHdoZXJlCiAqIEEgPSAwLjUofHorSXwgKyB8ei1JfCkgPSBmKHgsMSt5KSArIGYoeCwxLXkp ICsgMQogKiBCID0gMC41KHx6K0l8IC0gfHotSXwpCiAqIHogPSB4K0kqeQogKiBmKHgseSkg PSAwLjUqKGh5cG90KHgseSkteSkKICogV2UgYWxzbyB1c2UKICogYXNpbihCKSA9IGF0YW4y KHNxcnQoQSpBLXkqeSkseSkKICogQS15ID0gZih4LHkrMSkgKyBmKHgseS0xKS4KICoKICog 
TXVjaCBvZiB0aGUgZGlmZmljdWx0eSBjb21lcyBiZWNhdXNlIGNvbXB1dGluZyBmKHgseSkg bWF5IHByb2R1Y2UKICogdW5kZXJmbG93cy4KICovCgovKgogKiBSZXR1cm5zIDAuNSooaHlw b3QoeCx5KS15KS4gIEl0IGFzc3VtZXMgeCBpcyBwb3NpdGl2ZSwgYW5kIHRoYXQgeSBkb2Vz CiAqIG5vdCBzYXRpc2Z5IHRoZSBpbmVxdWFsaXRpZXMgMCA8IGZhYnMoeSkgPCAxZS0yMC4K ICogSWYgcmVwb3J0aW5nIHRoZSBhbnN3ZXIgcmlza3MgYW4gdW5kZXJmbG93LCB0aGUgdW5k ZXJmbG93IGZsYWcgaXMgc2V0LAogKmFuZCBpdCByZXR1cm5zIDAuNSooaHlwb3QoeCx5KS15 KS94L3guCiAqLwpzdGF0aWMgZG91YmxlIGYoZG91YmxlIHgsIGRvdWJsZSB5LCBpbnQgKnVu ZGVyZmxvdykgewoJaWYgKHg9PTApIHsKCQkqdW5kZXJmbG93ID0gMDsKCQlpZiAoeSA+IDAp CgkJCXJldHVybiAwOwoJCXJldHVybiAteTsKCX0KCWlmICh5PT0wKSB7CgkJKnVuZGVyZmxv dyA9IDA7CgkJcmV0dXJuIDAuNSp4OwoJfQoJaWYgKHggPCAxZS0xMDAgJiYgeCA8IHkpIHsK CQkqdW5kZXJmbG93ID0gMTsKCQlyZXR1cm4gMC41LyhoeXBvdCh4LHkpK3kpOwoJfQoJaWYg KHggPCB5KSB7CgkJKnVuZGVyZmxvdyA9IDA7CgkJcmV0dXJuIDAuNSp4KngvKGh5cG90KHgs eSkreSk7Cgl9CgkqdW5kZXJmbG93ID0gMDsKCXJldHVybiAwLjUqKGh5cG90KHgseSkteSk7 Cn0KCi8qCiAqIEFsbCB0aGUgaGFyZCB3b3JrIGlzIGNvbnRhaW5lZCBpbiB0aGlzIGZ1bmN0 aW9uLgogKiBVcG9uIHJldHVybjoKICogcnggPSBSZShjYXNpbmgoeCtJKnkpKQogKiBCX2dv b2QgaXMgc2V0IHRvIDEgaWYgdGhlIHZhbHVlIG9mIEIgaXMgdXNhYmxlLgogKiBJZiBCX2dv b2QgaXMgc2V0IHRvIDAsIEEybXkyID0gQSpBLXkqeS4KICovCnN0YXRpYyB2b2lkIGRvX2hh cmRfd29yayhkb3VibGUgeCwgZG91YmxlIHksIGRvdWJsZSAqcngsIGludCAqQl9nb29kLCBk b3VibGUgKkIsIGRvdWJsZSAqQTJteTIpCnsKCWRvdWJsZSBSLCBTLCBBLCBmcCwgZm07Cglp bnQgZnB1ZiwgZm11ZjsKCglSID0gaHlwb3QoeCx5KzEpOwoJUyA9IGh5cG90KHgseS0xKTsK CUEgPSAwLjUqKFIgKyBTKTsKCglpZiAoQSA8IDEwKSB7CgkJZnAgPSBmKHgsMSt5LCZmcHVm KTsKCQlmbSA9IGYoeCwxLXksJmZtdWYpOwoJCWlmIChmcHVmID09IDEgJiYgZm11ZiA9PSAx KSB7CgkJCWlmIChodWdlK3g+b25lKSAvKiBzZXQgaW5leGFjdCBmbGFnLiAqLwoJCQkJKnJ4 ID0gbG9nMXAoeCpzcXJ0KChmcCtmbSkqKEErMSkpKTsKCQl9IGVsc2UgaWYgKGZtdWYgPT0g MSkgewoJCQkvKiBPdmVyZmxvdyBub3QgcG9zc2libGUgYmVjYXVzZSBmcCA8IDFlNTAgYW5k IHggPiAxZS0xMDAuCgkJCSAgIFVuZGVyZmxvdyBub3QgcG9zc2libGUgYmVjYXVzZSBlaXRo ZXIgZm09MCBvciBmbQoJCQkgICBhcHByb3hpbWF0ZWx5IGJpZ2dlciB0aGFuIDFlLTIwMC4g 
Ki8KCQkJaWYgKGh1Z2UreD5vbmUpIC8qIHNldCBpbmV4YWN0IGZsYWcuICovCgkJCQkqcngg PSBsb2cxcChmcCtzcXJ0KHgpKnNxcnQoKGZwL3grZm0qeCkqKEErMSkpKTsKCQl9IGVsc2Ug aWYgKGZwdWYgPT0gMSkgewoJCQkvKiBTaW1pbGFyIGFyZ3VtZW50cyBhZ2FpbnN0IG92ZXIv dW5kZXJmbG93LiAqLwoJCQlpZiAoaHVnZSt4Pm9uZSkgLyogc2V0IGluZXhhY3QgZmxhZy4g Ki8KCQkJCSpyeCA9IGxvZzFwKGZtK3NxcnQoeCkqc3FydCgoZm0veCtmcCp4KSooQSsxKSkp OwoJCX0gZWxzZSB7CgkJCSpyeCA9IGxvZzFwKGZwICsgZm0gKyBzcXJ0KChmcCtmbSkqKEEr MSkpKTsKCQl9Cgl9IGVsc2UKCQkqcnggPSBsb2coQSArIHNxcnQoQSpBLTEpKTsKCgkqQiA9 IHkvQTsgLyogPSAwLjUqKFIgLSBTKSAqLwoJKkJfZ29vZCA9IDE7CgoJaWYgKCpCID4gMC41 KSB7CgkJKkJfZ29vZCA9IDA7CgkJZnAgPSBmKHgseSsxLCZmcHVmKTsKCQlmbSA9IGYoeCx5 LTEsJmZtdWYpOwoJCWlmIChmcHVmID09IDEgJiYgZm11ZiA9PSAxKQoJCQkqQTJteTIgPXgq c3FydCgoQSt5KSooZnArZm0pKTsKCQllbHNlIGlmIChmbXVmID09IDEpCgkJCS8qIE92ZXJm bG93IG5vdCBwb3NzaWJsZSBiZWNhdXNlIGZwIDwgMWU1MCBhbmQgeCA+IDFlLTEwMC4KCQkJ ICAgVW5kZXJmbG93IG5vdCBwb3NzaWJsZSBiZWNhdXNlIGVpdGhlciBmbT0wIG9yIGZtCgkJ CSAgIGFwcHJveGltYXRlbHkgYmlnZ2VyIHRoYW4gMWUtMjAwLiAqLwoJCQkqQTJteTIgPSBz cXJ0KHgpKnNxcnQoKEEreSkqKGZwL3grZm0qeCkpOwoJCWVsc2UgaWYgKGZwdWYgPT0gMSkK CQkJLyogU2ltaWxhciBhcmd1bWVudHMgYWdhaW5zdCBvdmVyL3VuZGVyZmxvdy4gKi8KCQkJ KkEybXkyID0gc3FydCh4KSpzcXJ0KChBK3kpKihmbS94K2ZwKngpKTsKCQllbHNlCgkJCSpB Mm15MiA9IHNxcnQoKEEreSkqKGZwK2ZtKSk7Cgl9Cn0KCmRvdWJsZSBjb21wbGV4CmNhc2lu aChkb3VibGUgY29tcGxleCB6KQp7Cglkb3VibGUgeCwgeSwgcngsIHJ5LCBCLCBBMm15MjsK CWludCBzeCwgc3k7CglpbnQgQl9nb29kOwoKCXggPSBjcmVhbCh6KTsKCXkgPSBjaW1hZyh6 KTsKCXN4ID0gc2lnbmJpdCh4KTsKCXN5ID0gc2lnbmJpdCh5KTsKCXggPSBmYWJzKHgpOwoJ eSA9IGZhYnMoeSk7CgoJaWYgKGNhYnMoeikgPiAxZTIwKSB7CgkJaWYgKGh1Z2UreD5vbmUp IHsgLyogc2V0IGluZXhhY3QgZmxhZy4gKi8KCQkJaWYgKHN4ID09IDApIHJldHVybiBjbG9n KDIqeik7CgkJCWlmIChzeCA9PSAxKSByZXR1cm4gLWNsb2coLTIqeik7CgkJfQoJfQoKCWlm IChjYWJzKHopIDwgMWUtMjApCgkJaWYgKGh1Z2UreD5vbmUpIC8qIHNldCBpbmV4YWN0IGZs YWcuICovCgkJCXJldHVybiB6OwoKCWRvX2hhcmRfd29yayh4LCB5LCAmcngsICZCX2dvb2Qs ICZCLCAmQTJteTIpOwoJaWYgKEJfZ29vZCkKCQlyeSA9IGFzaW4oQik7CgllbHNlCgkJcnkg 
PSBhdGFuMih5LEEybXkyKTsKCglpZiAoc3ggPT0gMSkgcnggPSAtcng7CglpZiAoc3kgPT0g MSkgcnkgPSAtcnk7CgoJcmV0dXJuIGNwYWNrKHJ4LHJ5KTsKfQoKLyoKICogY2FzaW4oeikg PSByZXZlcnNlKGNhc2luaChyZXZlcnNlKHopKSkKICogd2hlcmUgcmV2ZXJzZSh4K0kqeSkg PSB5K3gqSSA9IEkqY29uaih4K0kqeSkuCiAqLwoKZG91YmxlIGNvbXBsZXgKY2FzaW4oZG91 YmxlIGNvbXBsZXggeikKewoJY29tcGxleCByZXN1bHQ7CgoJcmVzdWx0ID0gY2FzaW5oKGNw YWNrKGNpbWFnKHopLGNyZWFsKHopKSk7CglyZXR1cm4gY3BhY2soY2ltYWcocmVzdWx0KSxj cmVhbChyZXN1bHQpKTsKfQoKLyoKICogY2Fjb3MoeikgPSBQSS8yIC0gY2FzaW4oeikKICog YnV0IGRvIHRoZSBjb21wdXRhdGlvbiBjYXJlZnVsbHkgc28gY2Fjb3MoeikgaXMgYWNjdXJh dGUgd2hlbiB6IGlzIGNsb3NlIHRvIDEuCiAqLwoKZG91YmxlIGNvbXBsZXgKY2Fjb3MoZG91 YmxlIGNvbXBsZXggeikKewoJZG91YmxlIHgsIHksIHJ4LCByeSwgQiwgQTJteTI7CglpbnQg c3gsIHN5OwoJaW50IEJfZ29vZDsKCWNvbXBsZXggdzsKCgl4ID0gY3JlYWwoeik7Cgl5ID0g Y2ltYWcoeik7CglzeCA9IHNpZ25iaXQoeCk7CglzeSA9IHNpZ25iaXQoeSk7Cgl4ID0gZmFi cyh4KTsKCXkgPSBmYWJzKHkpOwoKCWlmIChjYWJzKHopID4gMWUyMCkgewoJCWlmIChodWdl K3g+b25lKSB7IC8qIHNldCBpbmV4YWN0IGZsYWcuICovCgkJCXcgPSBjbG9nKDIqeik7CgkJ CWlmIChzaWduYml0KGNpbWFnKHcpKSA9PSAwKQoJCQkJcmV0dXJuIGNwYWNrKGNpbWFnKHcp LC1jcmVhbCh3KSk7CgkJCXJldHVybiBjcGFjaygtY2ltYWcodyksY3JlYWwodykpOwoJCX0K CX0KCglpZiAoY2Ficyh6KSA8IDFlLTEwKQoJCWlmIChodWdlK3g+b25lKSAvKiBzZXQgaW5l eGFjdCBmbGFnLiAqLwoJCQlyZXR1cm4gY3BhY2soTV9QSV8yLWNyZWFsKHopLC1jaW1hZyh6 KSk7CgoJZG9faGFyZF93b3JrKHksIHgsICZyeSwgJkJfZ29vZCwgJkIsICZBMm15Mik7Cglp ZiAoQl9nb29kKSB7CgkJaWYgKHN4PT0wKQoJCQlyeCA9IGFjb3MoQik7CgkJZWxzZQoJCQly eCA9IGFjb3MoLUIpOwoJfSBlbHNlIHsKCQlpZiAoc3g9PTApCgkJCXJ4ID0gYXRhbjIoQTJt eTIseCk7CgkJZWxzZQoJCQlyeCA9IGF0YW4yKEEybXkyLC14KTsKCX0KCglpZiAoc3k9PTAp IHJ5ID0gLXJ5OwoKCXJldHVybiBjcGFjayhyeCxyeSk7Cn0KCi8qCiAqIGNhY29zaCh6KSA9 IEkqY2Fjb3Moeikgb3IgLUkqY2Fjb3MoeikKICogd2hlcmUgdGhlIHNpZ24gaXMgY2hvc2Vu IHNvIFJlKGNhY29zaCh6KSkgPj0gMCAuCiAqLwoKZG91YmxlIGNvbXBsZXgKY2Fjb3NoKGRv dWJsZSBjb21wbGV4IHopCnsKCWNvbXBsZXggZG91YmxlIHc7CgoJdyA9IGNhY29zKHopOwoJ aWYgKHNpZ25iaXQoY2ltYWcodykpID09IDApCgkJcmV0dXJuIGNwYWNrKGNpbWFnKHcpLC1j 
cmVhbCh3KSk7CgllbHNlCgkJcmV0dXJuIGNwYWNrKC1jaW1hZyh3KSxjcmVhbCh3KSk7Cn0K Ci8qIAogKiBjYXRhbmgoeikgPSAwLjI1ICogbG9nKCh6KzEpLyh6LTEpKQogKiAgICAgICAg ICAgPSAwLjI1ICogbG9nKHx6KzF8L3x6LTF8KSArIDAuNSAqIEkgKiBhdGFuMigyeS8oMS14 KngteSp5KSkKICovCgpkb3VibGUgY29tcGxleApjYXRhbmgoZG91YmxlIGNvbXBsZXggeikK ewoJZG91YmxlIHgsIHksIHJ4LCByeSwgaHAsIGhtOwoKCXggPSBjcmVhbCh6KTsKCXkgPSBj aW1hZyh6KTsKCglpZiAoY2Ficyh6KSA8IDFlLTIwKQoJCWlmIChodWdlK3g+b25lKSAvKiBz ZXQgaW5leGFjdCBmbGFnLiAqLwoJCQlyZXR1cm4gejsKCglpZiAoY2Ficyh6KSA+IDFlMjAp CgkJaWYgKGh1Z2UreD5vbmUpIHsgLyogc2V0IGluZXhhY3QgZmxhZy4gKi8KCQkJaWYgKHNp Z25iaXQoeCkgPT0gMCkKCQkJCXJldHVybiBjcGFjaygwLE1fUElfMik7CgkJCXJldHVybiBj cGFjaygwLC1NX1BJXzIpOwoJfQoKCWlmIChmYWJzKHkpIDwgMWUtMTAwKSB7CgkJaHAgPSAo eCsxKSooeCsxKTsKCQlobSA9ICh4LTEpKih4LTEpOwoJfSBlbHNlIHsKCQlocCA9ICh4KzEp Kih4KzEpK3kqeTsgLyogfHorMXwgKi8KCQlobSA9ICh4LTEpKih4LTEpK3kqeTsgLyogfHot MXwgKi8KCX0KCglpZiAoaHAgPCAwLjUgfHwgaG0gPCAwLjUpCgkJcnggPSAwLjI1Kihsb2co aHAvaG0pKTsKCWVsc2UgaWYgKHggPiAwKQoJCXJ4ID0gMC4yNSpsb2cxcCg0KngvaG0pOwoJ ZWxzZQoJCXJ4ID0gLTAuMjUqbG9nMXAoLTQqeC9ocCk7CgoJaWYgKHg9PTEgfHwgeD09LTEp IHsKCQlpZiAoc2lnbmJpdCh5KSA9PSAwKQoJCQlyeSA9IGF0YW4yKDIsIC15KS8yOwoJCWVs c2UKCQkJcnkgPSBhdGFuMigtMiwgeSkvMjsKCX0gZWxzZSBpZiAoZmFicyh5KSA8IDFlLTEw MCkgewoJCWlmIChodWdlK3g+b25lKSAvKiBzZXQgaW5leGFjdCBmbGFnLiAqLwoJCQlyeSA9 IGF0YW4yKDIqeSwgKDEteCkqKDEreCkpLzI7Cgl9IGVsc2UKCQlyeSA9IGF0YW4yKDIqeSwg KDEteCkqKDEreCkteSp5KS8yOwoKCXJldHVybiBjcGFjayhyeCxyeSk7Cn0KCi8qCiAqIGNh dGFuKHopID0gcmV2ZXJzZShjYXRhbmgocmV2ZXJzZSh6KSkpCiAqIHdoZXJlIHJldmVyc2Uo eCtJKnkpID0geSt4KkkgPSBJKmNvbmooeCtJKnkpLgogKi8KCmRvdWJsZSBjb21wbGV4CmNh dGFuKGRvdWJsZSBjb21wbGV4IHopCnsKCWNvbXBsZXggcmVzdWx0OwoKCXJlc3VsdCA9IGNh dGFuaChjcGFjayhjaW1hZyh6KSxjcmVhbCh6KSkpOwoJcmV0dXJuIGNwYWNrKGNpbWFnKHJl c3VsdCksY3JlYWwocmVzdWx0KSk7Cn0K --------------060606060301090803040905 Content-Type: text/plain; charset=us-ascii; name="test3.c" Content-Transfer-Encoding: base64 Content-Disposition: attachment; filename="test3.c" 
I2luY2x1ZGUgPHN0ZGlvLmg+CiNpbmNsdWRlIDxzdGRsaWIuaD4KI2luY2x1ZGUgPHN5cy9j ZGVmcy5oPgojaW5jbHVkZSA8ZmxvYXQuaD4KI2luY2x1ZGUgPGZlbnYuaD4KI2luY2x1ZGUg ImNvbXBsZXguaCIKI2luY2x1ZGUgIm1hdGguaCIKI2luY2x1ZGUgIm1hdGhfcHJpdmF0ZS5o IgojaW5jbHVkZSAibXBmci5oIgojaW5jbHVkZSAibXBjLmgiCiNpbmNsdWRlIDx1bnVyYW4u aD4KCmNvbXBsZXggZG91YmxlIGNhc2luaChjb21wbGV4IGRvdWJsZSB6KTsKY29tcGxleCBk b3VibGUgY2FzaW4oY29tcGxleCBkb3VibGUgeik7CmNvbXBsZXggZG91YmxlIGNhY29zaChj b21wbGV4IGRvdWJsZSB6KTsKY29tcGxleCBkb3VibGUgY2Fjb3MoY29tcGxleCBkb3VibGUg eik7CmNvbXBsZXggZG91YmxlIGNhdGFuaChjb21wbGV4IGRvdWJsZSB6KTsKY29tcGxleCBk b3VibGUgY2F0YW4oY29tcGxleCBkb3VibGUgeik7CgptcGNfdCB6eiwgcnI7Cm1wZnJfdCBy eHgsIHJ5eSwgZXh4LCBleXk7Cgp2b2lkIGV2YWwoZG91YmxlIHgsZG91YmxlIHksY29tcGxl eCBkb3VibGUgKCpmKShjb21wbGV4IGRvdWJsZSksdm9pZCAoKm1wY19mKShtcGNfdCxtcGNf dCxtcGNfcm5kX3QpLGRvdWJsZSAqcngsZG91YmxlICpyeSxkb3VibGUgKmV4LGRvdWJsZSAq ZXkpIHsKICBjb21wbGV4IGRvdWJsZSByZXN1bHQ7CgogIHJlc3VsdCA9IGYoY3BhY2soeCx5 KSk7CiAgKnJ4ID0gY3JlYWwocmVzdWx0KTsKICAqcnkgPSBjaW1hZyhyZXN1bHQpOwoKICBt cGNfc2V0X2RfZCh6eiwgeCwgeSwgTVBDX1JORE5OKTsKICBtcGNfZihyciwgenosIE1QQ19S TkROTik7CgovKgogIG1wY19vdXRfc3RyKHN0ZG91dCwgMTAsIDEwMCwgenosIE1QQ19STkRO Tik7CiAgcHV0cygiIik7CiAgbXBjX291dF9zdHIoc3Rkb3V0LCAxMCwgMTAwLCByciwgTVBD X1JORE5OKTsKICBwdXRzKCIiKTsKKi8KCiAgbXBjX3JlYWwocnh4LCByciwgTVBGUl9STkRO KTsKICBtcGZyX3N1Yl9kKGV4eCxyeHgsKnJ4LE1QRlJfUk5ETik7CiAgbXBmcl9hYnMoZXh4 LGV4eCxNUEZSX1JORE4pOwogIG1wZnJfbXVsXzJzaShleHgsZXh4LC0gbXBmcl9nZXRfZXhw KHJ4eCkrREJMX01BTlRfRElHLE1QRlJfUk5ETik7CiAgKmV4ID0gbXBmcl9nZXRfZChleHgs TVBGUl9STkROKTsKCiAgbXBjX2ltYWcocnl5LCByciwgTVBGUl9STkROKTsKICBtcGZyX3N1 Yl9kKGV5eSxyeXksKnJ5LE1QRlJfUk5ETik7CiAgbXBmcl9hYnMoZXl5LGV5eSxNUEZSX1JO RE4pOwogIG1wZnJfbXVsXzJzaShleXksZXl5LC0gbXBmcl9nZXRfZXhwKHJ5eSkrREJMX01B TlRfRElHLE1QRlJfUk5ETik7CiAgKmV5ID0gbXBmcl9nZXRfZChleXksTVBGUl9STkROKTsK fQoKaW50IG1haW4oaW50IGFyZ2MsIGNoYXIgKiphcmd2KSB7CiAgZG91YmxlIHgseSxyeCxy eSxleCxleSxoOwogIFVOVVJfRElTVFIgKmRpc3RyOwogIFVOVVJfUEFSICpwYXI7CiAgVU5V 
Ul9HRU4gKmdlbjsKICBpbnQgY291bnQgPSAwOwoKICBkaXN0ciA9IHVudXJfZGlzdHJfbm9y bWFsKE5VTEwsMCk7CiAgcGFyID0gdW51cl9hdXRvX25ldyhkaXN0cik7CiAgZ2VuID0gdW51 cl9pbml0KHBhcik7CgogIG1wY19pbml0Mih6eiwzMDApOwogIG1wY19pbml0MihyciwzMDAp OwoKICBtcGZyX3NldF9kZWZhdWx0X3ByZWMoMzAwKTsKICBtcGZyX2luaXQocnh4KTsKICBt cGZyX2luaXQocnl5KTsKICBtcGZyX2luaXQoZXh4KTsKICBtcGZyX2luaXQoZXl5KTsKCiAg d2hpbGUgKDEpIHsKICAgIHggPSB1bnVyX3NhbXBsZV9jb250KGdlbik7CiAgICB5ID0gdW51 cl9zYW1wbGVfY29udChnZW4pOwovKgogICAgaCA9IGh5cG90KHgseSk7CiAgICB4ID0geC9o OwogICAgeSA9IHkvaDsKCiAgICB4ID0gMDsKICAgIHkgPSAxOwoKICAgIHggKz0gMWUtMjAw KnVudXJfc2FtcGxlX2NvbnQoZ2VuKTsKICAgIHkgKz0gMWUtMjAwKnVudXJfc2FtcGxlX2Nv bnQoZ2VuKTsKKi8KCiAgICBjb3VudCsrOwogICAgZXZhbCh4LHksY2FzaW5oLG1wY19hc2lu aCwmcngsJnJ5LCZleCwmZXkpOwogICAgcHJpbnRmKCIlZCBhc2luaCAlZyAlZyAlZyAlZyAl ZyAlZ1xuIixjb3VudCxleCxleSx4LHkscngscnkpOwogICAgZXZhbCh4LHksY2Fjb3NoLG1w Y19hY29zaCwmcngsJnJ5LCZleCwmZXkpOwogICAgcHJpbnRmKCIlZCBhY29zaCAlZyAlZyAl ZyAlZyAlZyAlZ1xuIixjb3VudCxleCxleSx4LHkscngscnkpOwogICAgZXZhbCh4LHksY2F0 YW5oLG1wY19hdGFuaCwmcngsJnJ5LCZleCwmZXkpOwogICAgcHJpbnRmKCIlZCBhdGFuaCAl ZyAlZyAlZyAlZyAlZyAlZ1xuIixjb3VudCxleCxleSx4LHkscngscnkpOwogICAgZXZhbCh4 LHksY2FzaW4sbXBjX2FzaW4sJnJ4LCZyeSwmZXgsJmV5KTsKICAgIHByaW50ZigiJWQgYXNp biAlZyAlZyAlZyAlZyAlZyAlZ1xuIixjb3VudCxleCxleSx4LHkscngscnkpOwogICAgZXZh bCh4LHksY2Fjb3MsbXBjX2Fjb3MsJnJ4LCZyeSwmZXgsJmV5KTsKICAgIHByaW50ZigiJWQg YWNvcyAlZyAlZyAlZyAlZyAlZyAlZ1xuIixjb3VudCxleCxleSx4LHkscngscnkpOwogICAg ZXZhbCh4LHksY2F0YW4sbXBjX2F0YW4sJnJ4LCZyeSwmZXgsJmV5KTsKICAgIHByaW50Zigi JWQgYXRhbiAlZyAlZyAlZyAlZyAlZyAlZ1xuIixjb3VudCxleCxleSx4LHkscngscnkpOwog IH0KfQo= --------------060606060301090803040905-- From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:03:03 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id E167C1065672 for ; Sun, 12 Aug 2012 23:03:02 +0000 (UTC) (envelope-from peter@rulingia.com) Received: 
from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id 74EFA8FC12 for ; Sun, 12 Aug 2012 23:03:02 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN32df075621 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:03:02 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN2twn021153 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:02:55 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CN2tGV021152 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:02:55 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:02:55 +1000 Resent-Message-ID: <20120812230255.GE20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6R332r0028695 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Fri, 27 Jul 2012 13:03:03 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q6R32wSq041896 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Fri, 27 Jul 2012 13:03:02 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q6R325rH032219; Thu, 26 Jul 2012 22:02:07 -0500 (CDT) (envelope-from 
stephen@missouri.edu) Message-ID: <501204AD.30605@missouri.edu> From: Stephen Montgomery-Smith Mail-Followup-To: freebsd-numerics@freebsd.org User-Agent: Mozilla/5.0 (X11; Linux i686; rv:14.0) Gecko/20120714 Thunderbird/14.0 MIME-Version: 1.0 To: Bruce Evans References: <20120717084457.U3890@besplex.bde.org> <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717040118.GA86840@troutmask.apl.washington.edu> <20120717042125.GF66913@server.rulingia.com> <20120717043848.GB87001@troutmask.apl.washington.edu> <20120717225328.GA86902@server.rulingia.com> <20120717232740.GA95026@troutmask.apl.washington.edu> <20120718001337.GA87817@server.rulingia.com> <20120718123627.D1575@besplex.bde.org> <20120722121219.GC73662@server.rulingia.com> <500DAD41.5030104@missouri.edu> <20120724113214.G934@besplex.bde.org> In-Reply-To: <20120724113214.G934@besplex.bde.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: Diane Bruce , John Baldwin , David Chisnall , Bruce Evans , Steve Kargl , David Schultz , Peter Jeremy , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:03:03 -0000 X-Original-Date: Thu, 26 Jul 2012 22:02:05 -0500 X-List-Received-Date: Sun, 12 Aug 2012 23:03:03 -0000 I am not getting many responses to the programs I submitted. I understand that you may be all very busy. The casinh algorithm in particular will take a lot of vetting. So I have submitted them in a PR so that they won't get lost. http://www.freebsd.org/cgi/query-pr.cgi?pr=170206 The only substantive changes I made was (1) make corrections in comments for catanh and (2) implement clogf by calling clog. 
(Unsurprisingly its relative error is less than or equal to 0.5 ULP.) If I used the incorrect category (I used bin), please tell me or change it for me. Thanks, Stephen From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:03:11 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 01B6E106564A for ; Sun, 12 Aug 2012 23:03:11 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id 8A1D38FC15 for ; Sun, 12 Aug 2012 23:03:10 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN3AoJ075624 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:03:10 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN34qF021160 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:03:04 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CN34JY021159 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:03:04 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:03:04 +1000 Resent-Message-ID: <20120812230304.GF20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org From: Peter Jeremy Mail-Followup-To: freebsd-numerics@freebsd.org To: Stephen Montgomery-Smith Message-ID: <20120727032611.GB25690@server.rulingia.com> References: <20120717042125.GF66913@server.rulingia.com> <20120717043848.GB87001@troutmask.apl.washington.edu> <20120717225328.GA86902@server.rulingia.com> 
<20120717232740.GA95026@troutmask.apl.washington.edu> <20120718001337.GA87817@server.rulingia.com> <20120718123627.D1575@besplex.bde.org> <20120722121219.GC73662@server.rulingia.com> <500DAD41.5030104@missouri.edu> <20120724113214.G934@besplex.bde.org> <501204AD.30605@missouri.edu> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="T4sUOijqQbZv57TR" Content-Disposition: inline In-Reply-To: <501204AD.30605@missouri.edu> X-PGP-Key: http://www.rulingia.com/keys/peter.pgp User-Agent: Mutt/1.5.21 (2010-09-15) Cc: Diane Bruce , Bruce Evans , John Baldwin , David Chisnall , Bruce Evans , Steve Kargl , David Schultz , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:03:11 -0000 X-Original-Date: Fri, 27 Jul 2012 13:26:11 +1000 X-List-Received-Date: Sun, 12 Aug 2012 23:03:11 -0000 --T4sUOijqQbZv57TR Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On 2012-Jul-26 22:02:05 -0500, Stephen Montgomery-Smith wrote: >I am not getting many responses to the programs I submitted. I=20 >understand that you may be all very busy. I've been writing a test harness to vet the special case handling of all the complex functions (excluding cpow so far). Basically, it's just Appendix G.6 of WG14/N1256 turned into a C array, plus code to actually run the tests & interpret the results. So far, it's about 1100 lines of which about 1/3 is the test cases and is intended to run on x86/armle/sparc and FreeBSD/Linux/Solaris (I'm using Solaris and, to a lesser extent, Linux as a cross-check on my interpretation of the text). Once I'm happy with it, I'll circulate it. 
I was initially hoping to make it commitable but 8-char tabs and 80-char lines would require lots of line wrapping that would make it harder for me to follow. >If I used the incorrect category (I used bin), please tell me or change=20 >it for me. bin looks correct. --=20 Peter Jeremy --T4sUOijqQbZv57TR Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (FreeBSD) iEYEARECAAYFAlASClMACgkQ/opHv/APuIdeaACeNcyLmgxlNzUKaouRKge5DxqJ 8SMAoIaT3ASMNDGa4bycKEsu3tRGPSNq =EBdg -----END PGP SIGNATURE----- --T4sUOijqQbZv57TR-- From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:03:24 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id EE092106566C for ; Sun, 12 Aug 2012 23:03:23 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id 88F928FC0A for ; Sun, 12 Aug 2012 23:03:23 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN3Nu4075629 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:03:23 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN3HWb021176 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:03:17 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CN3HE2021175 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:03:17 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:03:17 +1000 Resent-Message-ID: 
<20120812230317.GG20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6R3tgT6029188 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Fri, 27 Jul 2012 13:55:42 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q6R3tetC042028 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Fri, 27 Jul 2012 13:55:42 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q6R3tGVr058639; Thu, 26 Jul 2012 22:55:17 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <50121124.4000002@missouri.edu> From: Stephen Montgomery-Smith Mail-Followup-To: freebsd-numerics@freebsd.org User-Agent: Mozilla/5.0 (X11; Linux i686; rv:14.0) Gecko/20120714 Thunderbird/14.0 MIME-Version: 1.0 To: Peter Jeremy References: <20120717042125.GF66913@server.rulingia.com> <20120717043848.GB87001@troutmask.apl.washington.edu> <20120717225328.GA86902@server.rulingia.com> <20120717232740.GA95026@troutmask.apl.washington.edu> <20120718001337.GA87817@server.rulingia.com> <20120718123627.D1575@besplex.bde.org> <20120722121219.GC73662@server.rulingia.com> <500DAD41.5030104@missouri.edu> <20120724113214.G934@besplex.bde.org> <501204AD.30605@missouri.edu> <20120727032611.GB25690@server.rulingia.com> In-Reply-To: <20120727032611.GB25690@server.rulingia.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: Diane Bruce , Bruce Evans , John Baldwin , David Chisnall , Bruce Evans , Steve Kargl , David Schultz , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 
X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:03:24 -0000 X-Original-Date: Thu, 26 Jul 2012 22:55:16 -0500 X-List-Received-Date: Sun, 12 Aug 2012 23:03:24 -0000 On 07/26/2012 10:26 PM, Peter Jeremy wrote: > I've been writing a test harness to vet the special case handling of > all the complex functions (excluding cpow so far). Basically, it's > just Appendix G.6 of WG14/N1256 turned into a C array, plus code to > actually run the tests & interpret the results. So far, it's about > 1100 lines of which about 1/3 is the test cases and is intended to run > on x86/armle/sparc and FreeBSD/Linux/Solaris (I'm using Solaris and, > to a lesser extent, Linux as a cross-check on my interpretation of the > text). Once I'm happy with it, I'll circulate it. I was initially > hoping to make it commitable but 8-char tabs and 80-char lines would > require lots of line wrapping that would make it harder for me to > follow. On the subject of Linux, I tested the relative errors of the Linux versions of clog, casinh, etc. They performed rather badly. They really flunked the clog(z) for |z| close to 1 test. As for your test program, maybe you could run some script to change the indents to the 8-char tabs when you are done. It does sound like a useful program, and it would be nice if it were generally available in the FreeBSD source code. 
From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:03:46 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 04537106566B for ; Sun, 12 Aug 2012 23:03:46 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id 6487D8FC17 for ; Sun, 12 Aug 2012 23:03:45 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN3j6P075642 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:03:45 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN3ep1021203 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:03:40 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CN3eQl021202 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:03:40 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:03:40 +1000 Resent-Message-ID: <20120812230340.GJ20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6RDLe2L034885 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Fri, 27 Jul 2012 23:21:40 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q6RDLcQv043856 (version=TLSv1/SSLv3 
cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Fri, 27 Jul 2012 23:21:40 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from [128.206.184.213] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q6RDLD1O044993; Fri, 27 Jul 2012 08:21:14 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <501295C9.6080108@missouri.edu> From: Stephen Montgomery-Smith Mail-Followup-To: freebsd-numerics@freebsd.org User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:13.0) Gecko/20120628 Thunderbird/13.0.1 MIME-Version: 1.0 To: Bruce Evans References: <20120717042125.GF66913@server.rulingia.com> <20120717043848.GB87001@troutmask.apl.washington.edu> <20120717225328.GA86902@server.rulingia.com> <20120717232740.GA95026@troutmask.apl.washington.edu> <20120718001337.GA87817@server.rulingia.com> <20120718123627.D1575@besplex.bde.org> <20120722121219.GC73662@server.rulingia.com> <500DAD41.5030104@missouri.edu> <20120724113214.G934@besplex.bde.org> <501204AD.30605@missouri.edu> <20120727032611.GB25690@server.rulingia.com> <20120727155521.E4712@besplex.bde.org> In-Reply-To: <20120727155521.E4712@besplex.bde.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: Diane Bruce , John Baldwin , David Chisnall , Bruce Evans , Steve Kargl , David Schultz , Peter Jeremy , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:03:46 -0000 X-Original-Date: Fri, 27 Jul 2012 08:21:13 -0500 X-List-Received-Date: Sun, 12 Aug 2012 23:03:46 -0000 On 07/27/12 02:27, Bruce Evans wrote: > On Fri, 27 Jul 2012, Peter Jeremy wrote: > >> On 2012-Jul-26 22:02:05 -0500, Stephen Montgomery-Smith >> wrote: >>> I am not getting many responses to the programs I submitted. I >>> understand that you may be all very busy. > > I'm still working on testing and fixing clog. Haven't got near the more > complex functions. > > For clog, the worst case that I've found so far has x^2+y^2-1 ~= 1e-47: > > x = 0.999999999999999555910790149937383830547332763671875000000000 > y = > 0.0000000298023223876953091912775497878893005143652317201485857367516 > (need high precision decimal or these rounded to 53 bits binary) > x^2+y^2-1 = 1.0947644252537633366591637369e-47 That is exactly 2^(-156). So maybe triple quad precision really is enough. > > so it needs more than tripled double precision for a brute force > evaluation, and the general case is probably worse. I'm working > on a rearrangement so that doubled double precision works in the > general case. Both your version and my version get this case right, > but mess up different much easier cases. It takes insanely great > accuracy to get even 1 bit in this case right, but now that we > have about 52 of 53 right, the work for getting the final bit > right is essentially the same as proving that the method works > in all cases. That's for x^2+y^2-1. log1p() is much harder. The general case is also the worst for the arc-trig functions. The edge cases seem to be computed with great accuracy. When I tell my friends I get approx 51 bit accuracy, they seem to think it is pretty good. > >> I've been writing a test harness to vet the special case handling of >> all the complex functions (excluding cpow so far). 
Basically, it's

If I remember correctly, the C99 specification is very liberal in its
requirements for cpow, so that
cpow(x,y) = cexp(y*clog(x))
is compliant.

>
> I use my test harness for float and double functions hacked for
> complex functions where it doesn't work so well, and hackish pari
> extensions.
>
>> just Appendix G.6 of WG14/N1256 turned into a C array, plus code to
>> actually run the tests & interpret the results.  So far, it's about
>> 1100 lines of which about 1/3 is the test cases and is intended to run
>
> Yikes.  My basic test program is getting too large and complex at 412
> lines.  It basically just compares with a known good or different bad
> function (with zillions of parameters to select the function and args).
> It tests exceptional cases as a side effect.
>
> Do you do tests for fenv (i.e., that the specified exceptions and no
> others are raised)?  This might be a problem with external libraries.
> They can deliver values and NaNs for comparison, but nothing requires
> them to have the same fenv behaviour as the C library.  Any tests
> of fenv are likely to spew errors that you don't want to know about.

Testing the fenv values seems like a really good idea to me.
From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:03:58 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 7F6CC106564A for ; Sun, 12 Aug 2012 23:03:58 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id DEF768FC08 for ; Sun, 12 Aug 2012 23:03:57 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN3vMQ075647 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:03:57 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN3pmH021211 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:03:51 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CN3puq021210 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:03:51 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:03:51 +1000 Resent-Message-ID: <20120812230351.GK20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org From: Peter Jeremy Mail-Followup-To: freebsd-numerics@freebsd.org To: Bruce Evans Message-ID: <20120727210559.GF31169@server.rulingia.com> References: <20120717225328.GA86902@server.rulingia.com> <20120717232740.GA95026@troutmask.apl.washington.edu> <20120718001337.GA87817@server.rulingia.com> <20120718123627.D1575@besplex.bde.org> <20120722121219.GC73662@server.rulingia.com> <500DAD41.5030104@missouri.edu> <20120724113214.G934@besplex.bde.org> <501204AD.30605@missouri.edu> 
<20120727032611.GB25690@server.rulingia.com> <20120727155521.E4712@besplex.bde.org> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="wq9mPyueHGvFACwf" Content-Disposition: inline In-Reply-To: <20120727155521.E4712@besplex.bde.org> X-PGP-Key: http://www.rulingia.com/keys/peter.pgp User-Agent: Mutt/1.5.21 (2010-09-15) Cc: Diane Bruce , John Baldwin , David Chisnall , Stephen Montgomery-Smith , Bruce Evans , Steve Kargl , David Schultz , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:03:58 -0000 X-Original-Date: Sat, 28 Jul 2012 07:05:59 +1000 X-List-Received-Date: Sun, 12 Aug 2012 23:03:58 -0000 --wq9mPyueHGvFACwf Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On 2012-Jul-27 17:27:45 +1000, Bruce Evans wrote: >On Fri, 27 Jul 2012, Peter Jeremy wrote: > >> On 2012-Jul-26 22:02:05 -0500, Stephen Montgomery-Smith wrote: >>> I am not getting many responses to the programs I submitted. I >>> understand that you may be all very busy. > >I'm still working on testing and fixing clog. Haven't got near the more >complex functions. I'd suggest that clog() is the most difficult complex function. The others mostly build from other real and complex functions. >For clog, the worst case that I've found so far has x^2+y^2-1 ~=3D 1e-47: =2E.. >so it needs more than tripled double precision for a brute force >evaluation, and the general case is probably worse. I'm working >on a rearrangement so that doubled double precision works in the >general case. 
Whilst FreeBSD includes ld128 soft-float primitives, clogl() on sparc
needs to provide a ld128 result - which implies ld240 is required for
intermediate calculations (and, even on x86, ld128 won't provide
sufficient additional precision for clogl()).

How easy is it to determine which cases need higher precision?  Can
the majority of the cases be done in double precision or does
everything need double double precision?

Would the following approach work (all variables are double):

  x1 = (double)(float)x;  /* x1 is top 24 bits of fraction */
  x2 = x - x1;            /* x2 is bottom 29 bits of fraction */
  x3 = 2.0 * x1 * x2;     /* 24b * 29b = 53b therefore this is exact */
  x1 *= x1;               /* x1 now has 48 bits */
  x2 *= x2;               /* 58 bits rounded to 53 bits */
  y1 = (double)(float)y;
  y2 = y - y1;
  y3 = 2.0 * y1 * y2;
  y1 *= y1;
  y2 *= y2;
  /* add low to high to minimise cancellation */
  result = x2 + y2;
  result += x3 + y3;
  result += x1 + y1 - 1.0;

The difficulty is that the final result summation needs to be
rearranged if the magnitudes of x and y are sufficiently different.
For the case you presented, you need:

  result = y2;
  result += x2 + y3;
  result += x3 + y1;
  result += x1 - 1.0;

However this gives a result 1.094764425253763337e-47 without needing
any more than double precision.

>> just Appendix G.6 of WG14/N1256 turned into a C array, plus code to
>> actually run the tests & interpret the results.  So far, it's about
>> 1100 lines of which about 1/3 is the test cases and is intended to run
>
>Yikes.  My basic test program is getting too large and complex at 412
>lines.  It basically just compares with a known good or different bad
>function (with zillions of parameters to select the function and args).
>It tests exceptional cases as a side effect.

Portably supporting arm, sparc & x86 means copying structures from
fpmath.h as well as various _fpmath.h and each family needs different
code to display long double values.
>Do you do tests for fenv (i.e., that the specified exceptions and no
>others are raised)?  This might be a problem with external libraries.

I have included details of expected exceptions in the data array but
haven't written code to actually check for them.

>They can deliver values and NaNs for comparison, but nothing requires
>them to have the same fenv behaviour as the C library.

Currently, I just check that an expected NaN result is a NaN.  I don't
bother to verify that the NaN returned is the "correct" NaN.

On 2012-Jul-27 08:21:13 -0500, Stephen Montgomery-Smith wrote:
>If I remember correctly, the C99 specification is very liberal in its
>requirements for cpow, so that
>cpow(x,y) = cexp(y*clog(x))
>is compliant.

The main reason I skipped cpow() is that it's dyadic, which means it
needs a different test harness.  And the lack of explicit requirements
makes it very difficult to determine when it should fail.

One problem with cexp(y*clog(x)) is that you automatically lose about
as many bits of precision as there are exponent bits.  Ideally, you
need to decompose it in much the same way as pow() - though it's
messier.
--=20 Peter Jeremy --wq9mPyueHGvFACwf Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (FreeBSD) iEYEARECAAYFAlATArcACgkQ/opHv/APuIfMqQCguxTeQRKjIGXCoxC34wy6/RbU CswAoIrrosPW/oTy5+2tUrAKlVkJVM/j =4uf5 -----END PGP SIGNATURE----- --wq9mPyueHGvFACwf-- From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:04:08 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 03241106566C for ; Sun, 12 Aug 2012 23:04:08 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id 63F2D8FC0C for ; Sun, 12 Aug 2012 23:04:07 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN4702075652 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:04:07 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN3x4O021217 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:04:00 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CN3xbr021216 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:03:59 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:03:59 +1000 Resent-Message-ID: <20120812230359.GL20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6RN5QU8016982 (version=TLSv1/SSLv3 
cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Sat, 28 Jul 2012 09:05:26 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q6RN5OPD061602 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Sat, 28 Jul 2012 09:05:26 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q6RN5114001170; Fri, 27 Jul 2012 18:05:02 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <50131E9D.6030603@missouri.edu> From: Stephen Montgomery-Smith Mail-Followup-To: freebsd-numerics@freebsd.org User-Agent: Mozilla/5.0 (X11; Linux i686; rv:14.0) Gecko/20120714 Thunderbird/14.0 MIME-Version: 1.0 To: Peter Jeremy References: <20120717225328.GA86902@server.rulingia.com> <20120717232740.GA95026@troutmask.apl.washington.edu> <20120718001337.GA87817@server.rulingia.com> <20120718123627.D1575@besplex.bde.org> <20120722121219.GC73662@server.rulingia.com> <500DAD41.5030104@missouri.edu> <20120724113214.G934@besplex.bde.org> <501204AD.30605@missouri.edu> <20120727032611.GB25690@server.rulingia.com> <20120727155521.E4712@besplex.bde.org> <20120727210559.GF31169@server.rulingia.com> In-Reply-To: <20120727210559.GF31169@server.rulingia.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: Diane Bruce , Bruce Evans , John Baldwin , David Chisnall , Bruce Evans , Steve Kargl , David Schultz , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:04:08 -0000 X-Original-Date: Fri, 27 Jul 2012 18:05:01 -0500 X-List-Received-Date: Sun, 12 Aug 2012 23:04:08 -0000

Bruce and I have continued a discussion at http://www.freebsd.org/cgi/query-pr.cgi?pr=170206.

I sure wish we could have a mailing list like freebsd-numerics to have these discussions. Then we could do things like change the subject line, so that we didn't have to follow everything.

The idea that when you add a bunch of numbers you should start with the smallest first - I can see that being effective when all the numbers have the same sign. But I can also see it being quite disastrous when the numbers have different signs. I am trying to avoid "catastrophic loss of relative error" whereas you guys all seem to be after that last little bit of ULP. I guess it is a different mindset.

If you are adding four numbers that are all positive, and you fail to add them from smallest to largest, the worst you will get is to increase your ULP to something like 4. But if you have numbers of different signs, like -1, 1, 1e-100, you definitely want to add the numbers of largest absolute value before you add the number with smallest absolute value. If you add from smallest to largest (either in the absolute value sense, or in the signed sense), you get a relative error of infinity. That is what I mean by "catastrophic loss of relative error."

I remember having this long argument with Steve about how a ULP of 10 seemed just fine to me. And for most applications, when you compute clog(z), and |z| is close to 1 but z is not close to 1, almost always you are not going to know the real and imaginary parts of z exactly (they will be the result of some other computation), and so the huge relative error in computing clog(z) will represent a true lack of knowledge, not some artifact of the computational method.
Indeed, that is why, even if I could compute clog(z) perfectly, I would never use clog to compute casinh(z) using the formula clog(z+csqrt(z*z+1)). This is because I also wouldn't have a good bound on |z+sqrt(z*z+1)|-1. All this talk about not wanting to be on the wrong part of the Riemann surface just seems like you are living in dream land. You could always come up with some kind of difficult analytic function for which near perfect resolution is going to be made difficult. For example, catanh(z) - I*PI/4 will never be calculable with good accuracy for |z| close to 1, unless you write a very special function for it. From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:04:16 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 2E0621065673 for ; Sun, 12 Aug 2012 23:04:16 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id ABAD18FC12 for ; Sun, 12 Aug 2012 23:04:15 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN4FP2075657 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:04:15 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN49Xn021232 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:04:09 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CN490l021231 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:04:09 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy 
Resent-Date: Mon, 13 Aug 2012 09:04:09 +1000 Resent-Message-ID: <20120812230409.GM20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6SHn1F4051809 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Sun, 29 Jul 2012 03:49:02 +1000 (EST) (envelope-from imp@bsdimp.com) Received: from mail-ob0-f177.google.com (mail-ob0-f177.google.com [209.85.214.177]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q6SHmxRl069158 (version=TLSv1/SSLv3 cipher=RC4-SHA bits=128 verify=OK) for ; Sun, 29 Jul 2012 03:49:01 +1000 (EST) (envelope-from imp@bsdimp.com) Received: by obbta17 with SMTP id ta17so6531171obb.36 for ; Sat, 28 Jul 2012 10:48:52 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=sender:subject:mime-version:content-type:from:in-reply-to:date:cc :content-transfer-encoding:message-id:references:to:x-mailer :x-gm-message-state; bh=55sKf1v9pOF1J3a03RfbDLJGZ7a6RMDm5KgiLkZK3ZA=; b=PsG3EETrIPhjsyj5f+gabOvw9K1TSoUZFQEqt1zaiscixQFbeLkec3UQkIQTflQOcA 62hREZaKCOPEEKUCyaHWW7u/ZD50K4mLKM9MfxowPorm7HM5NUKQZHyRrmSuhA05ZXhu X5leLgkLm9ITAVpKYQdyUbCx0LP0CCzCh6/uvWzHdyvM3Tt9aOEMXu0yvVyFQUU+W8u6 IzXbNMYVhripb2U5gJY5+LpOzJaK9YvZk3vJe8Sz4j0qqUiG7smjah/R5MPoAoziXHW6 oLQbJQhi9rxkFhsh6/eA7gXehPxAaFGglRn+QmS1E0rE1r0YEBdXhIdLQ6c8vm7KdFC1 dYxg== Received: by 10.50.159.135 with SMTP id xc7mr7589056igb.1.1343497731937; Sat, 28 Jul 2012 10:48:51 -0700 (PDT) Received: from 63.imp.bsdimp.com (50-78-194-198-static.hfc.comcastbusiness.net. 
[50.78.194.198]) by mx.google.com with ESMTPS id qo3sm4688822igc.8.2012.07.28.10.48.49 (version=TLSv1/SSLv3 cipher=OTHER); Sat, 28 Jul 2012 10:48:51 -0700 (PDT) Sender: Warner Losh Mime-Version: 1.0 (Apple Message framework v1084) Content-Type: text/plain; charset=us-ascii From: Warner Losh Mail-Followup-To: freebsd-numerics@freebsd.org In-Reply-To: <50131E9D.6030603@missouri.edu> Message-Id: <49EC58EE-BC23-486D-BC36-CADDC5058E4A@bsdimp.com> References: <20120717225328.GA86902@server.rulingia.com> <20120717232740.GA95026@troutmask.apl.washington.edu> <20120718001337.GA87817@server.rulingia.com> <20120718123627.D1575@besplex.bde.org> <20120722121219.GC73662@server.rulingia.com> <500DAD41.5030104@missouri.edu> <20120724113214.G934@besplex.bde.org> <501204AD.30605@missouri.edu> <20120727032611.GB25690@server.rulingia.com> <20120727155521.E4712@besplex.bde.org> <20120727210559.GF31169@server.rulingia.com> <50131E9D.6030603@missouri.edu> To: Stephen Montgomery-Smith X-Mailer: Apple Mail (2.1084) X-Gm-Message-State: ALoCoQnRlha6U02A9s8gkK7D94YrC+qh+DI4iMlrPOGHXQqseetkvyZw3U3V/mOKbRyCJolDne1b Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by server.rulingia.com id q6SHn1F4051809 Cc: Diane Bruce , Bruce Evans , John Baldwin , David Chisnall , Bruce Evans , Steve Kargl , David Schultz , Peter Jeremy Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:04:16 -0000 X-Original-Date: Sat, 28 Jul 2012 11:48:49 -0600 X-List-Received-Date: Sun, 12 Aug 2012 23:04:16 -0000 On Jul 27, 2012, at 5:05 PM, Stephen Montgomery-Smith wrote: > I sure wish we could have a mailing list like freebsd-numerics to have these discussions. 
Then we could do things like change the subject line, so that we didn't have to follow everything.

I just asked core if we can create one. I'll let you know if that works.

Warner

From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:04:31 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 7369E106564A for ; Sun, 12 Aug 2012 23:04:31 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id D47EA8FC16 for ; Sun, 12 Aug 2012 23:04:30 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN4Uk9075666 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:04:30 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN4NpN021249 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:04:24 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CN4N0D021248 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:04:23 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:04:23 +1000 Resent-Message-ID: <20120812230423.GO20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org From: Peter Jeremy Mail-Followup-To: freebsd-numerics@freebsd.org To: Stephen Montgomery-Smith Message-ID: <20120728125824.GA26553@server.rulingia.com> References: <20120717043848.GB87001@troutmask.apl.washington.edu> <20120717225328.GA86902@server.rulingia.com>
<20120717232740.GA95026@troutmask.apl.washington.edu> <20120718001337.GA87817@server.rulingia.com> <20120718123627.D1575@besplex.bde.org> <20120722121219.GC73662@server.rulingia.com> <500DAD41.5030104@missouri.edu> <20120724113214.G934@besplex.bde.org> <501204AD.30605@missouri.edu> <20120727032611.GB25690@server.rulingia.com> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="yrj/dFKFPuw6o+aM" Content-Disposition: inline In-Reply-To: <20120727032611.GB25690@server.rulingia.com> X-PGP-Key: http://www.rulingia.com/keys/peter.pgp User-Agent: Mutt/1.5.21 (2010-09-15) Cc: Diane Bruce , Bruce Evans , John Baldwin , David Chisnall , Bruce Evans , Steve Kargl , David Schultz , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:04:31 -0000 X-Original-Date: Sat, 28 Jul 2012 22:58:24 +1000 X-List-Received-Date: Sun, 12 Aug 2012 23:04:31 -0000

--yrj/dFKFPuw6o+aM
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On 2012-Jul-27 13:26:11 +1000, Peter Jeremy wrote:
>I've been writing a test harness to vet the special case handling of
>all the complex functions (excluding cpow so far). Basically, it's
>just Appendix G.6 of WG14/N1256 turned into a C array, plus code to
>actually run the tests & interpret the results. So far, it's about
>1100 lines of which about 1/3 is the test cases and is intended to run
>on x86/armle/sparc and FreeBSD/Linux/Solaris (I'm using Solaris and,
>to a lesser extent, Linux as a cross-check on my interpretation of the
>text). Once I'm happy with it, I'll circulate it. I was initially
>hoping to make it committable but 8-char tabs and 80-char lines would
>require lots of line wrapping that would make it harder for me to
>follow.

My test harness can be found at http://www.rulingia.com/~peter/ctest.c
There are no special compilation options, it just needs to be linked
with '-lm' (and '-ldl' on Linux). For normal use, just run the
executable - it will report any failures. For "finite" arguments, it
currently uses 3π/4 and 32769 other random numbers (the latter is
S_COUNT+1).

It has two test modes for internal testing and debugging:
'-v' verifies that all the argument & result strings are valid and
that there's no duplication of argument vectors (for this purpose, it
doesn't consider '0' as finite and will incorrectly report '1' as an
invalid argument).
'-r' prints all the double-precision test vectors used. This should
generate 3604951 lines of output.

The output should be reasonably self-explanatory except:
- double-precision function names are printed with a trailing 'd'
- an expected sign of '?' means "don't care".

It reports no errors on OpenSolaris but does report a number of what
appear to be valid errors on Linux.

Whilst I was debugging the code, I found the following elisp useful
for post-processing the output:

(progn (downcase-region (point-min) (point-max))
  (repl-regexp "^ [ ]c" "..c")
  (repl-regexp "^ c" ".c")
  (repl-regexp " *0x[0-9a-f]+ *" " ")
  (repl-regexp " *0x[0-9a-f]+$" "")
  (repl-regexp "infinit[y]" "inf")
  (repl-regexp "0\\.0+e\\+0+\\>" "zer")
  (repl-regexp "1\\.0+e\\+0+\\>" "one")
  (repl-regexp "3\\.14159[0-9]+e\\+00" "pi.")
  (repl-regexp "1\\.57079[0-9]+e\\+00" "p_2")
  (repl-regexp "7\\.85398[0-9]+e\\-01" "p_4")
  (repl-regexp "2\\.35619[0-9]+e\\+00" "3p4")
  (repl-regexp "[0-9]\\.[0-9]+e[-\\+][0-9]+" "fin")
  (repl-regexp "^ *\012" "")
  (repl-regexp "\012 *=" " =")
  (repl-regexp "\012 *expected: *" " # ")
  (repl-regexp "\012 *want *" " # ")
  (repl-regexp " +" " ")
  (repl-regexp "-\\+" " ")
  (repl-regexp " +$" "")
  (repl-regexp "\\([^)]\\)$" "\\1 %%")
  (repl-regexp "^\\(.*= \\)\\(.\\)\\(...\\)\\( .*# \\)\\(.\\)\\3\\(.*\\) %%" "\\1\\2\\3\\4\\5\\3\\6 \\2\\5")
  (repl-regexp "\\([^)]\\)$" "\\1 %%")
  (repl-regexp "^\\(.*= .... \\)\\(.\\)\\(...\\)\\( # .... \\)\\(.\\)\\3\\(.*\\) .." "\\1\\2\\3\\4\\5\\3\\6 \\2\\5")
  (repl-regexp "^\\(......\\)f\\(:.*\012\\)\\1d\\2\\1l\\2" "\\1x\\2")
  (repl-regexp "^\\(......\\)d\\(:.*\012\\)\\1f\\2\\1l\\2" "\\1x\\2")
  )

This turns the output into a series of lines like:

..ctanx: +3p4 +inf = +zer +one # -zer +one +- ++
  fn  ^  Argument    Result      Expected  XX YY
      +- precision (f/d/l) or 'x' if all 3 affected

XX and YY are the real and imaginary actual and expected result signs
or '%' if the category differs between expected and actual. The above
line (from the Linux output) means that
ctan(3π/4 + I*Inf) returns (+0 + I*1) instead of (-0 + I*1)
'+-' means that the signs of the real parts differ
'++' means that the signs of the imaginary parts are both '+'

Please let me know if you find any errors or have any comments.
--=20 Peter Jeremy --yrj/dFKFPuw6o+aM Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (FreeBSD) iEYEARECAAYFAlAT4fAACgkQ/opHv/APuIcf6gCfXRQ/V/lSc0x8rk0tjV07WW4y jT8AmgPM2+kWjiJXJM2o9IcptPaW4puI =y2fL -----END PGP SIGNATURE----- --yrj/dFKFPuw6o+aM-- From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:04:36 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id E4AC01065673 for ; Sun, 12 Aug 2012 23:04:36 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id 69F628FC17 for ; Sun, 12 Aug 2012 23:04:36 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN4atI075670 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:04:36 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN4TVP021260 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:04:29 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CN4T6h021259 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:04:29 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:04:29 +1000 Resent-Message-ID: <20120812230429.GP20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6SLxkGG072404 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA 
bits=256 verify=OK) for ; Sun, 29 Jul 2012 07:59:47 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q6SLxiSU072407 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Sun, 29 Jul 2012 07:59:46 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q6SLxN3m068529; Sat, 28 Jul 2012 16:59:24 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <501460BB.30806@missouri.edu> From: Stephen Montgomery-Smith Mail-Followup-To: freebsd-numerics@freebsd.org User-Agent: Mozilla/5.0 (X11; Linux i686; rv:14.0) Gecko/20120714 Thunderbird/14.0 MIME-Version: 1.0 To: Peter Jeremy References: <20120717043848.GB87001@troutmask.apl.washington.edu> <20120717225328.GA86902@server.rulingia.com> <20120717232740.GA95026@troutmask.apl.washington.edu> <20120718001337.GA87817@server.rulingia.com> <20120718123627.D1575@besplex.bde.org> <20120722121219.GC73662@server.rulingia.com> <500DAD41.5030104@missouri.edu> <20120724113214.G934@besplex.bde.org> <501204AD.30605@missouri.edu> <20120727032611.GB25690@server.rulingia.com> <20120728125824.GA26553@server.rulingia.com> In-Reply-To: <20120728125824.GA26553@server.rulingia.com> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit Cc: Diane Bruce , Bruce Evans , John Baldwin , David Chisnall , Bruce Evans , Steve Kargl , David Schultz , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:04:37 -0000 X-Original-Date: Sat, 28 Jul 2012 16:59:23 -0500 X-List-Received-Date: Sun, 12 Aug 2012 23:04:37 -0000

On 07/28/2012 07:58 AM, Peter Jeremy wrote:
> On 2012-Jul-27 13:26:11 +1000, Peter Jeremy wrote:
>> I've been writing a test harness to vet the special case handling of
>> all the complex functions (excluding cpow so far). Basically, it's
>> just Appendix G.6 of WG14/N1256 turned into a C array, plus code to
>> actually run the tests & interpret the results. So far, it's about
>> 1100 lines of which about 1/3 is the test cases and is intended to run
>> on x86/armle/sparc and FreeBSD/Linux/Solaris (I'm using Solaris and,
>> to a lesser extent, Linux as a cross-check on my interpretation of the
>> text). Once I'm happy with it, I'll circulate it. I was initially
>> hoping to make it committable but 8-char tabs and 80-char lines would
>> require lots of line wrapping that would make it harder for me to
>> follow.
>
> My test harness can be found at http://www.rulingia.com/~peter/ctest.c
> There are no special compilation options, it just needs to be linked
> with '-lm' (and '-ldl' on Linux). For normal use, just run the
> executable - it will report any failures. For "finite" arguments, it
> currently uses 3π/4 and 32769 other random numbers (the latter is
> S_COUNT+1).
>
> It has two test modes for internal testing and debugging:
> '-v' verifies that all the argument & result strings are valid and
> that there's no duplication of argument vectors (for this purpose, it
> doesn't consider '0' as finite and will incorrectly report '1' as an
> invalid argument).
> '-r' prints all the double-precision test vectors used. This should
> generate 3604951 lines of output.
>
> The output should be reasonably self-explanatory except:
> - double-precision function names are printed with a trailing 'd'
> - an expected sign of '?' means "don't care".
>
> It reports no errors on OpenSolaris but does report a number of what
> appear to be valid errors on Linux.
>
> Whilst I was debugging the code, I found the following elisp useful
> for post-processing the output:
>
> (progn (downcase-region (point-min) (point-max))
>   (repl-regexp "^ [ ]c" "..c")
>   (repl-regexp "^ c" ".c")
>   (repl-regexp " *0x[0-9a-f]+ *" " ")
>   (repl-regexp " *0x[0-9a-f]+$" "")
>   (repl-regexp "infinit[y]" "inf")
>   (repl-regexp "0\\.0+e\\+0+\\>" "zer")
>   (repl-regexp "1\\.0+e\\+0+\\>" "one")
>   (repl-regexp "3\\.14159[0-9]+e\\+00" "pi.")
>   (repl-regexp "1\\.57079[0-9]+e\\+00" "p_2")
>   (repl-regexp "7\\.85398[0-9]+e\\-01" "p_4")
>   (repl-regexp "2\\.35619[0-9]+e\\+00" "3p4")
>   (repl-regexp "[0-9]\\.[0-9]+e[-\\+][0-9]+" "fin")
>   (repl-regexp "^ *\012" "")
>   (repl-regexp "\012 *=" " =")
>   (repl-regexp "\012 *expected: *" " # ")
>   (repl-regexp "\012 *want *" " # ")
>   (repl-regexp " +" " ")
>   (repl-regexp "-\\+" " ")
>   (repl-regexp " +$" "")
>   (repl-regexp "\\([^)]\\)$" "\\1 %%")
>   (repl-regexp "^\\(.*= \\)\\(.\\)\\(...\\)\\( .*# \\)\\(.\\)\\3\\(.*\\) %%" "\\1\\2\\3\\4\\5\\3\\6 \\2\\5")
>   (repl-regexp "\\([^)]\\)$" "\\1 %%")
>   (repl-regexp "^\\(.*= .... \\)\\(.\\)\\(...\\)\\( # .... \\)\\(.\\)\\3\\(.*\\) .." "\\1\\2\\3\\4\\5\\3\\6 \\2\\5")
>   (repl-regexp "^\\(......\\)f\\(:.*\012\\)\\1d\\2\\1l\\2" "\\1x\\2")
>   (repl-regexp "^\\(......\\)d\\(:.*\012\\)\\1f\\2\\1l\\2" "\\1x\\2")
>   )
>
> This turns the output into a series of lines like:
> ..ctanx: +3p4 +inf = +zer +one # -zer +one +- ++
>   fn  ^  Argument    Result      Expected  XX YY
>       +- precision (f/d/l) or 'x' if all 3 affected
>
> XX and YY are the real and imaginary actual and expected result signs
> or '%' if the category differs between expected and actual.
> The above line (from the Linux output) means that
> ctan(3π/4 + I*Inf) returns (+0 + I*1) instead of (-0 + I*1)
> '+-' means that the signs of the real parts differ
> '++' means that the signs of the imaginary parts are both '+'
>
> Please let me know if you find any errors or have any comments.
>

It is a really nice program. I tried it on the clog and catrig functions. I was able to get the catrig functions to completely comply with your program. See the diff at the end of http://www.freebsd.org/cgi/query-pr.cgi?pr=bin/170206

The clog program was already working, because Bruce had fixed it up.

I forgot - does it check the fenv settings as well? It would be great if it does.

From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:04:44 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 9A5311065672 for ; Sun, 12 Aug 2012 23:04:44 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id EF46E8FC18 for ; Sun, 12 Aug 2012 23:04:43 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN4hi2075673 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:04:44 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN4bLv021267 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:04:37 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CN4bQl021266 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:04:37 +1000 (EST)
(envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:04:37 +1000 Resent-Message-ID: <20120812230437.GQ20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Date: Sun, 29 Jul 2012 09:13:00 +1000 From: Peter Jeremy Mail-Followup-To: freebsd-numerics@freebsd.org To: Stephen Montgomery-Smith Message-ID: <20120728231300.GA20741@server.rulingia.com> References: <20120717232740.GA95026@troutmask.apl.washington.edu> <20120718001337.GA87817@server.rulingia.com> <20120718123627.D1575@besplex.bde.org> <20120722121219.GC73662@server.rulingia.com> <500DAD41.5030104@missouri.edu> <20120724113214.G934@besplex.bde.org> <501204AD.30605@missouri.edu> <20120727032611.GB25690@server.rulingia.com> <20120728125824.GA26553@server.rulingia.com> <501460BB.30806@missouri.edu> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="RnlQjJ0d97Da+TV1" Content-Disposition: inline In-Reply-To: <501460BB.30806@missouri.edu> X-PGP-Key: http://www.rulingia.com/keys/peter.pgp User-Agent: Mutt/1.5.21 (2010-09-15) Cc: Diane Bruce , Bruce Evans , John Baldwin , David Chisnall , Bruce Evans , Steve Kargl , David Schultz , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 12 Aug 2012 23:04:44 -0000

--RnlQjJ0d97Da+TV1
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On 2012-Jul-28 16:59:23 -0500, Stephen Montgomery-Smith wrote:
>On 07/28/2012 07:58 AM, Peter Jeremy wrote:
>> Whilst I was debugging the code, I found the following elisp useful
>> for post-processing the output:
>>
>> (progn (downcase-region (point-min) (point-max))
>> (repl-regexp "^ [ ]c" "..c")
>> (repl-regexp "^ c" ".c")
...

Oops, I forgot that repl-regexp is one of my private functions:

(defun repl-regexp (from to)
  "Replace every occurrence of regexp FROM with TO in current buffer."
  (goto-char (point-min))
  (while (search-forward-regexp from nil t)
    (replace-match to nil nil)))

Note that it's safe to execute that progn in the buffer containing ctest
output.

>It is a really nice program.

Thanks.

>I forgot - does it check the fenv settings as well? It would be great
>if it does.

Not yet. That's my next task. I've also been thinking about how to do
better than cpow(x,y) = cexp(y*clog(x)).
--=20 Peter Jeremy --RnlQjJ0d97Da+TV1 Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (FreeBSD) iEYEARECAAYFAlAUcfwACgkQ/opHv/APuIfZAQCgmtyP3h43SfeM4pXZdDZy0fmH ytkAn2GBeNDh403HQ2ggOe6IQFkhM23E =hsx+ -----END PGP SIGNATURE----- --RnlQjJ0d97Da+TV1-- From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:04:56 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 04136106564A for ; Sun, 12 Aug 2012 23:04:56 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id 921D08FC1C for ; Sun, 12 Aug 2012 23:04:55 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN4tPZ075679 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:04:55 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN4nha021278 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:04:49 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CN4nW3021277 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:04:49 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:04:49 +1000 Resent-Message-ID: <20120812230449.GR20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6T1HF2i074119 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA 
bits=256 verify=OK) for ; Sun, 29 Jul 2012 11:17:15 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q6T1HCPf072918 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Sun, 29 Jul 2012 11:17:14 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q6T1GoY9099800; Sat, 28 Jul 2012 20:16:51 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <50148F02.4020104@missouri.edu> Date: Sat, 28 Jul 2012 20:16:50 -0500 From: Stephen Montgomery-Smith Mail-Followup-To: freebsd-numerics@freebsd.org User-Agent: Mozilla/5.0 (X11; Linux i686; rv:14.0) Gecko/20120714 Thunderbird/14.0 MIME-Version: 1.0 To: Peter Jeremy References: <20120717232740.GA95026@troutmask.apl.washington.edu> <20120718001337.GA87817@server.rulingia.com> <20120718123627.D1575@besplex.bde.org> <20120722121219.GC73662@server.rulingia.com> <500DAD41.5030104@missouri.edu> <20120724113214.G934@besplex.bde.org> <501204AD.30605@missouri.edu> <20120727032611.GB25690@server.rulingia.com> <20120728125824.GA26553@server.rulingia.com> <501460BB.30806@missouri.edu> <20120728231300.GA20741@server.rulingia.com> In-Reply-To: <20120728231300.GA20741@server.rulingia.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: Diane Bruce , Bruce Evans , John Baldwin , David Chisnall , Bruce Evans , Steve Kargl , David Schultz , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 12 Aug 2012 23:04:56 -0000

On 07/28/2012 06:13 PM, Peter Jeremy wrote:
> On 2012-Jul-28 16:59:23 -0500, Stephen Montgomery-Smith wrote:
>> On 07/28/2012 07:58 AM, Peter Jeremy wrote:
>>> Whilst I was debugging the code, I found the following elisp useful
>>> for post-processing the output:
>>>
>>> (progn (downcase-region (point-min) (point-max))
>>> (repl-regexp "^ [ ]c" "..c")
>>> (repl-regexp "^ c" ".c")
> ...
>
> Oops, I forgot that repl-regexp is one of my private functions:
>
> (defun repl-regexp (from to)
>   "Replace every occurrence of regexp FROM with TO in current buffer."
>   (goto-char (point-min))
>   (while (search-forward-regexp from nil t)
>     (replace-match to nil nil)))
>
> Note that it's safe to execute that progn in the buffer containing ctest
> output.
>
>> It is a really nice program.
>
> Thanks.
>
>> I forgot - does it check the fenv settings as well? It would be great
>> if it does.
>
> Not yet. That's my next task. I've also been thinking about how to do
> better than cpow(x,y) = cexp(y*clog(x)).
>

One thing your program doesn't check is cases like:

real part of casinh(-0+I*x) is -0
imaginary part of casinh(x-I*0) is -0

etc, where x is finite, non-zero. (This follows from casinh being odd and conjugate invariant.)
From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:05:19 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 49153106566C for ; Sun, 12 Aug 2012 23:05:19 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id D60208FC14 for ; Sun, 12 Aug 2012 23:05:18 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN5IOu075689 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:05:18 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN5Bh5021305 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:05:12 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CN5BCo021304 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:05:11 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:05:11 +1000 Resent-Message-ID: <20120812230511.GS20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6H4Oc0F070162 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Tue, 17 Jul 2012 14:24:39 +1000 (EST) (envelope-from sgk@troutmask.apl.washington.edu) Received: from troutmask.apl.washington.edu (troutmask.apl.washington.edu [128.95.76.21]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q6H4OaYs065923 (version=TLSv1/SSLv3 
cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Tue, 17 Jul 2012 14:24:37 +1000 (EST) (envelope-from sgk@troutmask.apl.washington.edu) Received: from troutmask.apl.washington.edu (localhost.apl.washington.edu [127.0.0.1]) by troutmask.apl.washington.edu (8.14.5/8.14.5) with ESMTP id q6H4OYOn087039; Mon, 16 Jul 2012 21:24:34 -0700 (PDT) (envelope-from sgk@troutmask.apl.washington.edu) Received: (from sgk@localhost) by troutmask.apl.washington.edu (8.14.5/8.14.5/Submit) id q6H4OYe9087038; Mon, 16 Jul 2012 21:24:34 -0700 (PDT) (envelope-from sgk) From: Steve Kargl Mail-Followup-To: freebsd-numerics@freebsd.org To: Warner Losh Message-ID: <20120717042434.GA87001@troutmask.apl.washington.edu> References: <201207130818.38535.jhb@freebsd.org> <9EB2DA4F-19D7-4BA5-8811-D9451CB1D907@theravensnest.org> <20120713155805.GC81965@zim.MIT.EDU> <20120714120432.GA70706@server.rulingia.com> <20120717084457.U3890@besplex.bde.org> <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717040118.GA86840@troutmask.apl.washington.edu> <6F750F84-34FF-4961-A2EA-F3E67A6872AE@bsdimp.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <6F750F84-34FF-4961-A2EA-F3E67A6872AE@bsdimp.com> User-Agent: Mutt/1.4.2.3i Cc: Diane Bruce , John Baldwin , David Chisnall , Stephen Montgomery-Smith , Bruce Evans , Bruce Evans , David Schultz , Peter Jeremy Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:05:19 -0000 X-Original-Date: Mon, 16 Jul 2012 21:24:34 -0700 X-List-Received-Date: Sun, 12 Aug 2012 23:05:19 -0000

On Mon, Jul 16, 2012 at 10:12:21PM -0600, Warner Losh wrote:
>
> On Jul 16, 2012, at 10:01 PM, Steve Kargl wrote:
>
> > On Mon, Jul 16, 2012 at 10:40:25PM -0500, Stephen Montgomery-Smith wrote:
> >>
> >> I came up with pseudo code that looks a bit like this.
> >>
> >> complex casinh(complex z) {
> >>     double x = z.re, y = z.im;
> >>
> >>     if (y==0)
> >>         return asinh(x);
> >>     if (x==0) {
> >>         if (fabs(y)<=1) return I*asin(y);
> >>         else return signum(y)* (
> >>             log(fabs(y)+sqrt(y*y-1))
> >>             + I*PI/2);
> >
> > Stop. Please see msun/src/math_private.h. You cannot
> > use I in any expression. gcc in base and clang do not
> > do the arithmetic correctly. See msun/src/s_ccosh.c
> > for how one might approach writing these functions.
> > Also, consult n1256.pdf for x,y = +-0, +-inf, nan.
> > There are specific requirements that must be met.
>
> Yes. Pseudo code is OK for following the flow, but look at the
> exp code for why that's not entirely sufficient. You have to be
> extremely fussy about all kinds of things. Then again, exp is a
> lot more important to get right than the complex trig functions...
>
> The pseudo code is a good place to start, but it is just the barest
> start in the integration process...

I can't tell whether you agree with my urging of caution or
whether you're being sarcastic. The pseudo-code as written
simply does not apply once one looks at n1256.pdf.
-- Steve From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:05:57 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 41FBD106566B for ; Sun, 12 Aug 2012 23:05:57 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id CF8CB8FC15 for ; Sun, 12 Aug 2012 23:05:56 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN5uOB075694 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:05:56 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN5oqF021321 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:05:50 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CN5oBK021320 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:05:50 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:05:50 +1000 Resent-Message-ID: <20120812230550.GT20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6H4vAX0070417 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Tue, 17 Jul 2012 14:57:11 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q6H4v8SO065998 
(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Tue, 17 Jul 2012 14:57:10 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q6H4uj4S025077; Mon, 16 Jul 2012 23:56:46 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <5004F08E.4040501@missouri.edu> From: Stephen Montgomery-Smith Mail-Followup-To: freebsd-numerics@freebsd.org User-Agent: Mozilla/5.0 (X11; Linux i686; rv:13.0) Gecko/20120615 Thunderbird/13.0.1 MIME-Version: 1.0 To: Warner Losh References: <20120711223247.GA9964@troutmask.apl.washington.edu> <20120713114100.GB83006@server.rulingia.com> <201207130818.38535.jhb@freebsd.org> <9EB2DA4F-19D7-4BA5-8811-D9451CB1D907@theravensnest.org> <20120713155805.GC81965@zim.MIT.EDU> <20120714120432.GA70706@server.rulingia.com> <20120717084457.U3890@besplex.bde.org> <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717040118.GA86840@troutmask.apl.washington.edu> <6F750F84-34FF-4961-A2EA-F3E67A6872AE@bsdimp.com> In-Reply-To: <6F750F84-34FF-4961-A2EA-F3E67A6872AE@bsdimp.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: Diane Bruce , Steve Kargl , John Baldwin , David Chisnall , Bruce Evans , Bruce Evans , David Schultz , Peter Jeremy Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:05:57 -0000 X-Original-Date: Mon, 16 Jul 2012 23:56:46 -0500 X-List-Received-Date: Sun, 12 Aug 2012 23:05:57 -0000

On 07/16/2012 11:12 PM, Warner Losh wrote:
>
> On Jul 16, 2012, at 10:01 PM, Steve Kargl wrote:
>
>> On Mon, Jul 16, 2012 at 10:40:25PM -0500, Stephen Montgomery-Smith wrote:
>>>
>>> I came up with pseudo code that looks a bit like this.
>>>
>>> complex casinh(complex z) {
>>>     double x = z.re, y = z.im;
>>>
>>>     if (y==0)
>>>         return asinh(x);
>>>     if (x==0) {
>>>         if (fabs(y)<=1) return I*asin(y);
>>>         else return signum(y)* (
>>>             log(fabs(y)+sqrt(y*y-1))
>>>             + I*PI/2);
>>
>> Stop. Please see msun/src/math_private.h. You cannot
>> use I in any expression. gcc in base and clang do not
>> do the arithmetic correctly. See msun/src/s_ccosh.c
>> for how one might approach writing these functions.
>> Also, consult n1256.pdf for x,y = +-0, +-inf, nan.
>> There are specific requirements that must be met.
>
> Yes. Pseudo code is OK for following the flow, but look at the exp code for why that's not entirely sufficient. You have to be extremely fussy about all kinds of things. Then again, exp is a lot more important to get right than the complex trig functions...
>
> The pseudo code is a good place to start, but it is just the barest start in the integration process...
>
> Warner

OK, I'll have a go at making it proper code.

But before I can do that, I notice that we don't have a clog function. The pseudo code is obvious:

    return log(hypot(z.re,z.im)) + I*atan2(z.im,z.re)

so this will give me good practice at getting the difficult stuff correct.

Give me a while, because I can see this isn't going to be totally straightforward. I'm going to have questions as I start going through this.
From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:06:05 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 6326B106564A for ; Sun, 12 Aug 2012 23:06:05 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id AFD0D8FC0C for ; Sun, 12 Aug 2012 23:06:04 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN64gE075697 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:06:04 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN5wCK021327 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:05:58 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CN5wwU021326 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:05:58 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:05:58 +1000 Resent-Message-ID: <20120812230558.GU20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6H8sQu9072618 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Tue, 17 Jul 2012 18:54:26 +1000 (EST) (envelope-from theraven@freebsd.org) Received: from theravensnest.org (theraven.freebsd.your.org [216.14.102.27]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q6H8sNIr066628 (version=TLSv1/SSLv3 
cipher=DHE-RSA-AES256-SHA bits=256 verify=FAIL) for ; Tue, 17 Jul 2012 18:54:25 +1000 (EST) (envelope-from theraven@freebsd.org) Received: from c120.sec.cl.cam.ac.uk (c120.sec.cl.cam.ac.uk [128.232.18.120]) (authenticated bits=0) by theravensnest.org (8.14.5/8.14.5) with ESMTP id q6H8rvEn090352 (version=TLSv1/SSLv3 cipher=DHE-DSS-AES128-SHA bits=128 verify=NO); Tue, 17 Jul 2012 08:53:58 GMT (envelope-from theraven@freebsd.org) Mime-Version: 1.0 (Apple Message framework v1278) Content-Type: text/plain; charset=us-ascii From: David Chisnall Mail-Followup-To: freebsd-numerics@freebsd.org In-Reply-To: <20120717040118.GA86840@troutmask.apl.washington.edu> Message-Id: <2026C2D3-E975-4DFD-9D50-2B7D9E894360@freebsd.org> References: <20120711223247.GA9964@troutmask.apl.washington.edu> <20120713114100.GB83006@server.rulingia.com> <201207130818.38535.jhb@freebsd.org> <9EB2DA4F-19D7-4BA5-8811-D9451CB1D907@theravensnest.org> <20120713155805.GC81965@zim.MIT.EDU> <20120714120432.GA70706@server.rulingia.com> <20120717084457.U3890@besplex.bde.org> <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717040118.GA86840@troutmask.apl.washington.edu> To: Steve Kargl X-Mailer: Apple Mail (2.1278) Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by server.rulingia.com id q6H8sQu9072618 Cc: Diane Bruce , John Baldwin , Stephen Montgomery-Smith , Bruce Evans , Bruce Evans , David Schultz , Peter Jeremy , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:06:05 -0000 X-Original-Date: Tue, 17 Jul 2012 09:53:59 +0100 X-List-Received-Date: Sun, 12 Aug 2012 23:06:05 -0000 On 17 Jul 2012, at 05:01, Steve Kargl wrote: > gcc in base and clang do not > do the arithmetic correctly Please can you point me at the relevant clang PR, and I'll look at fixing this. David From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:06:09 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 2CD8E106564A for ; Sun, 12 Aug 2012 23:06:09 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id 799CC8FC17 for ; Sun, 12 Aug 2012 23:06:08 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN68FU075698 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:06:08 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN62j2021333 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:06:02 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CN62h0021332 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:06:02 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:06:02 +1000 Resent-Message-ID: <20120812230602.GV20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) 
by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6HDxfZs075637 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Tue, 17 Jul 2012 23:59:42 +1000 (EST) (envelope-from sgk@troutmask.apl.washington.edu) Received: from troutmask.apl.washington.edu (troutmask.apl.washington.edu [128.95.76.21]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q6HDxdBW067442 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Tue, 17 Jul 2012 23:59:41 +1000 (EST) (envelope-from sgk@troutmask.apl.washington.edu) Received: from troutmask.apl.washington.edu (localhost.apl.washington.edu [127.0.0.1]) by troutmask.apl.washington.edu (8.14.5/8.14.5) with ESMTP id q6HDxcdv089352; Tue, 17 Jul 2012 06:59:38 -0700 (PDT) (envelope-from sgk@troutmask.apl.washington.edu) Received: (from sgk@localhost) by troutmask.apl.washington.edu (8.14.5/8.14.5/Submit) id q6HDxbEf089351; Tue, 17 Jul 2012 06:59:37 -0700 (PDT) (envelope-from sgk) From: Steve Kargl Mail-Followup-To: freebsd-numerics@freebsd.org To: David Chisnall Message-ID: <20120717135937.GA89332@troutmask.apl.washington.edu> References: <201207130818.38535.jhb@freebsd.org> <9EB2DA4F-19D7-4BA5-8811-D9451CB1D907@theravensnest.org> <20120713155805.GC81965@zim.MIT.EDU> <20120714120432.GA70706@server.rulingia.com> <20120717084457.U3890@besplex.bde.org> <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717040118.GA86840@troutmask.apl.washington.edu> <2026C2D3-E975-4DFD-9D50-2B7D9E894360@freebsd.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <2026C2D3-E975-4DFD-9D50-2B7D9E894360@freebsd.org> User-Agent: Mutt/1.4.2.3i Cc: Diane Bruce , John Baldwin , Stephen Montgomery-Smith , Bruce Evans , Bruce Evans , David Schultz , Peter Jeremy , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high 
quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:06:09 -0000 X-Original-Date: Tue, 17 Jul 2012 06:59:37 -0700 X-List-Received-Date: Sun, 12 Aug 2012 23:06:09 -0000 On Tue, Jul 17, 2012 at 09:53:59AM +0100, David Chisnall wrote: > On 17 Jul 2012, at 05:01, Steve Kargl wrote: > > > gcc in base and clang do not > > do the arithmetic correctly > > Please can you point me at the relevant clang PR, and I'll look at fixing this. > > David http://llvm.org/bugs/show_bug.cgi?id=8532 -- Steve From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:06:31 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id C7D731065675 for ; Sun, 12 Aug 2012 23:06:31 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id 4C5A88FC0C for ; Sun, 12 Aug 2012 23:06:31 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN6V3o075713 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:06:31 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN6ON2021362 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:06:24 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CN6O5D021361 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:06:24 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:06:24 +1000 
Resent-Message-ID: <20120812230624.GY20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6HD2Otk075222 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Tue, 17 Jul 2012 23:02:25 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q6HD2MLE067318 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Tue, 17 Jul 2012 23:02:24 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q6HD1xUl056214; Tue, 17 Jul 2012 08:02:00 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <50056247.2000800@missouri.edu> From: Stephen Montgomery-Smith Mail-Followup-To: freebsd-numerics@freebsd.org User-Agent: Mozilla/5.0 (X11; Linux i686; rv:13.0) Gecko/20120615 Thunderbird/13.0.1 MIME-Version: 1.0 To: Bruce Evans References: <20120529045612.GB4445@server.rulingia.com> <20120711223247.GA9964@troutmask.apl.washington.edu> <20120713114100.GB83006@server.rulingia.com> <201207130818.38535.jhb@freebsd.org> <9EB2DA4F-19D7-4BA5-8811-D9451CB1D907@theravensnest.org> <20120713155805.GC81965@zim.MIT.EDU> <20120714120432.GA70706@server.rulingia.com> <20120717084457.U3890@besplex.bde.org> <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717200931.U6624@besplex.bde.org> In-Reply-To: <20120717200931.U6624@besplex.bde.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: Diane Bruce , John Baldwin , David Chisnall , Bruce Evans , Steve Kargl , David Schultz , Peter Jeremy , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 
X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:06:31 -0000 X-Original-Date: Tue, 17 Jul 2012 08:01:59 -0500 X-List-Received-Date: Sun, 12 Aug 2012 23:06:31 -0000

On 07/17/2012 06:13 AM, Bruce Evans wrote:
> On Mon, 16 Jul 2012, Stephen Montgomery-Smith wrote:
>
>> On 07/16/2012 06:37 PM, Stephen Montgomery-Smith wrote:
>>> ...
>>> The difficulty is that catanh, cacosh, and cacosh do not have unique
>>> answers unless one makes some kind of accepted convention as to what
>>> they should be.
>>>
>>> I did find some definitions at
>>> http://publib.boulder.ibm.com/infocenter/zos/v1r11/index.jsp?topic=/com.ibm.zos.r11.bpxbd00/catan.htm.
>>>
>>> I'm going to guess that there is a typo in that the range of the
>>> imaginary part should be [0,i pi], because the usual convention is that
>>> acos of a real number is a real number between 0 and pi.
>
> I think C99 specifies the branch cuts and boundary behaviour completely.
>
>>> We might get lucky, and find that the definitions of csqrt and clog in
>>> the C99 standard are already set up so that the naive formulas for
>>> cacosh, etc, just work. But whether they do or whether they don't, I
>>> think I can do it. (As a first guess, I think that catanh and casinh
>>> will work "out of the box" but cacosh is going to take a bit more work.)
>
> See below what happened for naive formulas for ccosh.
>
>>> Also casin, catan and cacos are essentially the same as casinh, etc,
>>> using formulas like sin(ix) = i sinh(x). (The hardest part is to avoid
>>> making a sign error.)
>>
>> I came up with pseudo code that looks a bit like this.
>>
>> complex casinh(complex z) {
>>     double x = z.re, y = z.im;
>>
>>     if (y==0)
>>         return asinh(x);
>>     if (x==0) {
>>         if (fabs(y)<=1) return I*asin(y);
>>         else return signum(y)* (
>>             log(fabs(y)+sqrt(y*y-1))
>>             + I*PI/2);
>>     }
>>     if (x>0)
>>         return clog(z+csqrt(z*z+1));
>>     else
>>         return -clog(-z+csqrt(z*z+1));
>> }
>
> I translated this to pari. There was a sign error for the log() term
> (assuming that pari asinh() has the same branch cuts as C99):
>
> % \p 100
> %
> % PI = Pi;
> % clog(z) = log(z);
> % csqrt(z) = sqrt(z);
> % fabs(x) = abs(x);
> % signum(x) = sign(x);
> %
> % casinh(z) =
> % {
> %     local(x, y);
> %
> %     x = real(z);
> %     y = imag(z);
> %     if (y == 0,
> %         return (asinh(x));
> %     );
> %     if (x == 0,
> %         if (fabs(y) <= 1,
> %             return (I * asin(y));
> %         ,
> %             return (signum(y) *
> %                 (-log(fabs(y) + sqrt(y * y - 1)) + I * PI / 2));
> %         );
> %     );
> %     if (x > 0,
> %         return (clog(z + csqrt(z * z + 1)));
> %     ,
> %         return (-clog(-z + csqrt(z * z + 1)));
> %     );
> % }
> %
> % {
> %     forstep (x = -10, 10, 0.1,
> %         forstep (y = -10, 10, 0.1,
> %             z = x + I*y;
> %             r = casinh(z) - asinh(z);
> %             if (abs(r) > 1e-30,
> %                 print("z = " z);
> %                 print("casinh(z) = " casinh(z));
> %                 print(" asinh(z) = " asinh(z));
> %                 print("diff = " r);
> %             );
> %         );
> %     );
> % }
>
> (No differences found after fixing the sign.)
>
> Pari of course does all the calculations almost perfectly accurately (I
> told it to provide 100 decimal digits). Most multi-precision calculators
> can do the same. So pari can be used to develop the logic, but it is hard
> to use it to develop accurate routines for limited precision.
>
> The most obvious immediate difficulty in translating the above into C is
> that y*y and z*z may overflow when the result shouldn't. hypot() and
> cabs() handle similar problems, with remarkably large complications.
> C99 with IEEE754 FP actually handles some aspects of overflow better
> than pari. E.g., exp(10^9) gives overflow in pari, with no way of handling
> it AFAIK, while in C99 + IEEE754 it gives +Inf with FE_OVERFLOW.
>
> Bruce
>
>

Excellent. I think I will use pari to write the test code to check the ULP of the result. And I'll look into using hypot, or its logic, to compute sqrt(y*y-1).

From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:06:37 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id AF41D106566B for ; Sun, 12 Aug 2012 23:06:37 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id 1ABD58FC08 for ; Sun, 12 Aug 2012 23:06:36 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN6aat075716 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:06:37 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN6U4H021369 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:06:30 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CN6Ugc021368 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:06:30 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:06:30 +1000 Resent-Message-ID: <20120812230630.GZ20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6IF8Ebf098382 (version=TLSv1/SSLv3
cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Thu, 19 Jul 2012 01:08:14 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q6IF8BGY073152 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Thu, 19 Jul 2012 01:08:13 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q6IF7fou016669; Wed, 18 Jul 2012 10:07:42 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <5006D13D.2080702@missouri.edu> From: Stephen Montgomery-Smith Mail-Followup-To: freebsd-numerics@freebsd.org User-Agent: Mozilla/5.0 (X11; Linux i686; rv:13.0) Gecko/20120615 Thunderbird/13.0.1 MIME-Version: 1.0 To: Bruce Evans References: <20120529045612.GB4445@server.rulingia.com> <20120711223247.GA9964@troutmask.apl.washington.edu> <20120713114100.GB83006@server.rulingia.com> <201207130818.38535.jhb@freebsd.org> <9EB2DA4F-19D7-4BA5-8811-D9451CB1D907@theravensnest.org> <20120713155805.GC81965@zim.MIT.EDU> <20120714120432.GA70706@server.rulingia.com> <20120717084457.U3890@besplex.bde.org> <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717200931.U6624@besplex.bde.org> In-Reply-To: <20120717200931.U6624@besplex.bde.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: Diane Bruce , John Baldwin , David Chisnall , Bruce Evans , Steve Kargl , David Schultz , Peter Jeremy , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:06:37 -0000 X-Original-Date: Wed, 18 Jul 2012 10:07:41 -0500 X-List-Received-Date: Sun, 12 Aug 2012 23:06:37 -0000

I went on a long road trip yesterday, so I didn't get any code written, but I did have a lot of thoughts about clog and casinh.

First, the naive formula (here z=x+I*y)

    clog(z) = cpack(log(hypot(x,y)),atan2(y,x))

is going to work in a lot more edge cases than one might expect. This is because hypot and atan2, especially atan2, already do a rather good job of getting the edge cases right. I am thinking in particular of when x or y is 0 or -0, or one of them is infinity or -infinity. So writing this code will be quite a bit easier than I was expecting. It looks like I will just have to worry about when both x and y are infinite, or when one or both of them are nan.

Next, concerning casinh:

On 07/17/2012 06:13 AM, Bruce Evans wrote:

> I translated this to pari. There was a sign error for the log() term
> (assuming that pari asinh() has the same branch cuts as C99):

I couldn't spot which sign error Bruce had changed. However I expect it has something to do with what happens when x=0 and fabs(y)>1. This is the reasonable choice of the branch cut. What I think the value should be is

    casinh(z) = cpack( signum(x)*log(fabs(y)+sqrt(y^2-1)), signum(y)*PI/2)

where the value of signum(x) depends on whether x is 0 or -0.

(I might add that I checked against the Mathematica ArcSinh function, and this does NOT follow the above rule. But the document Steve pointed me to says that

    casinh(conj(z)) = conj(casinh(z))

which means that we cannot follow the Mathematica conventions.)

> The most obvious immediate difficulty in translating the above into C is
> that y*y and z*z may overflow when the result shouldn't.

This will be a lot easier than I originally expected. When we are in conditions where overflow might occur, we can simply make the approximations

    sqrt(y*y-1) = y
    csqrt(z*z+1) = signum(x)*z

because in floating point arithmetic, these will not be approximations, but exactly true. And I am thinking that the tests I will use for when to use these approximations will be (y==y+1) and (z==z+1) respectively. (I would use (z*z==z*z+1) but that test has the overflow problem.)

Finally, I want to tell you guys that the reason I used the code:

    if (x>0)
        return clog(z+csqrt(z*z+1));
    else
        return -clog(-z+csqrt(z*z+1));

is this. Both formulas are mathematically exactly the same. This is true even if one takes into account the branch cuts for csqrt and clog. The difference between the two formulas is in the numerical errors. For example, if x<0 and z has very large magnitude, then csqrt(z*z+1) will be very close to -z. In fact in floating point arithmetic, if the magnitude of z is sufficiently large, they will be the same.

However, as I am typing this, I realize that the code should really be

    if (z!=z+1) {
        w = z*z+1;
        if (signum(creal(w))==1)
            return clog(z+csqrt(w));
        else
            return -clog(-z+csqrt(w));
    } else /* if (z==z+1) */ {
        if (x>0)
            return clog(2*z);
        else
            return -clog(-2*z);
    }

where the signum function is defined so that signum(0)==1 and signum(-0)==-1.

Next: cacosh and cacos. I had presumed the formula

    cacosh(z) = I*cacos(z)

which can be true depending on how one defines the branch cuts. But this formula won't satisfy the C99 standard, which mandates

    cacosh(conj(z)) = conj(cacosh(z))

That is why in earlier posts I thought there was a mistake in the online documentation, and the range of outputs of cacosh should satisfy imaginary part in [0,pi] rather than [-pi,pi].

Anyway, I am just posting this update so that you see I am thinking about it. I might get the project done in days, but it might also be months. It depends upon how much other stuff I have going on.
From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:07:34 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 154CC106566C for ; Sun, 12 Aug 2012 23:07:34 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id 9CAF48FC16 for ; Sun, 12 Aug 2012 23:07:33 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN7XYe075726 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:07:33 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN7Rkk021400 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:07:27 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CN7Rld021399 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:07:27 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:07:27 +1000 Resent-Message-ID: <20120812230727.GA20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6IKuUBN005285 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Thu, 19 Jul 2012 06:56:31 +1000 (EST) (envelope-from sgk@troutmask.apl.washington.edu) Received: from troutmask.apl.washington.edu (troutmask.apl.washington.edu [128.95.76.21]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q6IKuSf4078098 
(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Thu, 19 Jul 2012 06:56:29 +1000 (EST) (envelope-from sgk@troutmask.apl.washington.edu) Received: from troutmask.apl.washington.edu (localhost.apl.washington.edu [127.0.0.1]) by troutmask.apl.washington.edu (8.14.5/8.14.5) with ESMTP id q6IKuQ5O000438; Wed, 18 Jul 2012 13:56:26 -0700 (PDT) (envelope-from sgk@troutmask.apl.washington.edu) Received: (from sgk@localhost) by troutmask.apl.washington.edu (8.14.5/8.14.5/Submit) id q6IKuPFC000437; Wed, 18 Jul 2012 13:56:25 -0700 (PDT) (envelope-from sgk) From: Steve Kargl Mail-Followup-To: freebsd-numerics@freebsd.org To: Stephen Montgomery-Smith Message-ID: <20120718205625.GA409@troutmask.apl.washington.edu> References: <201207130818.38535.jhb@freebsd.org> <9EB2DA4F-19D7-4BA5-8811-D9451CB1D907@theravensnest.org> <20120713155805.GC81965@zim.MIT.EDU> <20120714120432.GA70706@server.rulingia.com> <20120717084457.U3890@besplex.bde.org> <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717200931.U6624@besplex.bde.org> <5006D13D.2080702@missouri.edu> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <5006D13D.2080702@missouri.edu> User-Agent: Mutt/1.4.2.3i Cc: Diane Bruce , John Baldwin , David Chisnall , Bruce Evans , Bruce Evans , David Schultz , Peter Jeremy , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:07:34 -0000 X-Original-Date: Wed, 18 Jul 2012 13:56:25 -0700 X-List-Received-Date: Sun, 12 Aug 2012 23:07:34 -0000 On Wed, Jul 18, 2012 at 10:07:41AM -0500, Stephen Montgomery-Smith wrote: > > >The most obvious immediate difficulty in translating the above into C is > >that y*y and z*z may overflow when the result shouldn't. > > This will be a lot easier than I originally expected. When we are in > conditions when overflow might occur, we can simply make the approximations > sqrt(y*y-1) = y > csqrt(z*z+1) = signum(x)*z > because in floating point arithmetic, these will not be approximations, > but true exactly. And I am thinking that the test I will use for when > to use these approximations will be (y==y+1) and (z==z+1) respectively. > (I would use (z*z==z*z+1) but that test has the overflow problem.) I could be mistaken, but I believe that you need to raise the inexact flag with these approximations because in fact you are doing floating point math. 
-- Steve From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:07:48 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 4AFB31065674 for ; Sun, 12 Aug 2012 23:07:48 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id D614A8FC18 for ; Sun, 12 Aug 2012 23:07:47 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN7laJ075730 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:07:47 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN7fI8021406 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:07:41 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CN7fWI021405 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:07:41 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:07:41 +1000 Resent-Message-ID: <20120812230741.GB20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6IL9U0g005424 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Thu, 19 Jul 2012 07:09:31 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q6IL9R7T078135 
(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Thu, 19 Jul 2012 07:09:30 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q6IL96ot040144; Wed, 18 Jul 2012 16:09:06 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <500725F2.7060603@missouri.edu> From: Stephen Montgomery-Smith Mail-Followup-To: freebsd-numerics@freebsd.org User-Agent: Mozilla/5.0 (X11; Linux i686; rv:13.0) Gecko/20120615 Thunderbird/13.0.1 MIME-Version: 1.0 To: Steve Kargl References: <201207130818.38535.jhb@freebsd.org> <9EB2DA4F-19D7-4BA5-8811-D9451CB1D907@theravensnest.org> <20120713155805.GC81965@zim.MIT.EDU> <20120714120432.GA70706@server.rulingia.com> <20120717084457.U3890@besplex.bde.org> <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717200931.U6624@besplex.bde.org> <5006D13D.2080702@missouri.edu> <20120718205625.GA409@troutmask.apl.washington.edu> In-Reply-To: <20120718205625.GA409@troutmask.apl.washington.edu> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: Diane Bruce , John Baldwin , David Chisnall , Bruce Evans , Bruce Evans , David Schultz , Peter Jeremy , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:07:48 -0000 X-Original-Date: Wed, 18 Jul 2012 16:09:06 -0500 X-List-Received-Date: Sun, 12 Aug 2012 23:07:48 -0000 On 07/18/2012 03:56 PM, Steve Kargl wrote: > On Wed, Jul 18, 2012 at 10:07:41AM -0500, Stephen Montgomery-Smith wrote: >> >>> The most obvious immediate difficulty in translating the above into C is >>> that y*y and z*z may overflow when the result shouldn't. >> >> This will be a lot easier than I originally expected. When we are in >> conditions when overflow might occur, we can simply make the approximations >> sqrt(y*y-1) = y >> csqrt(z*z+1) = signum(x)*z >> because in floating point arithmetic, these will not be approximations, >> but true exactly. And I am thinking that the test I will use for when >> to use these approximations will be (y==y+1) and (z==z+1) respectively. >> (I would use (z*z==z*z+1) but that test has the overflow problem.) > > I could be mistaken, but I believe that you need to raise the > inexact flag with these approximations because in fact you > are doing floating point math. > Thanks for this observation. I am looking through the C99 standard, trying to understand the inexact flag. But I am struggling to interpret it. Am I to understand that the inexact flag should be set anytime a floating point operation produces an answer that is not guaranteed exact? For example, should 1.0/3.0 and sqrt(2.0) raise the inexact flag? 
From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:08:19 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 6CFB9106566B for ; Sun, 12 Aug 2012 23:08:19 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id 131CD8FC0C for ; Sun, 12 Aug 2012 23:08:18 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN8IJj075736 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:08:18 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN8Cbm021425 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:08:12 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CN8CfN021424 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:08:12 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:08:12 +1000 Resent-Message-ID: <20120812230812.GC20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org From: Peter Jeremy Mail-Followup-To: freebsd-numerics@freebsd.org To: Stephen Montgomery-Smith Message-ID: <20120718224222.GA6022@server.rulingia.com> References: <20120713155805.GC81965@zim.MIT.EDU> <20120714120432.GA70706@server.rulingia.com> <20120717084457.U3890@besplex.bde.org> <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717200931.U6624@besplex.bde.org> <5006D13D.2080702@missouri.edu> <20120718205625.GA409@troutmask.apl.washington.edu> 
<500725F2.7060603@missouri.edu> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="5vNYLRcllDrimb99" Content-Disposition: inline In-Reply-To: <500725F2.7060603@missouri.edu> X-PGP-Key: http://www.rulingia.com/keys/peter.pgp User-Agent: Mutt/1.5.21 (2010-09-15) Cc: Diane Bruce , Steve Kargl , John Baldwin , David Chisnall , Bruce Evans , Bruce Evans , David Schultz , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:08:19 -0000 X-Original-Date: Thu, 19 Jul 2012 08:42:22 +1000 X-List-Received-Date: Sun, 12 Aug 2012 23:08:19 -0000 --5vNYLRcllDrimb99 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On 2012-Jul-18 10:07:41 -0500, Stephen Montgomery-Smith wrote: >I went on a long road trip yesterday, so I didn't get any code written, >but I did have a lot of thoughts about clog and casinh. Can I suggest you have a read through "Implementing the Complex Arcsine and Arccosine Functions Using Exception Handling" by T. E. Hull, Thomas F. Fairgrieve and Ping Tak Peter Tang, ACM Transactions on Mathematical Software, Vol. 23, No. 3, September 1997. Based on a quick skim, it includes fairly detailed pseudocode, together with an error analysis. On 2012-Jul-18 16:09:06 -0500, Stephen Montgomery-Smith wrote: >Am I to understand that the inexact flag should be set anytime a >floating point operation produces an answer that is not guaranteed >exact? My understanding is, yes. For the transcendental functions, that means the inexact flag should almost always be raised and the problem becomes when not to raise it.
Eg sin(0) == 0 and presumably doesn't set the inexact flag. > For example, should 1.0/3.0 and sqrt(2.0) raise the inexact flag? Yes and yes. I notice our sqrtl() actually tests the inexact flag of an intermediate calculation to determine the correct rounding for the result. I've also found that Abramowitz and Stegun "Handbook of Mathematical Functions", 10th printing, is available online at http://people.maths.ox.ac.uk/~macdonald/aands/index.html and various mirrors. I'm still looking for a copy of Cody & Waite. BTW, thanks to Steve & Bruce for the comments on my code. I'll clean it up and have another try but that will probably take a couple of days. -- Peter Jeremy --5vNYLRcllDrimb99 Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (FreeBSD) iEYEARECAAYFAlAHO84ACgkQ/opHv/APuIc2ngCgh7TDm5e3FUwcnq43FqJ9naeL dtEAn04/LfA0fG+dz/usJc7Tai8MvlNA =LzcB -----END PGP SIGNATURE----- --5vNYLRcllDrimb99-- From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:08:24 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 47C6B106564A for ; Sun, 12 Aug 2012 23:08:24 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id D176F8FC12 for ; Sun, 12 Aug 2012 23:08:23 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN8N1K075737 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:08:23 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN8HGO021431 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA
bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:08:17 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CN8H5D021430 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:08:17 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:08:17 +1000 Resent-Message-ID: <20120812230817.GD20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6IN1hRg006417 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Thu, 19 Jul 2012 09:01:44 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q6IN1frE078418 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Thu, 19 Jul 2012 09:01:43 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q6IN1Jlr061471; Wed, 18 Jul 2012 18:01:20 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <5007403F.8000909@missouri.edu> From: Stephen Montgomery-Smith Mail-Followup-To: freebsd-numerics@freebsd.org User-Agent: Mozilla/5.0 (X11; Linux i686; rv:13.0) Gecko/20120615 Thunderbird/13.0.1 MIME-Version: 1.0 To: Peter Jeremy References: <20120713155805.GC81965@zim.MIT.EDU> <20120714120432.GA70706@server.rulingia.com> <20120717084457.U3890@besplex.bde.org> <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717200931.U6624@besplex.bde.org> <5006D13D.2080702@missouri.edu> <20120718205625.GA409@troutmask.apl.washington.edu> <500725F2.7060603@missouri.edu> <20120718224222.GA6022@server.rulingia.com> In-Reply-To: 
<20120718224222.GA6022@server.rulingia.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: Diane Bruce , Steve Kargl , John Baldwin , David Chisnall , Bruce Evans , Bruce Evans , David Schultz , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:08:24 -0000 X-Original-Date: Wed, 18 Jul 2012 18:01:19 -0500 X-List-Received-Date: Sun, 12 Aug 2012 23:08:24 -0000 On 07/18/2012 05:42 PM, Peter Jeremy wrote: > On 2012-Jul-18 10:07:41 -0500, Stephen Montgomery-Smith wrote: >> I went on a long road trip yesterday, so I didn't get any code written, >> but I did have a lot of thoughts about clog and casinh. > > Can I suggest you have a read through "Implementing the Complex > Arcsine and Arccosine Functions Using Exception Handling" by > T. E. Hull Thomas F. Fairgrieve and Ping Tak Peter Tang, ACM > Transactions on Mathematical Software, Vol. 23, No. 3, September 1997. > Based on a quick skim, it includes fairly detailed pseudocode, > together with an error analysis. OK, I will do that. My pseudo code is different in that I use clog, and they do not. I'll probably go with my approach, but use lots of ideas from this paper as seems appropriate. I might even try both approaches and see which seems to be the winner. > > On 2012-Jul-18 16:09:06 -0500, Stephen Montgomery-Smith wrote: >> Am I to understand that the inexact flag should be set anytime a >> floating point operation produces an answer that is not guaranteed >> exact? > > My understanding is, yes. For the transcendental functions, that > means the inexact flag should almost always be raised and the problem > becomes when not to raise it.
Eg sin(0) == 0 and presumably doesn't > set the inexact flag. > >> For example, should 1.0/3.0 and sqrt(2.0) raise the inexact flag? > > Yes and yes. I notice our sqrtl() actually tests the inexact flag of > an intermediate calculation to determine the correct rounding for the > result. Thank you for the clarification. I will definitely set the inexact flag under the circumstances pointed out by Steve. Otherwise, I will rely on clog and csqrt to set the inexact flag. (And clog will depend upon log, hypot and atan2 to set the inexact flag.) From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:08:31 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 9EF23106566B for ; Sun, 12 Aug 2012 23:08:31 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id 31A558FC17 for ; Sun, 12 Aug 2012 23:08:31 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN8Va1075742 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:08:31 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN8OUW021439 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:08:24 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CN8Ou1021438 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:08:24 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:08:24 +1000 Resent-Message-ID: 
<20120812230824.GE20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6INFBef006542 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Thu, 19 Jul 2012 09:15:11 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q6INF8cb078450 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Thu, 19 Jul 2012 09:15:10 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q6INElB3062344; Wed, 18 Jul 2012 18:14:49 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <50074368.1080109@missouri.edu> From: Stephen Montgomery-Smith Mail-Followup-To: freebsd-numerics@freebsd.org User-Agent: Mozilla/5.0 (X11; Linux i686; rv:13.0) Gecko/20120615 Thunderbird/13.0.1 MIME-Version: 1.0 To: Peter Jeremy References: <20120713155805.GC81965@zim.MIT.EDU> <20120714120432.GA70706@server.rulingia.com> <20120717084457.U3890@besplex.bde.org> <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717200931.U6624@besplex.bde.org> <5006D13D.2080702@missouri.edu> <20120718205625.GA409@troutmask.apl.washington.edu> <500725F2.7060603@missouri.edu> <20120718224222.GA6022@server.rulingia.com> In-Reply-To: <20120718224222.GA6022@server.rulingia.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: Diane Bruce , Steve Kargl , John Baldwin , David Chisnall , Bruce Evans , Bruce Evans , David Schultz , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: 
"Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:08:31 -0000 X-Original-Date: Wed, 18 Jul 2012 18:14:48 -0500 X-List-Received-Date: Sun, 12 Aug 2012 23:08:31 -0000 On 07/18/2012 05:42 PM, Peter Jeremy wrote: > On 2012-Jul-18 10:07:41 -0500, Stephen Montgomery-Smith wrote: >> I went on a long road trip yesterday, so I didn't get any code written, >> but I did have a lot of thoughts about clog and casinh. > > Can I suggest you have a read through "Implementing the Complex > Arcsine and Arccosine Functions Using Exception Handling" by > T. E. Hull Thomas F. Fairgrieve and Ping Tak Peter Tang, ACM > Transactions on Mathematical Software, Vol. 23, No. 3, September 1997. > Based on a quick skim, it includes fairly detailed pseudocode, > together with an error analysis. The other option - if you want to write the casinh, etc functions, following this paper, I would be happy to bow out. I feel that what I learned about casinh as a mathematical function in the last few days has been worth my time, even if it doesn't end up being used. 
From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:08:37 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 3A0FA106566B for ; Sun, 12 Aug 2012 23:08:37 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id C7CA28FC18 for ; Sun, 12 Aug 2012 23:08:36 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN8awn075747 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:08:36 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN8U3X021450 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:08:30 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CN8UE1021449 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:08:30 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:08:30 +1000 Resent-Message-ID: <20120812230830.GF20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6J325Y8008702 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Thu, 19 Jul 2012 13:02:05 +1000 (EST) (envelope-from sgk@troutmask.apl.washington.edu) Received: from troutmask.apl.washington.edu (troutmask.apl.washington.edu [128.95.76.21]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q6J323Xc079080 (version=TLSv1/SSLv3 
cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Thu, 19 Jul 2012 13:02:05 +1000 (EST) (envelope-from sgk@troutmask.apl.washington.edu) Received: from troutmask.apl.washington.edu (localhost.apl.washington.edu [127.0.0.1]) by troutmask.apl.washington.edu (8.14.5/8.14.5) with ESMTP id q6J322bt001454; Wed, 18 Jul 2012 20:02:02 -0700 (PDT) (envelope-from sgk@troutmask.apl.washington.edu) Received: (from sgk@localhost) by troutmask.apl.washington.edu (8.14.5/8.14.5/Submit) id q6J321bn001453; Wed, 18 Jul 2012 20:02:01 -0700 (PDT) (envelope-from sgk) From: Steve Kargl Mail-Followup-To: freebsd-numerics@freebsd.org To: Peter Jeremy Message-ID: <20120719030201.GB1376@troutmask.apl.washington.edu> References: <20120713155805.GC81965@zim.MIT.EDU> <20120714120432.GA70706@server.rulingia.com> <20120717084457.U3890@besplex.bde.org> <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717200931.U6624@besplex.bde.org> <5006D13D.2080702@missouri.edu> <20120718205625.GA409@troutmask.apl.washington.edu> <500725F2.7060603@missouri.edu> <20120718224222.GA6022@server.rulingia.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20120718224222.GA6022@server.rulingia.com> User-Agent: Mutt/1.4.2.3i Cc: Diane Bruce , John Baldwin , David Chisnall , Stephen Montgomery-Smith , Bruce Evans , Bruce Evans , David Schultz , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:08:37 -0000 X-Original-Date: Wed, 18 Jul 2012 20:02:01 -0700 X-List-Received-Date: Sun, 12 Aug 2012 23:08:37 -0000 On Thu, Jul 19, 2012 at 08:42:22AM +1000, Peter Jeremy wrote: > On 2012-Jul-18 10:07:41 -0500, Stephen Montgomery-Smith wrote: > >I went on a long road trip yesterday, so I didn't get any code written, > >but I did have a lot of thoughts about clog and casinh. > > Can I suggest you have a read through "Implementing the Complex > Arcsine and Arccosine Functions Using Exception Handling" by > T. E. Hull Thomas F. Fairgrieve and Ping Tak Peter Tang, ACM > Transactions on Mathematical Software, Vol. 23, No. 3, September 1997. > Based on a quick skim, it includes fairly detailed pseudocode, > together with an error analysis. It's always good to search the literature. > > On 2012-Jul-18 16:09:06 -0500, Stephen Montgomery-Smith wrote: > >Am I to understand that the inexact flag should be set anytime a > >floating point operation produces an answer that is not guaranteed > >exact? > > My understanding is, yes. For the transcendental functions, that > means the inexact flag should almost always be raised and the problem > becomes when not to raise it. Eg sin(0) == 0 and presumably doesn't > set the inexact flag. > > > For example, should 1.0/3.0 and sqrt(2.0) raise the inexact flag? > > Yes and yes. I notice our sqrtl() actually tests the inexact flag of > an intermediate calculation to determine the correct rounding for the > result. sqrt() is special in that IEEE 754 requires that it return a correctly rounded result in all rounding modes. See src/e_asin.c, where inexact is raised deliberately. You'll find code fragments like if(ix<0x3e500000) { /* if |x| < 2**-26 */ if(huge+x>one) return x;/* return x with inexact if x!=0*/ } huge+x causes the inexact flag to be raised and the condition is always true.
> I've also found that Abramowitz and Stegun "Handbook of Mathematical > Functions", 10th printing, is available online at > http://people.maths.ox.ac.uk/~macdonald/aands/index.html > and various mirrors. I'm still looking for a copy of Cody & Waite. NIST recently revised A&S. You can get to it online at http://dlmf.nist.gov/ -- Steve From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:08:42 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 815B51065672 for ; Sun, 12 Aug 2012 23:08:42 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id 1A7208FC0C for ; Sun, 12 Aug 2012 23:08:41 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN8fF0075750 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:08:42 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN8Zp7021460 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:08:35 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CN8ZIL021459 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:08:35 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:08:35 +1000 Resent-Message-ID: <20120812230835.GG20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6J2ro0A008593
(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Thu, 19 Jul 2012 12:53:51 +1000 (EST) (envelope-from sgk@troutmask.apl.washington.edu) Received: from troutmask.apl.washington.edu (troutmask.apl.washington.edu [128.95.76.21]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q6J2rmqD079030 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Thu, 19 Jul 2012 12:53:50 +1000 (EST) (envelope-from sgk@troutmask.apl.washington.edu) Received: from troutmask.apl.washington.edu (localhost.apl.washington.edu [127.0.0.1]) by troutmask.apl.washington.edu (8.14.5/8.14.5) with ESMTP id q6J2rkYq001416; Wed, 18 Jul 2012 19:53:46 -0700 (PDT) (envelope-from sgk@troutmask.apl.washington.edu) Received: (from sgk@localhost) by troutmask.apl.washington.edu (8.14.5/8.14.5/Submit) id q6J2rjga001415; Wed, 18 Jul 2012 19:53:45 -0700 (PDT) (envelope-from sgk) From: Steve Kargl Mail-Followup-To: freebsd-numerics@freebsd.org To: Stephen Montgomery-Smith Message-ID: <20120719025345.GA1376@troutmask.apl.washington.edu> References: <20120713155805.GC81965@zim.MIT.EDU> <20120714120432.GA70706@server.rulingia.com> <20120717084457.U3890@besplex.bde.org> <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717200931.U6624@besplex.bde.org> <5006D13D.2080702@missouri.edu> <20120718205625.GA409@troutmask.apl.washington.edu> <500725F2.7060603@missouri.edu> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <500725F2.7060603@missouri.edu> User-Agent: Mutt/1.4.2.3i Cc: Diane Bruce , John Baldwin , David Chisnall , Bruce Evans , Bruce Evans , David Schultz , Peter Jeremy , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:08:42 -0000 X-Original-Date: Wed, 18 Jul 2012 19:53:45 -0700 X-List-Received-Date: Sun, 12 Aug 2012 23:08:42 -0000 On Wed, Jul 18, 2012 at 04:09:06PM -0500, Stephen Montgomery-Smith wrote: > On 07/18/2012 03:56 PM, Steve Kargl wrote: > >On Wed, Jul 18, 2012 at 10:07:41AM -0500, Stephen Montgomery-Smith wrote: > >> > >>>The most obvious immediate difficulty in translating the above into C is > >>>that y*y and z*z may overflow when the result shouldn't. > >> > >>This will be a lot easier than I originally expected. When we are in > >>conditions when overflow might occur, we can simply make the > >>approximations > >>sqrt(y*y-1) = y > >>csqrt(z*z+1) = signum(x)*z > >>because in floating point arithmetic, these will not be approximations, > >>but true exactly. And I am thinking that the test I will use for when > >>to use these approximations will be (y==y+1) and (z==z+1) respectively. > >> (I would use (z*z==z*z+1) but that test has the overflow problem.) > > > >I could be mistaken, but I believe that you need to raise the > >inexact flag with these approximations because in fact you > >are doing floating point math. > > > > Thanks for this observation. I am looking through the C99 standard, > trying to understand the inexact flag. But I am struggling to interpret it. > > Am I to understand that the inexact flag should be set anytime a > floating point operation produces an answer that is not guaranteed > exact? For example, should 1.0/3.0 and sqrt(2.0) raise the inexact flag? The inexact flag will get raised by the fpu, but you need to cause the condition. For your 'sqrt(y*y-1) = y' example, you would do something like 'sqrt(y*y-1) = abs(y) - tiny' where tiny is much less than abs(y). 
Search msun/src for inexact (ie., grep -i inexact msun/src/*.c) -- Steve From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:08:49 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id BB03E106566B for ; Sun, 12 Aug 2012 23:08:49 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id 4ED6A8FC15 for ; Sun, 12 Aug 2012 23:08:49 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN8nKL075757 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:08:49 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN8gVN021472 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:08:42 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CN8gdt021471 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:08:42 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:08:42 +1000 Resent-Message-ID: <20120812230842.GH20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6J367i7008738 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Thu, 19 Jul 2012 13:06:07 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by 
vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q6J364Bj079087 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Thu, 19 Jul 2012 13:06:06 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q6J35g4r077487; Wed, 18 Jul 2012 22:05:43 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <50077987.1080307@missouri.edu> From: Stephen Montgomery-Smith Mail-Followup-To: freebsd-numerics@freebsd.org User-Agent: Mozilla/5.0 (X11; Linux i686; rv:13.0) Gecko/20120615 Thunderbird/13.0.1 MIME-Version: 1.0 To: Steve Kargl References: <20120713155805.GC81965@zim.MIT.EDU> <20120714120432.GA70706@server.rulingia.com> <20120717084457.U3890@besplex.bde.org> <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717200931.U6624@besplex.bde.org> <5006D13D.2080702@missouri.edu> <20120718205625.GA409@troutmask.apl.washington.edu> <500725F2.7060603@missouri.edu> <20120719025345.GA1376@troutmask.apl.washington.edu> In-Reply-To: <20120719025345.GA1376@troutmask.apl.washington.edu> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: Diane Bruce , John Baldwin , David Chisnall , Bruce Evans , Bruce Evans , David Schultz , Peter Jeremy , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:08:49 -0000 X-Original-Date: Wed, 18 Jul 2012 22:05:43 -0500 X-List-Received-Date: Sun, 12 Aug 2012 23:08:49 -0000 On 07/18/2012 09:53 PM, Steve Kargl wrote: > On Wed, Jul 18, 2012 at 04:09:06PM -0500, Stephen Montgomery-Smith wrote: >> On 07/18/2012 03:56 PM, Steve Kargl wrote: >>> On Wed, Jul 18, 2012 at 10:07:41AM -0500, Stephen Montgomery-Smith wrote: >>>> >>>>> The most obvious immediate difficulty in translating the above into C is >>>>> that y*y and z*z may overflow when the result shouldn't. >>>> >>>> This will be a lot easier than I originally expected. When we are in >>>> conditions when overflow might occur, we can simply make the >>>> approximations >>>> sqrt(y*y-1) = y >>>> csqrt(z*z+1) = signum(x)*z >>>> because in floating point arithmetic, these will not be approximations, >>>> but true exactly. And I am thinking that the test I will use for when >>>> to use these approximations will be (y==y+1) and (z==z+1) respectively. >>>> (I would use (z*z==z*z+1) but that test has the overflow problem.) >>> >>> I could be mistaken, but I believe that you need to raise the >>> inexact flag with these approximations because in fact you >>> are doing floating point math. >>> >> >> Thanks for this observation. I am looking through the C99 standard, >> trying to understand the inexact flag. But I am struggling to interpret it. >> >> Am I to understand that the inexact flag should be set anytime a >> floating point operation produces an answer that is not guaranteed >> exact? For example, should 1.0/3.0 and sqrt(2.0) raise the inexact flag? > > The inexact flag will get raised by the fpu, but you need to > cause the condition. For your 'sqrt(y*y-1) = y' example, > you would do something like 'sqrt(y*y-1) = abs(y) - tiny' where > tiny is much less than abs(y). Search msun/src for inexact > (ie., grep -i inexact msun/src/*.c) > Couldn't you do this instead? 
#include <fenv.h> feraiseexcept(FE_INEXACT) From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:08:55 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 1BB42106566B for ; Sun, 12 Aug 2012 23:08:55 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id AADEC8FC19 for ; Sun, 12 Aug 2012 23:08:54 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN8ssH075762 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:08:54 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN8mKD021482 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:08:48 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CN8mmp021481 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:08:48 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:08:48 +1000 Resent-Message-ID: <20120812230848.GI20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6J3RBXO009208 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Thu, 19 Jul 2012 13:27:11 +1000 (EST) (envelope-from sgk@troutmask.apl.washington.edu) Received: from troutmask.apl.washington.edu (troutmask.apl.washington.edu [128.95.76.21]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id
q6J3R8YB079135 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Thu, 19 Jul 2012 13:27:10 +1000 (EST) (envelope-from sgk@troutmask.apl.washington.edu) Received: from troutmask.apl.washington.edu (localhost.apl.washington.edu [127.0.0.1]) by troutmask.apl.washington.edu (8.14.5/8.14.5) with ESMTP id q6J3R605001598; Wed, 18 Jul 2012 20:27:06 -0700 (PDT) (envelope-from sgk@troutmask.apl.washington.edu) Received: (from sgk@localhost) by troutmask.apl.washington.edu (8.14.5/8.14.5/Submit) id q6J3R6ta001597; Wed, 18 Jul 2012 20:27:06 -0700 (PDT) (envelope-from sgk) From: Steve Kargl Mail-Followup-To: freebsd-numerics@freebsd.org To: Stephen Montgomery-Smith Message-ID: <20120719032706.GA1558@troutmask.apl.washington.edu> References: <20120714120432.GA70706@server.rulingia.com> <20120717084457.U3890@besplex.bde.org> <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717200931.U6624@besplex.bde.org> <5006D13D.2080702@missouri.edu> <20120718205625.GA409@troutmask.apl.washington.edu> <500725F2.7060603@missouri.edu> <20120719025345.GA1376@troutmask.apl.washington.edu> <50077987.1080307@missouri.edu> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <50077987.1080307@missouri.edu> User-Agent: Mutt/1.4.2.3i Cc: Diane Bruce , John Baldwin , David Chisnall , Bruce Evans , Bruce Evans , David Schultz , Peter Jeremy , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:08:55 -0000 X-Original-Date: Wed, 18 Jul 2012 20:27:06 -0700 X-List-Received-Date: Sun, 12 Aug 2012 23:08:55 -0000 On Wed, Jul 18, 2012 at 10:05:43PM -0500, Stephen Montgomery-Smith wrote: > On 07/18/2012 09:53 PM, Steve Kargl wrote: > > > >The inexact flag will get raised by the fpu, but you need to > >cause the condition. For your 'sqrt(y*y-1) = y' example, > >you would do something like 'sqrt(y*y-1) = abs(y) - tiny' where > >tiny is much less than abs(y). Search msun/src for inexact > >(ie., grep -i inexact msun/src/*.c) > > > > Couldn't you do this instead? > > #include <fenv.h> > > feraiseexcept(FE_INEXACT) > I haven't checked, but I suspect you're looking at a speed issue. It's faster to let the hardware raise the flag. It seems that libm only uses the above in the fused multiply-add code: laptop:kargl[206] grep feraise src/*c src/s_fma.c: feraiseexcept(FE_INEXACT); src/s_fma.c: feraiseexcept(FE_UNDERFLOW); src/s_fmal.c: feraiseexcept(FE_INEXACT); src/s_fmal.c: feraiseexcept(FE_UNDERFLOW); src/s_lround.c: feraiseexcept(FE_INVALID); -- Steve From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:09:02 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 880FC1065678 for ; Sun, 12 Aug 2012 23:09:02 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id E76978FC14 for ; Sun, 12 Aug 2012 23:09:01 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN911H075767 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:09:01 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham,
spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN8tFm021494 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:08:55 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CN8tGT021493 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:08:55 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:08:55 +1000 Resent-Message-ID: <20120812230855.GJ20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6J3i5Fn009349 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Thu, 19 Jul 2012 13:44:05 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q6J3i2O1079163 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Thu, 19 Jul 2012 13:44:04 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q6J3hfng079974; Wed, 18 Jul 2012 22:43:42 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <5007826D.7060806@missouri.edu> From: Stephen Montgomery-Smith Mail-Followup-To: freebsd-numerics@freebsd.org User-Agent: Mozilla/5.0 (X11; Linux i686; rv:13.0) Gecko/20120615 Thunderbird/13.0.1 MIME-Version: 1.0 To: Steve Kargl References: <20120714120432.GA70706@server.rulingia.com> <20120717084457.U3890@besplex.bde.org> <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717200931.U6624@besplex.bde.org> <5006D13D.2080702@missouri.edu> 
<20120718205625.GA409@troutmask.apl.washington.edu> <500725F2.7060603@missouri.edu> <20120719025345.GA1376@troutmask.apl.washington.edu> <50077987.1080307@missouri.edu> <20120719032706.GA1558@troutmask.apl.washington.edu> In-Reply-To: <20120719032706.GA1558@troutmask.apl.washington.edu> Content-Type: multipart/mixed; boundary="------------010702080409040707040105" X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: Diane Bruce , John Baldwin , David Chisnall , Bruce Evans , Bruce Evans , David Schultz , Peter Jeremy , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:09:02 -0000 X-Original-Date: Wed, 18 Jul 2012 22:43:41 -0500 X-List-Received-Date: Sun, 12 Aug 2012 23:09:02 -0000 This is a multi-part message in MIME format. --------------010702080409040707040105 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit On 07/18/2012 10:27 PM, Steve Kargl wrote: > On Wed, Jul 18, 2012 at 10:05:43PM -0500, Stephen Montgomery-Smith wrote: >> On 07/18/2012 09:53 PM, Steve Kargl wrote: >>> >>> The inexact flag will get raised by the fpu, but you need to >>> cause the condition. For your 'sqrt(y*y-1) = y' example, >>> you would do something like 'sqrt(y*y-1) = abs(y) - tiny' where >>> tiny is much less than abs(y). Search msun/src for inexact >>> (ie., grep -i inexact msun/src/*.c) >>> >> >> Couldn't you do this instead? >> >> #include >> >> feraiseexcept(FE_INEXACT) >> > > I haven't checked, but I suspect you're looking at a speed > issue. It's faster to let the hardware raise the flag. 
> It seems that libm only uses the above in the fuse-multiple-add > code: > > laptop:kargl[206] grep feraise src/*c > src/s_fma.c: feraiseexcept(FE_INEXACT); > src/s_fma.c: feraiseexcept(FE_UNDERFLOW); > src/s_fmal.c: feraiseexcept(FE_INEXACT); > src/s_fmal.c: feraiseexcept(FE_UNDERFLOW); > src/s_lround.c: feraiseexcept(FE_INVALID); > Still, I think I will use the feraiseexcept function in clog, because speed isn't an issue when nans are involved. And it does make the code less obscure. --------------010702080409040707040105-- From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:09:12 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 6A6F0106564A for ; Sun, 12 Aug 2012 23:09:12 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id DFE6E8FC1C for ; Sun, 12 Aug 2012 23:09:11 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN9BPR075774 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:09:11 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN95Dx021508 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:09:05 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CN95uM021507 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:09:05 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:09:05 +1000 Resent-Message-ID: 
<20120812230905.GK20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6J6kkkG011261 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Thu, 19 Jul 2012 16:46:46 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q6J6kh0w079646 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Thu, 19 Jul 2012 16:46:46 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q6J6kODn007834; Thu, 19 Jul 2012 01:46:25 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <5007AD41.9070000@missouri.edu> From: Stephen Montgomery-Smith Mail-Followup-To: freebsd-numerics@freebsd.org User-Agent: Mozilla/5.0 (X11; Linux i686; rv:13.0) Gecko/20120615 Thunderbird/13.0.1 MIME-Version: 1.0 To: Steve Kargl References: <20120714120432.GA70706@server.rulingia.com> <20120717084457.U3890@besplex.bde.org> <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717200931.U6624@besplex.bde.org> <5006D13D.2080702@missouri.edu> <20120718205625.GA409@troutmask.apl.washington.edu> <500725F2.7060603@missouri.edu> <20120719025345.GA1376@troutmask.apl.washington.edu> <50077987.1080307@missouri.edu> <20120719032706.GA1558@troutmask.apl.washington.edu> <5007826D.7060806@missouri.edu> In-Reply-To: <5007826D.7060806@missouri.edu> Content-Type: multipart/mixed; boundary="------------000500050401010706080607" X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: Diane Bruce , John Baldwin , David Chisnall , Bruce Evans , Bruce Evans , David Schultz , Peter Jeremy , Warner Losh Subject: Re: Use of C99 extra long double math functions 
after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:09:12 -0000 X-Original-Date: Thu, 19 Jul 2012 01:46:25 -0500 X-List-Received-Date: Sun, 12 Aug 2012 23:09:12 -0000 This is a multi-part message in MIME format. --------------000500050401010706080607 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit I did a ULP test on clog. The test code is attached. (Not the cleanest code, I know, but it does the job.) It needs the mpfr and unuran ports installed. To my shock, I found that under certain circumstances, the ULP in the real part was huge. The problem is when hypot(x,y) is close to 1, because then the real part of clog is close to zero. I was seeing ULPs in the thousands. I struggled to find a solution, and now I think I have the ULP down to about 2. I am going to work on it more tomorrow to see if I can get ULP down even further. 
--------------000500050401010706080607-- From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:09:23 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id BCCCD106564A for ; Sun, 12 Aug 2012 23:09:23 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id 3FAF98FC12 for ; Sun, 12 Aug 2012 23:09:22 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN9Muu075780 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:09:22 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN9GIX021525 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:09:16 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CN9GDc021524 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:09:16 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:09:16 +1000 Resent-Message-ID: <20120812230916.GM20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6JHQGMs016057 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Fri, 20 Jul 2012 03:26:17 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id 
q6JHQEZB081701 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Fri, 20 Jul 2012 03:26:16 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q6JHPsME048856; Thu, 19 Jul 2012 12:25:54 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <50084322.7020401@missouri.edu> From: Stephen Montgomery-Smith Mail-Followup-To: freebsd-numerics@freebsd.org User-Agent: Mozilla/5.0 (X11; Linux i686; rv:13.0) Gecko/20120615 Thunderbird/13.0.1 MIME-Version: 1.0 To: Bruce Evans References: <20120714120432.GA70706@server.rulingia.com> <20120717084457.U3890@besplex.bde.org> <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717200931.U6624@besplex.bde.org> <5006D13D.2080702@missouri.edu> <20120718205625.GA409@troutmask.apl.washington.edu> <500725F2.7060603@missouri.edu> <20120719025345.GA1376@troutmask.apl.washington.edu> <50077987.1080307@missouri.edu> <20120719032706.GA1558@troutmask.apl.washington.edu> <5007826D.7060806@missouri.edu> <5007AD41.9070000@missouri.edu> <20120719205347.T2601@besplex.bde.org> In-Reply-To: <20120719205347.T2601@besplex.bde.org> Content-Type: multipart/mixed; boundary="------------020105090904080608070404" X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: Diane Bruce , John Baldwin , David Chisnall , Bruce Evans , Steve Kargl , David Schultz , Peter Jeremy , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:09:23 -0000 X-Original-Date: Thu, 19 Jul 2012 12:25:54 -0500 X-List-Received-Date: Sun, 12 Aug 2012 23:09:23 -0000 This is a multi-part message in MIME format. 
--------------020105090904080608070404 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit On 07/19/2012 06:16 AM, Bruce Evans wrote: > On Thu, 19 Jul 2012, Stephen Montgomery-Smith wrote: > >> I did a ULP test on clog. The test code is attached. (Not the >> cleanest code, I know, but it does the job.) It needs the mpfr and >> unuran ports installed. >> >> To my shock, I found that under certain circumstances, the ULP in the >> real part was huge. The problem is when hypot(x,y) is close to 1, >> because then the real part of clog is close to zero. I was seeing >> ULPs in the thousands. > > Better than GULPs in the thousands :-). > > This is not the problem that I first thought it might be. > >> I struggled to find a solution, and now I think I have the ULP down to >> about 2. I am going to work on it more tomorrow to see if I can get >> ULP down even further. I have the ULP down to about 1.2 now. I don't see how I can do better, because I have to invoke log functions twice, and probably each one has a ULP of about 0.6. Also I decided to use 1/2 log(x*x+y*y) when x and y are not too large. I am really rather proud of how I got around the large ULP when hypot(x,y) is close to 1. I would be glad if any of you could look at the code when you get a chance. Also, now that I see how hard clog was, I have more appreciation of Steve's objections. 
--------------020105090904080608070404-- From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:09:38 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 92D1C1065674 for ; Sun, 12 Aug 2012 23:09:38 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id 2B6468FC19 for ; Sun, 12 Aug 2012 23:09:38 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN9cqW075790 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:09:38 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN9VfY021544 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:09:31 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CN9VwV021543 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:09:31 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:09:31 +1000 Resent-Message-ID: <20120812230931.GO20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6JIO4ua017870 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Fri, 20 Jul 2012 04:24:05 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP 
id q6JIO2W7081964 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Fri, 20 Jul 2012 04:24:04 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q6JINgO7052646; Thu, 19 Jul 2012 13:23:42 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <500850AE.9030608@missouri.edu> From: Stephen Montgomery-Smith Mail-Followup-To: freebsd-numerics@freebsd.org User-Agent: Mozilla/5.0 (X11; Linux i686; rv:13.0) Gecko/20120615 Thunderbird/13.0.1 MIME-Version: 1.0 To: Bruce Evans References: <20120714120432.GA70706@server.rulingia.com> <20120717084457.U3890@besplex.bde.org> <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717200931.U6624@besplex.bde.org> <5006D13D.2080702@missouri.edu> <20120718205625.GA409@troutmask.apl.washington.edu> <500725F2.7060603@missouri.edu> <20120719025345.GA1376@troutmask.apl.washington.edu> <50077987.1080307@missouri.edu> <20120719032706.GA1558@troutmask.apl.washington.edu> <5007826D.7060806@missouri.edu> <5007AD41.9070000@missouri.edu> <20120719205347.T2601@besplex.bde.org> <50084322.7020401@missouri.edu> <20120720035001.W4053@besplex.bde.org> In-Reply-To: <20120720035001.W4053@besplex.bde.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: Diane Bruce , John Baldwin , David Chisnall , Bruce Evans , Steve Kargl , David Schultz , Peter Jeremy , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:09:38 -0000 X-Original-Date: Thu, 19 Jul 2012 13:23:42 -0500 X-List-Received-Date: Sun, 12 Aug 2012 23:09:38 -0000 On 07/19/2012 01:12 PM, Bruce Evans wrote: > On Thu, 19 Jul 2012, Stephen Montgomery-Smith wrote: > >> I have the ULP down to about 1.2 now. I don't see how I can do >> better, because I have to invoke log functions twice, and probably >> each one has a ULP of about 0.6. >> >> Also I decided to use 1/2 log(x*x+y*y) when x and y are not too large. > > That's close to Apple complex.c clog(). Once you don't use hypot(), > it is clearly best to use log1p(): > > log(sqrt(x*x + y*y)) = log(|x|) + 1/2 log(1 + (y*y)/(x*x)) > = log(|x|) + 1/2 log1p((y*y)/(x*x)) > > where |x| >= |y| so that log1p()'s arg is as small as possible. > >> I am really rather proud of how I got around the large ULP when >> hypot(x,y) is close to 1. I would be glad if any of you could look at >> the code when you get a chance. > > WIll look more closely later. I see that you already use log1p() and much > more. Apple clog() uses not so much more, mainly by depending on extra > precision in hardware. The above also avoids overflow and use of hypot() > for all finite x and and y, but is probably too simple. The Apple solution has a problem. The two invocations of log might produce results that are nearly identical, but with opposite signs. Think about x = y = 1/sqrt(2). 
From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:09:45 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 001CB1065678 for ; Sun, 12 Aug 2012 23:09:44 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id 7869B8FC08 for ; Sun, 12 Aug 2012 23:09:44 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN9iht075795 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:09:44 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN9big021554 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:09:38 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CN9bBf021553 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:09:37 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:09:37 +1000 Resent-Message-ID: <20120812230937.GP20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6JIdJod019066 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Fri, 20 Jul 2012 04:39:20 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q6JIdHg0082003 (version=TLSv1/SSLv3 
cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Fri, 20 Jul 2012 04:39:19 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q6JIcvP1053673; Thu, 19 Jul 2012 13:38:57 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <50085441.4090305@missouri.edu> From: Stephen Montgomery-Smith Mail-Followup-To: freebsd-numerics@freebsd.org User-Agent: Mozilla/5.0 (X11; Linux i686; rv:13.0) Gecko/20120615 Thunderbird/13.0.1 MIME-Version: 1.0 To: Bruce Evans References: <20120714120432.GA70706@server.rulingia.com> <20120717084457.U3890@besplex.bde.org> <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717200931.U6624@besplex.bde.org> <5006D13D.2080702@missouri.edu> <20120718205625.GA409@troutmask.apl.washington.edu> <500725F2.7060603@missouri.edu> <20120719025345.GA1376@troutmask.apl.washington.edu> <50077987.1080307@missouri.edu> <20120719032706.GA1558@troutmask.apl.washington.edu> <5007826D.7060806@missouri.edu> <5007AD41.9070000@missouri.edu> <20120719205347.T2601@besplex.bde.org> <50084322.7020401@missouri.edu> <20120720035001.W4053@besplex.bde.org> In-Reply-To: <20120720035001.W4053@besplex.bde.org> Content-Type: multipart/mixed; boundary="------------010001010601030602040403" X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: Diane Bruce , John Baldwin , David Chisnall , Bruce Evans , Steve Kargl , David Schultz , Peter Jeremy , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:09:45 -0000 X-Original-Date: Thu, 19 Jul 2012 13:38:57 -0500 X-List-Received-Date: Sun, 12 Aug 2012 23:09:45 -0000

This is a multi-part message in MIME format.
--------------010001010601030602040403
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit

On 07/19/2012 01:12 PM, Bruce Evans wrote:
> On Thu, 19 Jul 2012, Stephen Montgomery-Smith wrote:
>
>> I have the ULP down to about 1.2 now. I don't see how I can do
>> better, because I have to invoke log functions twice, and probably
>> each one has a ULP of about 0.6.
>>
>> Also I decided to use 1/2 log(x*x+y*y) when x and y are not too large.
>
> That's close to Apple complex.c clog(). Once you don't use hypot(),
> it is clearly best to use log1p():
>
>    log(sqrt(x*x + y*y)) = log(|x|) + 1/2 log(1 + (y*y)/(x*x))
>                         = log(|x|) + 1/2 log1p((y*y)/(x*x))
>
> where |x| >= |y| so that log1p()'s arg is as small as possible.
>
>> I am really rather proud of how I got around the large ULP when
>> hypot(x,y) is close to 1. I would be glad if any of you could look at
>> the code when you get a chance.
>
> Will look more closely later. I see that you already use log1p() and much
> more. Apple clog() uses not so much more, mainly by depending on extra
> precision in hardware. The above also avoids overflow and use of hypot()
> for all finite x and y, but is probably too simple.

I think their solution merely avoids the overflow/underflow problem, and was not meant to address the problem I worked on. However, their solution will fail if x=1e100 and y=1e-100.

This caused me to realize that my solution failed to account for underflow, so here is my next iteration.
--------------010001010601030602040403-- From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:10:09 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 48E231065673 for ; Sun, 12 Aug 2012 23:10:09 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id A8EB98FC1B for ; Sun, 12 Aug 2012 23:10:08 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CNA8oA075808 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:10:08 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CNA2Ce021589 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:10:02 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CNA21O021588 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:10:02 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:10:02 +1000 Resent-Message-ID: <20120812231002.GS20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6KDSOUC030305 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Fri, 20 Jul 2012 23:28:24 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id 
q6KDSL4w086668 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Fri, 20 Jul 2012 23:28:23 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q6KDRw6n032352; Fri, 20 Jul 2012 08:27:59 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <50095CDE.4050507@missouri.edu> From: Stephen Montgomery-Smith Mail-Followup-To: freebsd-numerics@freebsd.org User-Agent: Mozilla/5.0 (X11; Linux i686; rv:14.0) Gecko/20120714 Thunderbird/14.0 MIME-Version: 1.0 To: Bruce Evans References: <20120714120432.GA70706@server.rulingia.com> <20120717084457.U3890@besplex.bde.org> <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717200931.U6624@besplex.bde.org> <5006D13D.2080702@missouri.edu> <20120718205625.GA409@troutmask.apl.washington.edu> <500725F2.7060603@missouri.edu> <20120719025345.GA1376@troutmask.apl.washington.edu> <50077987.1080307@missouri.edu> <20120719032706.GA1558@troutmask.apl.washington.edu> <5007826D.7060806@missouri.edu> <5007AD41.9070000@missouri.edu> <20120719205347.T2601@besplex.bde.org> <50084322.7020401@missouri.edu> <20120720035001.W4053@besplex.bde.org> <50085441.4090305@missouri.edu> <20120720162953.N2162@besplex.bde.org> <20120720184114.B2790@besplex.bde.org> In-Reply-To: <20120720184114.B2790@besplex.bde.org> Content-Type: multipart/mixed; boundary="------------030904090805050607040500" X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: Diane Bruce , John Baldwin , David Chisnall , Bruce Evans , Steve Kargl , David Schultz , Peter Jeremy , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:10:09 -0000 X-Original-Date: Fri, 20 Jul 2012 08:27:58 -0500 X-List-Received-Date: Sun, 12 Aug 2012 23:10:09 -0000

This is a multi-part message in MIME format.
--------------030904090805050607040500
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit

Bruce,

I worked quite a bit on my clog function last night. Could you look at this one?

The trick for when hypot(x,y) is close to 1 can only work so well, and you are testing it out of its range of applicability. But for the special case x is close to 1, and y is close to zero, I have a much better formula. I can produce three other formulas so that I can handle |x| close to 1, y small, and |y| close to 1 and x small.

After fixing it up, could you send it back as an attachment? That will make it easier for me to put it back into my system, and work more on it.

Thanks, Stephen

On 07/20/2012 04:19 AM, Bruce Evans wrote:
> On Fri, 20 Jul 2012, Bruce Evans wrote:
>
>> I was going to leave all the fixes to you, but now want to try the
>> approximation for tiny y :-).
>
> This version now beats Apple clog() for accuracy. The maximum difference
> observed is down from the exa-ulp range to 16 ulps with all of the 4-16
> ulp differences checked against pari being inaccuracies in Apple clog().
>
> 2**28 cases were tested, but most of the errors found look like this:
>
> % x = 0x3fe0000000000000 0.5
> % y = 0x3fec000000000000 0.875
> % loghypota(x, y) = 0x3ff7fe054587e01ea000 0x3f7fc0a8b0fc03d4 0.00775209326798261336156
> % loghypot(x, y) = 0x3ff7fe054587e01f2000 0x3f7fc0a8b0fc03e4 0.00775209326798262723934
> % err = +0x8000 16.00000
> % pari log(0.5+0.875*I):
> % 0.007752093267982627075427023021 + 1.051650212548373667459867312*I
>
> so my tests barely cover fractions like 1/8 and the coverage may be too
> limited. New errors kept turning up as I expanded the coverage.
>
> % #include
> % #include
> % #include
> % #include
> %
> % #include "math_private.h"
> %
> % /*
> %  * gcc doesn't implement complex multiplication or division correctly, so we
> %  * need to handle infinities specially. We turn on this pragma to notify
> %  * conforming c99 compilers that the fast-but-incorrect code that gcc
> %  * generates is acceptable, since the special cases have already been
> %  * handled.
> %  */
> % #pragma STDC CX_LIMITED_RANGE ON
> %
> % double complex
> % clog(double complex z)
> % {
> %     double x, y, h, t1, t2, t3;
> %     double x0, y0, x1, y1;
> %
> %     x = creal(z);
> %     y = cimag(z);
> %
> % #define NANMIX_APPLE_CLOG_COMPAT 1
> %     /* Handle special cases when x or y is NAN. */
> %     if (isnan(x)) {
> %         if (isinf(y))
> %             return (cpack(INFINITY, NAN));
> %         else {
> % #if NANMIX_HYPOTF_COMPAT
> %             y = fabs(y + 0);
> %             t1 = (y - y) / (y - y); /* Raise invalid flag if y is
> %                                      * not NaN */
> %             t1 = fabs(x + 0) + t1;  /* Mix NaN(s). */
> % #elif NANMIX_APPLE_CLOG_COMPAT
> %             /* No actual mixing. */
> %             return (cpack(x, copysign(x, y)));
> % #else
> %             t1 = (y - y) / (y - y); /* Raise invalid flag if y is
> %                                      * not NaN */
> % #endif
> %             return (cpack(t1, t1));
> %         }
> %     } else if (isnan(y)) {
> %         if (isinf(x))
> %             return (cpack(INFINITY, NAN));
> %         else {
> % #ifdef NANMIX_HYPOTF_COMPAT
> %             x = fabs(x + 0);
> %             t1 = (x - x) / (x - x); /* Raise invalid flag if x is
> %                                      * not NaN */
> %             t1 = t1 + fabs(y + 0);  /* Mix NaN(s). */
> % #elif NANMIX_APPLE_CLOG_COMPAT
> %             /* No actual mixing. */
> %             return (cpack(y, y));
> % #else
> %             t1 = (x - x) / (x - x); /* Raise invalid flag if x is
> %                                      * not NaN */
> % #endif
> %             return (cpack(t1, t1));
> %         }
> %     } else if (isfinite(x) && isfinite(y) &&
> %         (fabs(x) > 1e308 || fabs(y) > 1e308))
> %         /*
> %          * To avoid unnecessary overflow, if x or y are very large,
> %          * divide x and y by M_E, and then add 1 to the logarithm.
> %          * This depends on M_E being larger than sqrt(2).
> %          */
> %         return (cpack(log(hypot(x / M_E, y / M_E)) + 1, atan2(y, x)));
>
> Important fix here. 1 was added to the log() arg instead of to log().
>
> %     else if (fabs(x) < 1e-50 && fabs(y) != 1 ||
> %         fabs(y) < 1e-50 && fabs(x) != 1 ||
> %         fabs(x) > 1e50 || fabs(y) > 1e50)
>
> The special case for |x| == 1 and |y| == 1 was defeated by returning for
> it here.
>
> %         /*
> %          * Because atan2 and hypot conform to C99, this also covers
> %          * all the edge cases when x or y are 0 or infinite.
> %          */
> %         return (cpack(log(hypot(x, y)), atan2(y, x)));
> %     else {
> %         /* We don't need to worry about overflow in x*x+y*y. */
> %         h = x * x + y * y;
> %         if (h < 0.1 || h > 10)
> %             return (cpack(log(h) / 2, atan2(y, x)));
> %         /* Take extra care if h is moderately close to 1 */
> %         else {
> % #if 1
> %             if (fabs(x) == 1)
> %                 return (cpack(log1p(y * y) / 2,
> %                     atan2(y, x)));
> %             if (fabs(y) == 1)
> %                 return (cpack(log1p(x * x) / 2,
> %                     atan2(y, x)));
> % #endif
>
> Special case. It seems too special, but Apple clog() doesn't do any more,
> and this with the other special cases is enough to beat Apple clog().
>
> %             /*
> %              * x0 and y0 are good approximations to x and y, but
> %              * have their bits trimmed so that double precision
> %              * floating point is capable of calculating x0*x0 +
> %              * y0*y0 - 1 exactly.
> %              */
>
> The only way for x*x + y*y to be _very_ near 1 in infinite precision
> is for |x| or |y| to be 1 (I think). Other cases are bounded away from
> 1, and if you are lucky the bound is fairly far from 1 so that sloppier
> approximations work OK. Mathematicians should determine the bound
> exactly using continued fractions or something like they do for
> approximations to N*Pi/2. This becomes especially interesting in high
> precisions where you can't hope to get near the worst case by random
> testing.
>
> %             x0 = ldexp(floor(ldexp(x, 24)), -24);
> %             x1 = x - x0;
> %             y0 = ldexp(floor(ldexp(y, 24)), -24);
> %             y1 = y - y0;
>
> This has a chance of working iff the bound away from 1 is something like
> 2**-24. Otherwise, multiplying by 2**24 and flooring a positive value
> will just produce 0. 2**-24 seems much too small a bound. My test
> coverage is not wide enough to hit many bad cases.
>
> %             /* Notice that mathematically, h = t1*(1+t3). */
> %             t1 = x0 * x0 + y0 * y0;
> %             t2 = 2 * x0 * x1 + x1 * x1 + 2 * y0 * y1 + y1 * y1;
> %             t3 = t2 / t1;
> %             return (cpack((log1p(t1 - 1) + log1p(t3)) / 2,
> %                 atan2(y, x)));
> %         }
> %     }
> % }
>
> Bruce
>
>

--------------030904090805050607040500--
From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:10:18 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id ED0E11065673 for ; Sun, 12 Aug 2012 23:10:17 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id 80EAA8FC08 for ; Sun, 12 Aug 2012 23:10:17 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CNAH1p075814 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:10:17 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CNAB3L021606 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:10:11 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CNABmT021605 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:10:11
+1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:10:11 +1000 Resent-Message-ID: <20120812231011.GU20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6KKBIog060179 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Sat, 21 Jul 2012 06:11:18 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q6KKBFc6094310 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Sat, 21 Jul 2012 06:11:17 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q6KKAk0F058109; Fri, 20 Jul 2012 15:10:47 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <5009BB46.3050001@missouri.edu> From: Stephen Montgomery-Smith Mail-Followup-To: freebsd-numerics@freebsd.org User-Agent: Mozilla/5.0 (X11; Linux i686; rv:14.0) Gecko/20120714 Thunderbird/14.0 MIME-Version: 1.0 To: Bruce Evans References: <20120714120432.GA70706@server.rulingia.com> <20120717084457.U3890@besplex.bde.org> <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717200931.U6624@besplex.bde.org> <5006D13D.2080702@missouri.edu> <20120718205625.GA409@troutmask.apl.washington.edu> <500725F2.7060603@missouri.edu> <20120719025345.GA1376@troutmask.apl.washington.edu> <50077987.1080307@missouri.edu> <20120719032706.GA1558@troutmask.apl.washington.edu> <5007826D.7060806@missouri.edu> <5007AD41.9070000@missouri.edu> <20120719205347.T2601@besplex.bde.org> <50084322.7020401@missouri.edu> <20120720035001.W4053@besplex.bde.org> <50085441.4090305@missouri.edu> <20120720162953.N2162@besplex.bde.org> 
<20120720184114.B2790@besplex.bde.org> <50095CDE.4050507@missouri.edu> <20120721011112.D5008@besplex.bde.org> In-Reply-To: <20120721011112.D5008@besplex.bde.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: Diane Bruce , John Baldwin , David Chisnall , Bruce Evans , Steve Kargl , David Schultz , Peter Jeremy , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:10:18 -0000 X-Original-Date: Fri, 20 Jul 2012 15:10:46 -0500 X-List-Received-Date: Sun, 12 Aug 2012 23:10:18 -0000

On 07/20/2012 11:25 AM, Bruce Evans wrote:
> On Fri, 20 Jul 2012, Stephen Montgomery-Smith wrote:

> %             x0 = (float)x;
> %             x1 = x - x0;
> %             y0 = (float)y;
> %             y1 = y - y0;
>
> A good way to do the hi+lo decompositions.

That was the way I tried first. But it didn't work for me!

But I see you changed things further down, so that is probably why it works for you.
From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:10:28 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 7E9A4106566C for ; Sun, 12 Aug 2012 23:10:28 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id 148968FC1E for ; Sun, 12 Aug 2012 23:10:27 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CNARwm075820 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:10:28 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CNAL08021618 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:10:21 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CNALAW021617 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:10:21 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:10:21 +1000 Resent-Message-ID: <20120812231021.GW20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6L2dAGn063055 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Sat, 21 Jul 2012 12:39:10 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q6L2d7r8095312 (version=TLSv1/SSLv3 
cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Sat, 21 Jul 2012 12:39:09 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q6L2cmkV095789; Fri, 20 Jul 2012 21:38:48 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <500A1638.5090601@missouri.edu> From: Stephen Montgomery-Smith Mail-Followup-To: freebsd-numerics@freebsd.org User-Agent: Mozilla/5.0 (X11; Linux i686; rv:14.0) Gecko/20120714 Thunderbird/14.0 MIME-Version: 1.0 To: Bruce Evans References: <20120714120432.GA70706@server.rulingia.com> <20120717084457.U3890@besplex.bde.org> <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717200931.U6624@besplex.bde.org> <5006D13D.2080702@missouri.edu> <20120718205625.GA409@troutmask.apl.washington.edu> <500725F2.7060603@missouri.edu> <20120719025345.GA1376@troutmask.apl.washington.edu> <50077987.1080307@missouri.edu> <20120719032706.GA1558@troutmask.apl.washington.edu> <5007826D.7060806@missouri.edu> <5007AD41.9070000@missouri.edu> <20120719205347.T2601@besplex.bde.org> <50084322.7020401@missouri.edu> <20120720035001.W4053@besplex.bde.org> <50085441.4090305@missouri.edu> <20120720162953.N2162@besplex.bde.org> <20120720184114.B2790@besplex.bde.org> <50095CDE.4050507@missouri.edu> <20120721011112.D5008@besplex.bde.org> <5009BB46.3050001@missouri.edu> <20120721122309.R856@besplex.bde.org> In-Reply-To: <20120721122309.R856@besplex.bde.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: Diane Bruce , John Baldwin , David Chisnall , Bruce Evans , Steve Kargl , David Schultz , Peter Jeremy , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:10:28 -0000 X-Original-Date: Fri, 20 Jul 2012 21:38:48 -0500 X-List-Received-Date: Sun, 12 Aug 2012 23:10:28 -0000

On 07/20/2012 09:33 PM, Bruce Evans wrote:
> On Fri, 20 Jul 2012, Stephen Montgomery-Smith wrote:
>
>> On 07/20/2012 11:25 AM, Bruce Evans wrote:
>>> On Fri, 20 Jul 2012, Stephen Montgomery-Smith wrote:
>>
>>
>>> %             x0 = (float)x;
>>> %             x1 = x - x0;
>>> %             y0 = (float)y;
>>> %             y1 = y - y0;
>>>
>>> A good way to do the hi+lo decompositions.
>>
>> That was the way I tried first. But it didn't work for me!
>>
>> But I see you changed things further down, so that is probably why it
>> works for you.
>
> I didn't understand what was happening before, but think I can explain it
> now:
> - the above gives correct hi+lo decompositions. Both hi and lo are usually
>   nonzero. The code below didn't really understand hi+lo decompositions,
>   and often increases the final error (relative to naive code).
> - your code often gives null but backwards hi+lo decompositions, with hi = 0
>   and lo = full value. The code below didn't really understand hi+lo
>   decompositions. But when hi = 0, it is especially easy to add and
>   multiply it exactly, so the final error isn't increased so often.

Yes. That was my intention. But I will go with whatever works best - I am not sold on one solution over another.
From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:10:39 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id A6B72106564A for ; Sun, 12 Aug 2012 23:10:39 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id 3D8008FC1C for ; Sun, 12 Aug 2012 23:10:38 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CNAchS075830 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:10:38 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CNAWnW021637 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:10:32 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CNAWGL021636 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:10:32 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:10:32 +1000 Resent-Message-ID: <20120812231032.GY20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6MJxQnS006903 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 23 Jul 2012 05:59:27 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q6MJxODf009577 (version=TLSv1/SSLv3 
cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 23 Jul 2012 05:59:26 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q6MJx2lq071625; Sun, 22 Jul 2012 14:59:03 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <500C5B87.6050502@missouri.edu> From: Stephen Montgomery-Smith Mail-Followup-To: freebsd-numerics@freebsd.org User-Agent: Mozilla/5.0 (X11; Linux i686; rv:14.0) Gecko/20120714 Thunderbird/14.0 MIME-Version: 1.0 To: Bruce Evans References: <20120714120432.GA70706@server.rulingia.com> <20120717084457.U3890@besplex.bde.org> <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717200931.U6624@besplex.bde.org> <5006D13D.2080702@missouri.edu> <20120718205625.GA409@troutmask.apl.washington.edu> <500725F2.7060603@missouri.edu> <20120719025345.GA1376@troutmask.apl.washington.edu> <50077987.1080307@missouri.edu> <20120719032706.GA1558@troutmask.apl.washington.edu> <5007826D.7060806@missouri.edu> <5007AD41.9070000@missouri.edu> <20120719205347.T2601@besplex.bde.org> <50084322.7020401@missouri.edu> <20120720035001.W4053@besplex.bde.org> <50085441.4090305@missouri.edu> <20120720162953.N2162@besplex.bde.org> <20120720184114.B2790@besplex.bde.org> <50095CDE.4050507@missouri.edu> <20120723044308.X6145@besplex.bde.org> In-Reply-To: <20120723044308.X6145@besplex.bde.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: Diane Bruce , John Baldwin , David Chisnall , Bruce Evans , Steve Kargl , David Schultz , Peter Jeremy , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:10:39 -0000 X-Original-Date: Sun, 22 Jul 2012 14:59:03 -0500 X-List-Received-Date: Sun, 12 Aug 2012 23:10:39 -0000

On 07/22/2012 02:29 PM, Bruce Evans wrote:
> Replying again to this...
>
> On Fri, 20 Jul 2012, Stephen Montgomery-Smith wrote:
>
>> I worked quite a bit on my clog function last night. Could you look
>> at this one?
>
> I ended up deleting most of your changes in this one.
>

Bruce, thank you for sending me your copy as an attachment. Any further changes I make will be to your version. I really wasn't looking forward to rewriting the code to get it to fit the style guides.

I am also glad you deleted a lot of my additions. I was tearing my hair out trying to get it to work, first trying this thing and then trying that. This is why the last code I sent was so complicated: it included every latest thing I had tried.

And this morning I suddenly realized my reference program was buggy because I didn't use enough precision! My code suddenly worked! But of course it included a lot of unnecessary junk.
Stephen From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:10:42 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 8C1021065674 for ; Sun, 12 Aug 2012 23:10:42 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id 1188A8FC0C for ; Sun, 12 Aug 2012 23:10:41 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CNAfjr075831 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:10:42 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CNAZ7W021643 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:10:35 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CNAZHw021642 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:10:35 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:10:35 +1000 Resent-Message-ID: <20120812231035.GZ20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6MKDm4g007043 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 23 Jul 2012 06:13:48 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q6MKDjlF009627 (version=TLSv1/SSLv3 
cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 23 Jul 2012 06:13:48 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q6MKDOBd072644; Sun, 22 Jul 2012 15:13:25 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <500C5EE5.4090602@missouri.edu> From: Stephen Montgomery-Smith Mail-Followup-To: freebsd-numerics@freebsd.org User-Agent: Mozilla/5.0 (X11; Linux i686; rv:14.0) Gecko/20120714 Thunderbird/14.0 MIME-Version: 1.0 To: Bruce Evans References: <20120714120432.GA70706@server.rulingia.com> <20120717084457.U3890@besplex.bde.org> <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717200931.U6624@besplex.bde.org> <5006D13D.2080702@missouri.edu> <20120718205625.GA409@troutmask.apl.washington.edu> <500725F2.7060603@missouri.edu> <20120719025345.GA1376@troutmask.apl.washington.edu> <50077987.1080307@missouri.edu> <20120719032706.GA1558@troutmask.apl.washington.edu> <5007826D.7060806@missouri.edu> <5007AD41.9070000@missouri.edu> <20120719205347.T2601@besplex.bde.org> <50084322.7020401@missouri.edu> <20120720035001.W4053@besplex.bde.org> <50085441.4090305@missouri.edu> <20120720162953.N2162@besplex.bde.org> <20120720184114.B2790@besplex.bde.org> <50095CDE.4050507@missouri.edu> <20120723044308.X6145@besplex.bde.org> In-Reply-To: <20120723044308.X6145@besplex.bde.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: Diane Bruce , John Baldwin , David Chisnall , Bruce Evans , Steve Kargl , David Schultz , Peter Jeremy , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:10:42 -0000 X-Original-Date: Sun, 22 Jul 2012 15:13:25 -0500 X-List-Received-Date: Sun, 12 Aug 2012 23:10:42 -0000 But I will say that your latest version of clog doesn't do as well as mine with this input: x = unur_sample_cont(gen); y = unur_sample_cont(gen); h = hypot(x,y); x = x/h; y = y/h; I was able to get ULPs less than 2. Your program gets ULPs more like up to 4000. I have to say that I consider a ULP of 4000 under these very extreme circumstances to be acceptable. Definitely acceptable if the code goes a whole lot faster than code that has a ULP of less than 2. On 07/22/2012 02:29 PM, Bruce Evans wrote: > Replying again to this... > > On Fri, 20 Jul 2012, Stephen Montgomery-Smith wrote: > >> I worked quite a bit on my clog function last night. Could you look >> at this one? > > I ended up deleting most of your changes in this one. > >> The trick for when hypot(x,y) is close to 1 can only work so well, and >> you are testing it out of its range of applicability. But for the >> special case x is close to 1, and y is close to zero, I have a much >> better formula. I can produce three other formulas so that I can >> handle |x| close to 1, y small, and |y| close to 1 and x small. >> >> After fixing it up, could you send it back as an attachment? That >> will make it easier for me to put it back into my system, and work >> more on it. > > It will be painful for everyone to understand and merge. This time as > an attachment as well as (partly) inline with commentary. 
> > % #include > % #include > % #include > % % #include "math_private.h" > % % static inline void > % xonorm(double *a, double *b) > % { > % double w; > % % if (fabs(*a) < fabs(*b)) { > % w = *a; > % *a = *b; > % *b = w; > % } > % STRICT_ASSIGN(double, w, *a + *b); > % *b = (*a - w) + *b; > % *a = w; > % } > % % #define xnorm(a, b) xonorm(&(a), &(b)) > % % #define xspadd(a, b, c) do { \ > % double __tmp; \ > % \ > % __tmp = (c); \ > % xonorm(&__tmp, &(a)); \ > % (b) += (a); \ > % (a) = __tmp; \ > % } while (0) > % % static inline void > % xonormf(float *a, float *b) > % { > % float w; > % % if (fabsf(*a) < fabsf(*b)) { > % w = *a; > % *a = *b; > % *b = w; > % } > % STRICT_ASSIGN(float, w, *a + *b); > % *b = (*a - w) + *b; > % *a = w; > % } > % % #define xnormf(a, b) xonormf(&(a), &(b)) > % % #define xspaddf(a, b, c) do { \ > % float __tmp; \ > % \ > % __tmp = (c); \ > % xonormf(&__tmp, &(a)); \ > % (b) += (a); \ > % (a) = __tmp; \ > % } while (0) > > Above are my standard extra-precision macros from my math_private.h, > cut down for use here and named with an x. Then expanded and pessimized > to swap the args. Optimal callers ensure that they don't need swapping. > See s_fma.c for a fuller algorithm that doesn't need swapping but does > more operations. I started to copy that, but s_fma.c doesn't seem to > have anything as convenient as xspaddf(). Further expanded and > pessimized() to do STRICT_ASSIGN(). Optimal callers use float_t and > double_t so that STRICT_ASSIGN() is unnecessary. Compiler bugs break > algorithms like the above on i386 unless float_t and double_t, or > STRICT_ASSIGN() are used. Fixing the bugs would give the same slowness > as STRICT_ASSIGN(), but globally by doing it for all assignments, so > even I now consider these bugs to be features and C standards to be > broken for not allowing them. I first discussed fixing them with gcc > maintainers over 20 years ago. 
> > % % double complex > % clog(double complex z) > % { > % double x, y, h, t1, t2, t3; > % double ax, ay, x0, y0, x1, y1; > % % x = creal(z); > % y = cimag(z); > % % /* Handle NaNs using the general formula to mix them right. */ > % if (x != x || y != y) > % return (cpack(log(hypot(x, y)), atan2(y, x))); > > I replaced all my messy ifdefs for this by the function call. Also > changes isnan() to a not-so-magic test. Though isnan() is about the > only FP classification macro that I trust the compiler for. > > % % ax = fabs(x); > % ay = fabs(y); > % if (ax < ay) { > % t1 = ax; > % ax = ay; > % ay = t1; > % } > > I got tired of repeating fabs()'s, and need to know which arg is larger. > > % % /* > % * To avoid unnecessary overflow, if x or y are very large, divide x > % * and y by M_E, and then add 1 to the logarithm. This depends on > % * M_E being larger than sqrt(2). > % * > % * XXX bugs to fix: > % * - underflow if one of x or y is tiny. e_hypot.c avoids this > % * problem, and optimizes for the case that the ratio of the > % * args is very large, by returning the absolute value of > % * the largest arg in this case. > % * - not very accurate. Could divide by 2 and add log(2) in extra > % * precision. A general scaling step that divides by 2**k and > % * adds k*log(2) in extra precision might be good for reducing > % * the range so that we don't have to worry about overflow or > % * underflow in the general steps. This needs the previous step > % * of eliminating large ratios of args so that the args can be > % * scaled on the same scale. > % * - s/are/is/ in comment. > % */ > % if (ax > 1e308) > % return (cpack(log(hypot(x / M_E, y / M_E)) + 1, atan2(y, x))); > > No need to avoid infinities here. No need to test y now that we know > ax is largest. 
> > % % if (ax == 1) { > % if (ay < 1e-150) > % return (cpack((ay * 0.5) * ay, atan2(y, x))); > % return (cpack(log1p(ay * ay) * 0.5, atan2(y, x))); > % } > > Special case mainly for when (ay * ay) is rounded down to the smallest > denormal. > > % % /* > % * Because atan2 and hypot conform to C99, this also covers all the > % * edge cases when x or y are 0 or infinite. > % */ > % if (ax < 1e-50 || ay < 1e-50 || ax > 1e50 || ay > 1e50) > % return (cpack(log(hypot(x, y)), atan2(y, x))); > > Not quite right. It is the ratio that matters more than these magic > magnitudes. > > % % /* We don't need to worry about overflow in x*x+y*y. */ > % % /* > % * Take extra care so that ULP of real part is small if h is > % * moderately close to 1. If one only cares about the relative error > % * of the whole result (real and imaginary part taken together), this > % * algorithm is overkill. > % * > % * This algorithm does a rather good job if |h-1| >= 1e-5. The only > % * algorithm that I can think of that would work for any h close to > % * one would require hypot(x,y) being computed using double double > % * precision precision (i.e. double as many bits in the mantissa as > % * double precision). > % * > % * x0 and y0 are good approximations to x and y, but have their bits > % * trimmed so that double precision floating point is capable of > % * calculating x0*x0 + y0*y0 - 1 exactly. > % */ > > Comments not all updated. This one especially out of date. > > % x0 = ax; > % SET_LOW_WORD(x0, 0); > % x1 = ax - x0; > % y0 = ay; > % SET_LOW_WORD(y0, 0); > % y1 = ay - y0; > > Sloppy decomposition with only 21 bits in hi part. Since we are short > of bits, we shouldn't burn 5 like this for efficency. In float precision, > all the multiplications are exact since 24 splits exactly. > > % /* Notice that mathematically, h = t1*(1+t3). */ > % #if 0 > > Old version. Still drops bits unnecessary, although I added several > full hi/lo decomposition steps to it. > > % t1 = x0 * x0; /* Exact. 
*/ > % t2 = y0 * y0; /* Exact. */ > > Comments not quite right. All of the multiplications are as exact as > possible. They would all be exact if we could split 53 in half, and > did so. > > % STRICT_ASSIGN(double, t3, t1 + t2); > % t2 = (t1 - t3) + t2; > % t1 = t3; /* Now t1+t2 is hi+lo for x0*x0+y0*y0.*/ > % t2 += 2 * x0 * x1; > % STRICT_ASSIGN(double, t3, t1 + t2); > % t2 = (t1 - t3) + t2; > % t1 = t3; > % t2 += 2 * y0 * y1; > % STRICT_ASSIGN(double, t3, t1 + t2); > % t2 = (t1 - t3) + t2; > % t1 = t3; > % t2 += x1 * x1 + y1 * y1; > % STRICT_ASSIGN(double, t3, t1 + t2); > % t2 = (t1 - t3) + t2; > % t1 = t3; /* Now t1+t2 is hi+lo for x*x+y*y.*/ > % #else > % t1 = x1 * x1; > % t2 = y1 * y1; > % xnorm(t1, t2); > % xspadd(t1, t2, 2 * y0 * y1); > % xspadd(t1, t2, 2 * x0 * x1); > % xspadd(t1, t2, y0 * y0); > % xspadd(t1, t2, x0 * x0); > % xnorm(t1, t2); > > It was too hard to turn the above into this without using the macros. > Now all the multiplications are as exact as possible, and extra precision > is used for all the additions (this mattered even for the first 2 terms). > Terms should be added from the smallest to the highest. This happens in > most cases and some bits are lost when it doesn't. > > % #endif > % t3 = t2 / t1; > % /* > % * |t3| ~< 2**-22 since we work with 24 extra bits of precision, so > % * log1p(t3) can be evaluated with about 13 extra bits of precision > % * using 2 terms of its power series. But there are complexities > % * to avoid underflow. > % */ > > Complexities to avoid underflow incomplete and not here yet. > > Comment otherwise anachronistic/anaspac(sp?)istic. 22 and 13 are for the > float version. The final xnorm() step (to maximize accuracy) ensures that > |t3| < 2**-24 precisely for floats (half an ulp). 2**-53 for doubles. > > % return (cpack((t3 - t3*0.5*t3 + log(t1)) * 0.5, atan2(y, x))); > > The second term for log1p(t3) is probably nonsense. We lose by > inaccuracies in t3 itself. 
> > % } > % % float complex > % clogf(float complex z) > % { > > This is a routine translation. It duplicates too many comments. Only > this has been tested much with the latest accuracty fixes. > > % float x, y, h, t1, t2, t3; > % float ax, ay, x0, y0, x1, y1; > % uint32_t hx, hy; > % % x = crealf(z); > % y = cimagf(z); > % % /* Handle NaNs using the general formula to mix them right. */ > % if (x != x || y != y) > % return (cpack(log(hypot(x, y)), atan2(y, x))); > > Oops, copied too much -- forgot to add f's. > > Bruce From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:10:47 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 96DB8106564A for ; Sun, 12 Aug 2012 23:10:47 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id 1BA898FC0A for ; Sun, 12 Aug 2012 23:10:46 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CNAkq5075836 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:10:47 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CNAesd021655 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:10:40 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CNAeuw021654 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:10:40 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:10:40 +1000 Resent-Message-ID: 
<20120812231040.GB20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6N4qgxI011451 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 23 Jul 2012 14:52:43 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q6N4qdZB010952 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 23 Jul 2012 14:52:41 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q6N4qCsj008623; Sun, 22 Jul 2012 23:52:12 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <500CD87D.9060804@missouri.edu> From: Stephen Montgomery-Smith Mail-Followup-To: freebsd-numerics@freebsd.org User-Agent: Mozilla/5.0 (X11; Linux i686; rv:14.0) Gecko/20120714 Thunderbird/14.0 MIME-Version: 1.0 To: Bruce Evans References: <20120714120432.GA70706@server.rulingia.com> <20120717084457.U3890@besplex.bde.org> <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717200931.U6624@besplex.bde.org> <5006D13D.2080702@missouri.edu> <20120718205625.GA409@troutmask.apl.washington.edu> <500725F2.7060603@missouri.edu> <20120719025345.GA1376@troutmask.apl.washington.edu> <50077987.1080307@missouri.edu> <20120719032706.GA1558@troutmask.apl.washington.edu> <5007826D.7060806@missouri.edu> <5007AD41.9070000@missouri.edu> <20120719205347.T2601@besplex.bde.org> <50084322.7020401@missouri.edu> <20120720035001.W4053@besplex.bde.org> <50085441.4090305@missouri.edu> <20120720162953.N2162@besplex.bde.org> <20120720184114.B2790@besplex.bde.org> <50095CDE.4050507@missouri.edu> <20120723044308.X6145@besplex.bde.org> 
<500C5EE5.4090602@missouri.edu> <20120723131233.U1189@besplex.bde.org> In-Reply-To: <20120723131233.U1189@besplex.bde.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: Diane Bruce , John Baldwin , David Chisnall , Bruce Evans , Steve Kargl , David Schultz , Peter Jeremy , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:10:47 -0000 X-Original-Date: Sun, 22 Jul 2012 23:52:13 -0500 X-List-Received-Date: Sun, 12 Aug 2012 23:10:47 -0000 On 07/22/2012 11:12 PM, Bruce Evans wrote: > On Sun, 22 Jul 2012, Stephen Montgomery-Smith wrote: > >> But I will say that your latest version of clog doesn't do as well as >> mine with this input: >> >> x = unur_sample_cont(gen); >> y = unur_sample_cont(gen); >> h = hypot(x,y); >> x = x/h; >> y = y/h; >> >> I was able to get ULPs less than 2. Your program gets ULPs more like >> up to 4000. > > I may have broken the double version when working mostly on the float > version recently. > > What are the actual x and y? I'm not set up to use mpfr. The code segment I showed didn't use mpfr. It used unuran. Basically I am generating random numbers uniformly distributed on the circle |z|=1. You could also do it using x = cos(t) y = sin(t) where t is a random real number in the interval [0,2 pi]. > Since the float version gets errors of 4096 ulps (12 bits wrong), the > double version is sure to get errors of [much more than] 12 + (53-24) > = 41 bits wrong. That is 2 tera ulps. Not noticing such enormous > errors indicates that the problematic cases haven't been tested. 
I > think you are right that it needs more like tripled double precision > -- with merely doubled double precision, it can probably get all 53 > mantissa bits and the sign bit wrong too (sign of (|z|^2 - 1)). That > is total loss of precision (TLOSS), and should be handled by returning > NaN. Sign errors are especially interesting with complex functions > and even for real log() applied to a real function, since they may > change the branch. I got TLOSS including sign errors in the loghypotf() > result in intermediate version due to bugs in the doubling of float > precision. Before the attempted doubling, TLOSS might have been the > usual case for z near 1! I will have another go at this code, maybe tomorrow, maybe later. I have been putting a lot of work into casinh, casin, cacosh and cacos, getting the branches correct. That has exhausted me. >> I have to say that I consider a ULP of 4000 under these very extreme >> circumstances to be acceptable. Definitely acceptable if the code >> goes a whole lot faster than code that has a ULP of less than 2. > > "An ULP of 4000" is unusual terminology. An ulp is a unit, not a count. I am so different than you guys. I have no problem with lack of consistency in notation, no problem with language being abused, no problem with people using different programming styles - certainly no problems with ULP's on the large side (oops, errors with large numbers of ULP's) , and ... > I haven't figured out how to cut down the amount of mail generated by > this thread. Sorry to add to it :-). ... no problems deleting emails that I don't want (but surely with all the email volume we have, this does suggest we need a freebsd-numerics mailing list, doesn't it?), and ... > >> On 07/22/2012 02:29 PM, Bruce Evans wrote: >>> Replying again to this... > > Top posting is one way :-). ... most definitely no problem with top posting! I try not to top post on the FreeBSD mailing list because I know so many people there are bothered by it. 
But most of the people I communicate with (professors in academic institutions, friends and relatives) top post all the time. But I should have remembered not to top post with you guys. I understand that you guys and I think differently. I don't think that I am right and you guys are wrong. But conversely, I hope that you don't think I am wrong and you are right! And I hope that you can appreciate the qualities I bring to the discussion, just as I appreciate the qualities that you bring. Stephen From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:11:00 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id A65A21065676 for ; Sun, 12 Aug 2012 23:11:00 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id 378FF8FC0C for ; Sun, 12 Aug 2012 23:11:00 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CNB0gJ075847 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:11:00 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CNArTZ021678 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:10:53 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CNArqu021677 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:10:53 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:10:53 +1000 Resent-Message-ID: <20120812231053.GD20453@server.rulingia.com> Resent-To: 
freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6PNs2ql072401 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Thu, 26 Jul 2012 09:54:02 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q6PNrxsA033550 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Thu, 26 Jul 2012 09:54:01 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q6PNrWfF069691; Wed, 25 Jul 2012 18:53:34 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <501086FC.8030902@missouri.edu> From: Stephen Montgomery-Smith Mail-Followup-To: freebsd-numerics@freebsd.org User-Agent: Mozilla/5.0 (X11; Linux i686; rv:14.0) Gecko/20120714 Thunderbird/14.0 MIME-Version: 1.0 To: Bruce Evans References: <20120714120432.GA70706@server.rulingia.com> <20120717084457.U3890@besplex.bde.org> <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717200931.U6624@besplex.bde.org> <5006D13D.2080702@missouri.edu> <20120718205625.GA409@troutmask.apl.washington.edu> <500725F2.7060603@missouri.edu> <20120719025345.GA1376@troutmask.apl.washington.edu> <50077987.1080307@missouri.edu> <20120719032706.GA1558@troutmask.apl.washington.edu> <5007826D.7060806@missouri.edu> <5007AD41.9070000@missouri.edu> <20120719205347.T2601@besplex.bde.org> <50084322.7020401@missouri.edu> <20120720035001.W4053@besplex.bde.org> <50085441.4090305@missouri.edu> <20120720162953.N2162@besplex.bde.org> <20120720184114.B2790@besplex.bde.org> <50095CDE.4050507@missouri.edu> <20120723044308.X6145@besplex.bde.org> In-Reply-To: <20120723044308.X6145@besplex.bde.org> Content-Type: multipart/mixed; 
boundary="------------040903050101000604080209" X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: Diane Bruce , John Baldwin , David Chisnall , Bruce Evans , Steve Kargl , David Schultz , Peter Jeremy , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:11:00 -0000 X-Original-Date: Wed, 25 Jul 2012 18:53:32 -0500 X-List-Received-Date: Sun, 12 Aug 2012 23:11:00 -0000 This is a multi-part message in MIME format. --------------040903050101000604080209 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit This function seems to be able to compute clog with a worst case relative error of 4 or 5 ULP. For clogf I simply used double precision. That seems to get errors as high as 1000 ULP. 
It might be simpler just to have clogf use clog: float complex clogf(float complex z) { return (clog(z)); } --------------040903050101000604080209-- From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:11:04 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 1DDBC106564A for ; Sun, 12 Aug 2012 23:11:04 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id 7D37F8FC14 for ; Sun, 12 Aug 2012 23:11:03 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CNB3nK075850 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:11:03 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CNAvAo021684 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:10:57 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CNAv6H021683 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:10:57 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:10:56 +1000 Resent-Message-ID: <20120812231056.GE20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6KEssvS031624 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Sat, 21 Jul 2012 00:54:55 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from 
wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q6KEsqnW086889 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Sat, 21 Jul 2012 00:54:54 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q6KEsWie037880; Fri, 20 Jul 2012 09:54:32 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <50097128.6030405@missouri.edu> From: Stephen Montgomery-Smith Mail-Followup-To: freebsd-numerics@freebsd.org User-Agent: Mozilla/5.0 (X11; Linux i686; rv:14.0) Gecko/20120714 Thunderbird/14.0 MIME-Version: 1.0 To: Bruce Evans References: <20120714120432.GA70706@server.rulingia.com> <20120717084457.U3890@besplex.bde.org> <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717200931.U6624@besplex.bde.org> <5006D13D.2080702@missouri.edu> <20120718205625.GA409@troutmask.apl.washington.edu> <500725F2.7060603@missouri.edu> <20120719025345.GA1376@troutmask.apl.washington.edu> <50077987.1080307@missouri.edu> <20120719032706.GA1558@troutmask.apl.washington.edu> <5007826D.7060806@missouri.edu> <5007AD41.9070000@missouri.edu> <20120719205347.T2601@besplex.bde.org> <50084322.7020401@missouri.edu> <20120720035001.W4053@besplex.bde.org> <50085441.4090305@missouri.edu> <20120720162953.N2162@besplex.bde.org> <20120720184114.B2790@besplex.bde.org> In-Reply-To: <20120720184114.B2790@besplex.bde.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: Diane Bruce , John Baldwin , David Chisnall , Bruce Evans , Steve Kargl , David Schultz , Peter Jeremy , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:11:04 -0000 X-Original-Date: Fri, 20 Jul 2012 09:54:32 -0500 X-List-Received-Date: Sun, 12 Aug 2012 23:11:04 -0000 On 07/20/2012 04:19 AM, Bruce Evans wrote: > On Fri, 20 Jul 2012, Bruce Evans wrote: > The only way for x*x + y*y to be _very_ near 1 in infinite precision > is for |x| or |y| to be 1 (I think). Other cases are bounded away from > 1, and if you are lucky the bound is fairly far from 1 so that sloppier > approximations work OK. Mathematicians should determine the bound > exactly using continued fractions or something like they do for > approximations to N*Pi/2. This becomes especially interesting in high > precisions where you can't hope to get near the worst case by random > testing. > > % x0 = ldexp(floor(ldexp(x, 24)), -24); > % x1 = x - x0; > % y0 = ldexp(floor(ldexp(y, 24)), -24); > % y1 = y - y0; > > This has a chance of working iff the bound away from 1 is something like > 2**-24. Otherwise, multiplying by 2**24 and flooring a positive value > will just produce 0. 2**-24 seems much too small a bound. My test > coverage is not wide enough to hit many bad cases. This is meant to cover a situation where x = cos(t) and y = sin(t) for some t not a multiple of PI/2. Now, hypot(x,y) will be 1, but only to within machine precision, i.e. an error of about 1e-17. So log(hypot(x,y)) will be about 1e-17. The true answer being 0, the ULP will be infinite. BUT (and this goes with Goldberg's paper when he considers the quadratic formula when the quadratic equation has nearly equal roots), suppose x = (double)(cos(t)) that is, x is not exactly cos(t), but it is the number "cos(t) written in IEEE double precision". Similarly for y. That is, even though the formulas that produce x and y aren't exact, let's pretend that x and y are exact. Again log(hypot(x,y)) will be about 1e-17. But the true answer will also be about 1e-17. 
But they won't be the same, and the ULP will be about 1e17. What my formula does is deal with the second case, but reduce the ULP to about 1e8! That is, if x and y are exact numbers, and it so happens that hypot(x,y) is very close to 1, my method will get you about 8 extra digits of accuracy. Now you have special formulas that handle the cases when z is close to 1, -1, I and -I. Earlier this morning I sent you a formula, which I think might be slightly more accurate than yours, for when z is close to 1. I think similar formulas can be produced for when z is close to -1, I and -I. To get ULP of about 1 when x and y are exact, and it happens that hypot(x,y) is close to 1, but z is not close to 1, -1, I or -I, would require, I think, hypot(x,y)-1 being computed using double double precision (i.e. a mantissa of 108 bits), and then feeding this into log1p. Of course, you may argue that situations when x and y are exact, not close to 1 or 0, and hypot(x,y) is close to 1, are so very rare that extra consideration is not required. But my algorithm produces better answers than the naive formula even when the distance between 1 and hypot is about 1/10. The naive formula has a ULP of about 10, and I get it down to less than 2. And when the distance between hypot(x,y) and 1 is about 1e-5, the naive formula has a ULP of about 1e5, and I still manage to get a ULP of less than about 2. 
Stephen From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:11:15 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 2D792106566B for ; Sun, 12 Aug 2012 23:11:15 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id B7A658FC08 for ; Sun, 12 Aug 2012 23:11:14 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CNBEib075874 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:11:14 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CNB8RB021711 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:11:08 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CNB80s021710 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:11:08 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:11:08 +1000 Resent-Message-ID: <20120812231108.GG20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6KKKJof060244 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Sat, 21 Jul 2012 06:20:19 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q6KKKG1I094318 
(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Sat, 21 Jul 2012 06:20:18 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q6KKJuKK058690; Fri, 20 Jul 2012 15:19:56 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <5009BD6C.9050301@missouri.edu> From: Stephen Montgomery-Smith Mail-Followup-To: freebsd-numerics@freebsd.org User-Agent: Mozilla/5.0 (X11; Linux i686; rv:14.0) Gecko/20120714 Thunderbird/14.0 MIME-Version: 1.0 To: Bruce Evans References: <20120714120432.GA70706@server.rulingia.com> <20120717084457.U3890@besplex.bde.org> <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717200931.U6624@besplex.bde.org> <5006D13D.2080702@missouri.edu> <20120718205625.GA409@troutmask.apl.washington.edu> <500725F2.7060603@missouri.edu> <20120719025345.GA1376@troutmask.apl.washington.edu> <50077987.1080307@missouri.edu> <20120719032706.GA1558@troutmask.apl.washington.edu> <5007826D.7060806@missouri.edu> <5007AD41.9070000@missouri.edu> <20120719205347.T2601@besplex.bde.org> <50084322.7020401@missouri.edu> <20120720035001.W4053@besplex.bde.org> <50085441.4090305@missouri.edu> <20120720162953.N2162@besplex.bde.org> <20120720184114.B2790@besplex.bde.org> <50097128.6030405@missouri.edu> <20120721032448.X5744@besplex.bde.org> In-Reply-To: <20120721032448.X5744@besplex.bde.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: Diane Bruce , John Baldwin , David Chisnall , Bruce Evans , Steve Kargl , David Schultz , Peter Jeremy , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:11:15 -0000 X-Original-Date: Fri, 20 Jul 2012 15:19:56 -0500 X-List-Received-Date: Sun, 12 Aug 2012 23:11:15 -0000 Bruce, with both of us working at the same time on clog, it is getting hard for me to follow. The version I sent this morning is the last change I made. How about if you become the owner of the code for a while. When you are finished, send it back to me, and I will look over everything you have done. I won't work on it until then. This works for me in other ways too, because my life is very busy at the moment. If I do work on code, it will be on casinh/casin. I am looking over the paper by Hull et al, and I am learning a lot from it. One thing I did realize: casin and casinh are essentially the same. If #define reverse(z) cpack(cimag(z),creal(z)) then casin(z) = reverse(casinh(reverse(z))). Unfortunately it is not so nice for cacos/cacosh. I do like the progress we are making, and I appreciate your help very much.
From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:11:23 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id C8647106566B for ; Sun, 12 Aug 2012 23:11:23 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id 5D29D8FC19 for ; Sun, 12 Aug 2012 23:11:23 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CNBNvX075880 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:11:23 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CNBGTD021727 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:11:17 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CNBGXU021726 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:11:16 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:11:16 +1000 Resent-Message-ID: <20120812231116.GI20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6L3hwTI063546 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Sat, 21 Jul 2012 13:43:58 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q6L3htmF095447 (version=TLSv1/SSLv3 
cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Sat, 21 Jul 2012 13:43:57 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q6L3hX6d014258; Fri, 20 Jul 2012 22:43:34 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <500A2565.9090009@missouri.edu> From: Stephen Montgomery-Smith Mail-Followup-To: freebsd-numerics@freebsd.org User-Agent: Mozilla/5.0 (X11; Linux i686; rv:14.0) Gecko/20120714 Thunderbird/14.0 MIME-Version: 1.0 To: Bruce Evans References: <20120714120432.GA70706@server.rulingia.com> <20120717084457.U3890@besplex.bde.org> <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717200931.U6624@besplex.bde.org> <5006D13D.2080702@missouri.edu> <20120718205625.GA409@troutmask.apl.washington.edu> <500725F2.7060603@missouri.edu> <20120719025345.GA1376@troutmask.apl.washington.edu> <50077987.1080307@missouri.edu> <20120719032706.GA1558@troutmask.apl.washington.edu> <5007826D.7060806@missouri.edu> <5007AD41.9070000@missouri.edu> <20120719205347.T2601@besplex.bde.org> <50084322.7020401@missouri.edu> <20120720035001.W4053@besplex.bde.org> <50085441.4090305@missouri.edu> <20120720162953.N2162@besplex.bde.org> <20120720184114.B2790@besplex.bde.org> <50097128.6030405@missouri.edu> <20120721032448.X5744@besplex.bde.org> <5009BD6C.9050301@missouri.edu> <20120721123522.T877@besplex.bde.org> In-Reply-To: <20120721123522.T877@besplex.bde.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: Diane Bruce , John Baldwin , David Chisnall , Bruce Evans , Steve Kargl , David Schultz , Peter Jeremy , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:11:24 -0000 X-Original-Date: Fri, 20 Jul 2012 22:43:33 -0500 X-List-Received-Date: Sun, 12 Aug 2012 23:11:24 -0000 On 07/20/2012 10:34 PM, Bruce Evans wrote: > On Fri, 20 Jul 2012, Stephen Montgomery-Smith wrote: > >> Bruce, with both of us working at the same time on clog, it is getting >> hard for me to follow. The version I sent this morning is the last >> change I made. >> >> How about if you become the owner of the code for a while. When you are >> finished, send it back to me, and I will look over everything you have >> done. I won't work on it until then. This works for me in other ways >> too, because my life is very busy at the moment. > > I'd prefer you (or Someone Else) to keep working on it. I just plugged it > into my test framework and started zapping errors... (I need to make my > test framework easier to set up so that I don't have any investment in > not seeing the errors.) > Do you have a piece of code after you made the changes? Or did you only record the changes in the emails you sent to me? I was hoping that you could send me a file as an attachment, with all your suggested changes. But if I have to go through all the emails you sent in the last few days, I guess I'll have to do that.
From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:11:27 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 2FE5B106566B for ; Sun, 12 Aug 2012 23:11:27 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id B95858FC0C for ; Sun, 12 Aug 2012 23:11:26 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CNBQ11075883 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:11:26 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CNBK9V021737 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:11:20 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CNBKDL021736 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:11:20 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:11:20 +1000 Resent-Message-ID: <20120812231120.GJ20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6L5PL7e064390 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Sat, 21 Jul 2012 15:25:22 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q6L5PI0A095666 (version=TLSv1/SSLv3 
cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Sat, 21 Jul 2012 15:25:20 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q6L5OtXL020678; Sat, 21 Jul 2012 00:24:55 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <500A3D27.9070403@missouri.edu> From: Stephen Montgomery-Smith Mail-Followup-To: freebsd-numerics@freebsd.org User-Agent: Mozilla/5.0 (X11; Linux i686; rv:14.0) Gecko/20120714 Thunderbird/14.0 MIME-Version: 1.0 To: Bruce Evans References: <20120714120432.GA70706@server.rulingia.com> <20120717084457.U3890@besplex.bde.org> <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717200931.U6624@besplex.bde.org> <5006D13D.2080702@missouri.edu> <20120718205625.GA409@troutmask.apl.washington.edu> <500725F2.7060603@missouri.edu> <20120719025345.GA1376@troutmask.apl.washington.edu> <50077987.1080307@missouri.edu> <20120719032706.GA1558@troutmask.apl.washington.edu> <5007826D.7060806@missouri.edu> <5007AD41.9070000@missouri.edu> <20120719205347.T2601@besplex.bde.org> <50084322.7020401@missouri.edu> <20120720035001.W4053@besplex.bde.org> <50085441.4090305@missouri.edu> <20120720162953.N2162@besplex.bde.org> <20120720184114.B2790@besplex.bde.org> <50097128.6030405@missouri.edu> <20120721032448.X5744@besplex.bde.org> <5009BD6C.9050301@missouri.edu> <20120721123522.T877@besplex.bde.org> <500A2565.9090009@missouri.edu> In-Reply-To: <500A2565.9090009@missouri.edu> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: Diane Bruce , John Baldwin , David Chisnall , Bruce Evans , Steve Kargl , David Schultz , Peter Jeremy , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:11:27 -0000 X-Original-Date: Sat, 21 Jul 2012 00:24:55 -0500 X-List-Received-Date: Sun, 12 Aug 2012 23:11:27 -0000 On 07/20/2012 10:43 PM, Stephen Montgomery-Smith wrote: > On 07/20/2012 10:34 PM, Bruce Evans wrote: >> On Fri, 20 Jul 2012, Stephen Montgomery-Smith wrote: >> >>> Bruce, with both of us working at the same time on clog, it is getting >>> hard for me to follow. The version I sent this morning is the last >>> change I made. >>> >>> How about if you become the owner of the code for a while. When you are >>> finished, send it back to me, and I will look over everything you have >>> done. I won't work on it until then. This works for me in other ways >>> too, because my life is very busy at the moment. >> >> I'd prefer you (or Someone Else) to keep working on it. I just plugged it >> into my test framework and started zapping errors... (I need to make my >> test framework easier to set up so that I don't have any investment in >> not seeing the errors.) >> > > Do you have a piece of code after you made the changes? Or did you only > record the changes in the emails you sent to me? I was hoping that you > could send me a file as an attachment, with all your suggested changes. > But if I have to go through all the emails you sent in the last few > days, I guess I'll have to do that. > OK Bruce, I started to go to sleep, and my mind started racing. Now I am beginning to understand what you were trying to tell me all day today. I can see that I made quite a lot of mistakes, and you were correcting me. And also, I think that I now see that you were trying to tell me that I could emulate genuine double double precision math with the hi-lo thing. The ideas were coming too thick and fast. I also had a lot of other stuff going on today. But even if I hadn't, I think your ideas would have been coming too fast and thick anyway.
I'll go through the emails you sent me today more carefully in the next week or so. Thanks. And sorry I was not reading your stuff properly today. Stephen From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:11:33 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id EF743106566B for ; Sun, 12 Aug 2012 23:11:32 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id 0EF438FC0A for ; Sun, 12 Aug 2012 23:11:31 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CNBVAD075887 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:11:32 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CNBPa4021751 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:11:25 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CNBPLT021750 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:11:25 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:11:25 +1000 Resent-Message-ID: <20120812231125.GL20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6M1bp7V055103 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Sun, 22 Jul 2012 11:37:51 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from 
wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q6M1bkNJ004462 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Sun, 22 Jul 2012 11:37:48 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q6M1bHfY099640; Sat, 21 Jul 2012 20:37:17 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <500B594D.1020305@missouri.edu> From: Stephen Montgomery-Smith Mail-Followup-To: freebsd-numerics@freebsd.org User-Agent: Mozilla/5.0 (X11; Linux i686; rv:14.0) Gecko/20120714 Thunderbird/14.0 MIME-Version: 1.0 To: Bruce Evans References: <20120714120432.GA70706@server.rulingia.com> <20120717084457.U3890@besplex.bde.org> <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717200931.U6624@besplex.bde.org> <5006D13D.2080702@missouri.edu> <20120718205625.GA409@troutmask.apl.washington.edu> <500725F2.7060603@missouri.edu> <20120719025345.GA1376@troutmask.apl.washington.edu> <50077987.1080307@missouri.edu> <20120719032706.GA1558@troutmask.apl.washington.edu> <5007826D.7060806@missouri.edu> <5007AD41.9070000@missouri.edu> <20120719205347.T2601@besplex.bde.org> <50084322.7020401@missouri.edu> <20120720035001.W4053@besplex.bde.org> <50085441.4090305@missouri.edu> <20120720162953.N2162@besplex.bde.org> <20120720184114.B2790@besplex.bde.org> <50097128.6030405@missouri.edu> <20120721032448.X5744@besplex.bde.org> <5009BD6C.9050301@missouri.edu> <20120721123522.T877@besplex.bde.org> <500A2565.9090009@missouri.edu> <20120721181204.A1702@besplex.bde.org> In-Reply-To: <20120721181204.A1702@besplex.bde.org> Content-Type: multipart/mixed; boundary="------------070704010607090802070100" X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: Diane Bruce , John Baldwin , David Chisnall , Bruce Evans , Steve Kargl , David Schultz , Peter Jeremy , 
Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:11:33 -0000 X-Original-Date: Sat, 21 Jul 2012 20:37:17 -0500 X-List-Received-Date: Sun, 12 Aug 2012 23:11:33 -0000 This is a multi-part message in MIME format. --------------070704010607090802070100 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Three things. 1. I think I now understand the casinh algorithm by Hull et al rather well. I implemented it in the attached code. Bruce, could you try it out against pari? It did compare rather well to Mathematica's ArcSinh function. I tested it mostly near the branch cut ends at I and -I. (The branch cut's behavior is somewhat similar to that of sqrt.) After this, I think getting casin, cacos and cacosh should be relatively straightforward. The hardest thing about cacos and cacosh will be figuring out where all the branch cuts and discontinuities are. I also worked quite hard to avoid underflows and overflows from happening. But I need to go through it again several times to check that the logic really works. 2. I have thought more about the problem of computing clog(z) when |z| is close to 1. I now think it might even require precision that is 3 times better than double precision. It is possible that you could, by chance, pick x and y so that |z| = 1 + 1e30. (I wrote a test program, and this did actually happen a few times.) So when you compute x^2+y^2-1, if you do it using double double precision, you will get 2e30, but only the most significant digits will be accurate. The ULP will be about 1e15. You need triple double precision to get ULP's close to 1. And I cannot even figure out how to do log(1+z) when z is close to 0! 
The trouble is, (1+x)^2+y^2-1 = 2x+x^2+y^2, and if x is negative and approximately -y^2, you are in the situation of subtracting nearly equal numbers! (Obviously if z is very close to 0, I could use Taylor's series.) 3. I haven't learned proper style yet. (Is it what you call KNF?) I have always had a distrust of consistency, especially when it comes to people's coding styles. Sorry, but this is going to be a bit painful for me. Stephen --------------070704010607090802070100-- From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:11:43 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 8D115106566B for ; Sun, 12 Aug 2012 23:11:43 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id DF4B88FC14 for ; Sun, 12 Aug 2012 23:11:42 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CNBgug075897 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:11:42 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CNBaMX021764 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:11:36 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CNBaNU021763 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:11:36 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:11:36 +1000 Resent-Message-ID: <20120812231136.GN20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org 
Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6MFObAe056537 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 23 Jul 2012 01:24:37 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q6MFOYkU006698 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 23 Jul 2012 01:24:36 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q6MFO9Fc054062; Sun, 22 Jul 2012 10:24:11 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <500C1B1A.5070107@missouri.edu> From: Stephen Montgomery-Smith Mail-Followup-To: freebsd-numerics@freebsd.org User-Agent: Mozilla/5.0 (X11; Linux i686; rv:14.0) Gecko/20120714 Thunderbird/14.0 MIME-Version: 1.0 To: Bruce Evans References: <20120714120432.GA70706@server.rulingia.com> <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717200931.U6624@besplex.bde.org> <5006D13D.2080702@missouri.edu> <20120718205625.GA409@troutmask.apl.washington.edu> <500725F2.7060603@missouri.edu> <20120719025345.GA1376@troutmask.apl.washington.edu> <50077987.1080307@missouri.edu> <20120719032706.GA1558@troutmask.apl.washington.edu> <5007826D.7060806@missouri.edu> <5007AD41.9070000@missouri.edu> <20120719205347.T2601@besplex.bde.org> <50084322.7020401@missouri.edu> <20120720035001.W4053@besplex.bde.org> <50085441.4090305@missouri.edu> <20120720162953.N2162@besplex.bde.org> <20120720184114.B2790@besplex.bde.org> <50097128.6030405@missouri.edu> <20120721032448.X5744@besplex.bde.org> <5009BD6C.9050301@missouri.edu> <20120721123522.T877@besplex.bde.org> <500A2565.9090009@missouri.edu> <20120721181204.A1702@besplex.bde.org> 
<500B594D.1020305@missouri.edu> <20120722125300.P2246@besplex.bde.org> In-Reply-To: <20120722125300.P2246@besplex.bde.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: Diane Bruce , John Baldwin , David Chisnall , Bruce Evans , Steve Kargl , David Schultz , Peter Jeremy , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:11:43 -0000 X-Original-Date: Sun, 22 Jul 2012 10:24:10 -0500 X-List-Received-Date: Sun, 12 Aug 2012 23:11:43 -0000 I will study your comments on KNF later, and try rewriting the code appropriately. In the meantime, could you send me a copy of your setup for non-manually comparing C answers with pari answers? Speed is really not an issue. My comparisons with Mathematica were definitely manual. The main issue as I see it is this. How do different programs communicate double precision numbers to each other without losing precision? I don't want one program to convert to base 10, and then have the other program read it and convert back to its internal floating point. I can see from the C end how to do it (using EXTRACT_WORDS etc). Can pari do this as well? mpfr is OK, but terribly kludgy. Having it compute clog is easy. But having it compute casinh is going to be very painful. If I cannot use pari, I will probably use C++ code based around a floating point package called cln. It is used in GiNaC, which is a rather cool way of embedding symbolic expressions inside C++ code. They are both available in the ports. I see on the internet that there is a C++ overlay of mpfr as well. But it doesn't seem to be in the FreeBSD ports. I am not in the mood to create a new port just now.
The C++ overlay of gmp does exist, but gmp doesn't even have log and atan2. Incidentally, I just now realized that my test code for clog needs more than 100 bits. I'll try to fix that this afternoon. Maybe that is why I am seeing examples of 1e-30 in the real part of clog. On 07/22/2012 01:19 AM, Bruce Evans wrote: > On Sat, 21 Jul 2012, Stephen Montgomery-Smith wrote: > >> 1. I think I now understand the casinh algorithm by Hull et al rather >> well. I implemented it in the attached code. Bruce, could you try it >> out against pari? It did compare rather well to Mathematica's ArcSinh >> function. I tested it mostly near the branch cut ends at I and -I. >> (The branch cut's behavior is somewhat similar to that of sqrt.) > > I'm actually using my test framework on the real part, and then pari > to manually check differences found by this. The test framework checks > a few billion cases (not manually :-) in a minute or 2. pari is much > slower. I don't use mfpr (spelling?) yet. I think it would be slow > too, but not as slow as pari. However, my test framework doesn't > really handle long doubles, so I use pari (not manually) to check a > few million cases (in the same time as a few billion for the C version) > for them. > >> 2. I have thought more about the problem of computing clog(z) when >> |z| is close to 1. I now think it might even require precision that >> is 3 times better than double precision. >> It is possible that you could, by chance, pick x and y so that |z| = 1 >> + 1e30. (I wrote a test program, and this did > e-30 here and elsewhere >> actually happen a few times.) So when you compute x^2+y^2-1, if you >> do it using double double precision, you will get 2e30, but only the >> most significant digits will be accurate. The ULP will be about >> 1e15. You need triple double precision to get ULP's close to 1. > > Hmm, I'm not seeing such cases except when |x| or |y| is 1 and the other > is tiny. I'm only using 21 bits of extra precision now. 
I think that > has been tested for |z| much closer to 1 than 2^-21, so it is avoiding > more than 21 bits of cancelation, but nothing like the ~100 bits for > 1e-30 away from 1. I hope it doesn't get close. More testing and > analysis is required. > > It is certainly amazingly difficult to subtract 1 without losing accuracy. > >> And I cannot even figure out how to do log(1+z) when z is close to 0! >> The trouble is, (1+x)^2+y^2-1 = 2x+x^2+y^2, and if x is negative and >> approximately -y^2, you are in the situation of subtracting nearly >> equal numbers! (Obviously if z is very close to 0, I could use >> Taylor's series.) > > This is a bit easier, and the hardest subcase isn't required for clog(): > - the case of z very close to 0 is not actually obvious. > - (A) For z = x + I*y with x = 0 and y tiny, log(1+z) ~= (y/2)*y. > (Not y*y/2 since that loses an ulp by spuriously underflowing when > y is slightly larger than sqrt(the smallest strictly positive > representable value.) > - (B) We get the case where 2x almost cancels with y^2 when x is tiny > and > negative and y ~= sqrt(|x|). x has to be fairly tiny to cause > problems. > This case is unreachable for clog() because we start with x near 1 > and subtract 1 from it. This always gives a non-tiny x (|x| >= > DBL_MIN/2). This is a sub-case of (C). > - (C) when z is merely close to 0, the cancelations are not very large. > > I was starting to consider the following case for clog(): |x| much larger > (relatively) than |y|. |x| may or may not be near 1. Then hypot() just > returns |x|. For clog(), |x| needs to dominate |y| by much more for > |y| to be ignored, especially when |x| = 1. When |x| dominates |y| for > hypot() but not for clog(), we are in the solved case (A) or the non- > canceling case (C). > > The remaining problematic case for clog() is when neither x nor y is small, > and x^2+y^2 is nearly 1 (say both near 1/sqrt(2)). It seems to be > necessary > to square them using doubled double precision. 
Then subtract 1 from x^2 > (no more precision required). Then subtract y^2 (no more precision > required, and the result needs only double precision plus a couple of > guard bits). The squaring is currently only done in ~sesqui double > precision, which only handles most cases. 24 extra bits is ~29 fewer > than needed, but it handles all problematic cases except ~1 in every > 2^24 (assuming uniform distribution in double space), and the problematic > cases are already sparse. My longest test was of 2^32 cases (takes half > an hour). This hits some problematic cases, but only for the Apple > implementation because it uses _no_ extra bits except when real(z) == 1). > Its maximum error detected increased from 1 ulp to 16 ulps when the number > of test cases was increased by a factor of 16. > >> 3. I haven't learned proper style yet. (Is it what you call KNF?) I >> have always had a distrust of consistency, especially when it comes to >> people's coding styles. Sorry, but this is going to be a bit painful >> for me. > > The main non-KNFisms are comment indentation, comment filling, and spaces > around binary operators. indent [-npro] does a good job of fixing the > first 2 and a fairly good job with the spaces, but makes some messes in > the code. Mathematicians like to omit spaces (and multiplication operators > ...) and use single letters for identifiers, but programmers learned that > this gives hard-to-maintain code. > > % ... > % /* > % * gcc doesn't implement complex multiplication or division correctly, > % * so we need to handle infinities specially. We turn on this pragma to > % * notify conforming c99 compilers that the fast-but-incorrect code that > % * gcc generates is acceptable, since the special cases have already been > % * handled. > % */ > % #pragma STDC CX_LIMITED_RANGE ON > > This comment (and code) was copied from s_csqrt.c, and is filled normally > (except for the single-space sentence break). 
> > I think this doesn't actually apply here or in s_clog.c, since no complex > operations are done directly. > > % ... > % /* > % The algorithm is very close to that in > % "Implementing the complex arcsine and arccosine functions using exception > % handling" by T. E. Hull, Thomas F. Fairgrieve, and Ping Tak Peter Tang, > % ... > % */ > > This and most block comments aren't filled normally (with a column of > stars at the left). > > When indent(1) fixes these, it mangles the formatting of displayed > formulas and similar things. Comments with manual formatting should be > marked up by starting them with "/*-" to prevent indent mangling them. > s_clog.c doesn't have any comments with displayed formulas, and indent > did a good job of reformatting their paragraphs. I normally tell indent > not to reformat any block comments. > > % ... > % static double f(double x, double y, int *underflow) { > > The left brace at the start of a function should be on a new line. > > % if (x==0) { > % *underflow = 0; > % if (y > 0) > % return 0; > % return -y; > % } > > This has spaces around some of the binary operators but not all. > > Return expressions are parenthesized in KNF. indent doesn't fix this. > > % ... > % double complex > % casinh(double complex z) > % { > > Normal now. > > % ... > % if (cabs(z) > 1e20) { > % if (huge+x>one) { /* set inexact flag. */ > % if (sx == 0) return clog(2*z); > % if (sx == 1) return -clog(-2*z); > % } > % } > > Use a new line for all statements. Especially return statements. For > among other reasons, so that it is as easy as possible to manage > statements in line-oriented debuggers. > > % ... > % if (A < 1.5) { > % fp = f(x,1+y,&fpuf); > % fm = f(x,1-y,&fmuf); > % if (fpuf == 1 && fmuf == 1) { > % if (huge+x>one) /* set inexact flag. */ > % rx = log1p(x*sqrt((fp+fm)*(A+1))); > % } else if (fmuf == 1) { > % /* Overflow not possible because fp < 1e50 and x > 1e-100. 
> % Underflow not possible because either fm=0 or fm approximately > % bigger than 1e-200. */ > % if (huge+x>one) /* set inexact flag. */ > % rx = log1p(fp+sqrt(x)*sqrt((fp/x+fm*x)*(A+1))); > > Comments should be indented the same as the code that they describe. I > found ones like the above especially hard to read (much harder than the > compressed formulas). > > indent -npro turns (some of) the above into the following: > > @ ... > @ static const double > @ one = 1.00000000000000000000e+00, /* 0x3FF00000, > @ * 0x00000000 */ > @ huge = 1.00000000000000000000e+300; > > Mangled. > > @ @ /* > @ * The algorithm is very close to that in "Implementing the complex > arcsine > @ * and arccosine functions using exception handling" by T. E. Hull, > Thomas F. > @ * Fairgrieve, and Ping Tak Peter Tang, published in ACM Transactions on > @ * Mathematical Software, Volume 23 Issue 3, 1997, Pages 299-335. > @ * http://dl.acm.org/citation.cfm?id=275324 > @ * @ * casinh(x+iy) = sign(x)*log(A+sqrt(A*A+1)) + sign(y)*I*acos(B) > where A = > @ * 0.5(|z+I| + |z-I|) = f(x,1+y) + f(x,1-y) + 1 B = 0.5(|z+I| - |z-I|) > z = > @ * x+I*y f(x,y) = 0.5*(hypot(x,y)-y) We also use asin(B) = > @ * atan2(sqrt(A*A-y*y),y) A-y = f(x,y+1) + f(x,y-1). > @ * @ * Much of the difficulty comes because computing f(x,y) may > produce underflows. > @ */ > > Grossly mangled. > > @ ... > @ static double @ f(double x, double y, int *underflow) > @ { > @ if (x == 0) { > @ *underflow = 0; > @ if (y > 0) > @ return 0; > @ return -y; > @ } > > Fixed, except for the return statements (and a weirder KNFism which I > won't describe now). > > @ ... > @ double complex > @ casinh(double complex z) > @ { > @ double x , y, sx, sy, rx, ry; > @ double R , S, A, B, fp, fm; > @ int fpuf , fmuf; > > Mangled. indent's default profile is bad. > > @ ... > @ if (cabs(z) > 1e20) { > @ if (huge + x > one) { /* set inexact flag. 
*/ > @ if (sx == 0) > @ return clog(2 * z); > @ if (sx == 1) > @ return -clog(-2 * z); > @ } > @ } > > indent actually understands KNF for most code. It added spaces and > newlines here. > > @ ... > @ } else if (fmuf == 1) { > @ /* > @ * Overflow not possible because fp < 1e50 and x > > @ * 1e-100. Underflow not possible because either fm=0 > @ * or fm approximately bigger than 1e-200. > @ */ > @ if (huge + x > one) /* set inexact flag. */ > @ rx = log1p(fp + sqrt(x) * sqrt((fp / x + fm * x) * (A > + 1))); > @ } else if (fpuf == 1) { > @ /* Similar arguments against over/underflow. */ > > Comments reformatted and indented OK. > > Some lines were mangled (made too long) by adding spaces. indent doesn't > understand its own -lp option. > > s_clog.c has simpler expressions that mostly didn't expand too much. > > Bruce > > From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:11:55 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 53F65106566B for ; Sun, 12 Aug 2012 23:11:55 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id E34638FC0C for ; Sun, 12 Aug 2012 23:11:54 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CNBs65075902 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:11:54 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CNBmNE021772 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:11:48 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from
peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CNBmVC021771 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:11:48 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:11:48 +1000 Resent-Message-ID: <20120812231148.GO20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6MHLOix091134 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 23 Jul 2012 03:21:25 +1000 (EST) (envelope-from sgk@troutmask.apl.washington.edu) Received: from troutmask.apl.washington.edu (troutmask.apl.washington.edu [128.95.76.21]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q6MHLM3r007277 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 23 Jul 2012 03:21:23 +1000 (EST) (envelope-from sgk@troutmask.apl.washington.edu) Received: from troutmask.apl.washington.edu (localhost.apl.washington.edu [127.0.0.1]) by troutmask.apl.washington.edu (8.14.5/8.14.5) with ESMTP id q6MHLKtV083285; Sun, 22 Jul 2012 10:21:20 -0700 (PDT) (envelope-from sgk@troutmask.apl.washington.edu) Received: (from sgk@localhost) by troutmask.apl.washington.edu (8.14.5/8.14.5/Submit) id q6MHLJ4k083284; Sun, 22 Jul 2012 10:21:19 -0700 (PDT) (envelope-from sgk) From: Steve Kargl Mail-Followup-To: freebsd-numerics@freebsd.org To: Stephen Montgomery-Smith Message-ID: <20120722172119.GA83243@troutmask.apl.washington.edu> References: <20120720184114.B2790@besplex.bde.org> <50097128.6030405@missouri.edu> <20120721032448.X5744@besplex.bde.org> <5009BD6C.9050301@missouri.edu> <20120721123522.T877@besplex.bde.org> <500A2565.9090009@missouri.edu> <20120721181204.A1702@besplex.bde.org> <500B594D.1020305@missouri.edu> <20120722125300.P2246@besplex.bde.org> <500C1B1A.5070107@missouri.edu> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: 
inline In-Reply-To: <500C1B1A.5070107@missouri.edu> User-Agent: Mutt/1.4.2.3i Cc: Diane Bruce , John Baldwin , David Chisnall , Bruce Evans , Bruce Evans , David Schultz , Peter Jeremy , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:11:55 -0000 X-Original-Date: Sun, 22 Jul 2012 10:21:19 -0700 X-List-Received-Date: Sun, 12 Aug 2012 23:11:55 -0000 On Sun, Jul 22, 2012 at 10:24:10AM -0500, Stephen Montgomery-Smith wrote: > > mpfr is OK, but terribly kludgy. I almost choked on my coffee reading this statement. > Having it compute clog is easy. But > having it compute casinh is going to be very painful. If I cannot use > pari, I will probably use C++ code based around a floating point package > called cln. It is used in GiNaC, which is a rather cool way of > embedding symbolic expressions inside C++ code. They are both available > in the ports. Have you looked at MPC? http://www.multiprecision.org/index.php?prog=mpc From the MPC manual: 4.6 Branch Cuts And Special Values Some complex functions have branch cuts, across which the function is discontinuous. In GNU MPC, the branch cuts chosen are the same as those specified for the corresponding functions in the ISO C99 standard. Likewise, when evaluated at a point whose real or imaginary part is either infinite or a NaN or a signed zero, a function returns the same value as those specified for the corresponding function in the ISO C99 standard.
5.9 Trigonometric Functions - Function: int mpc_asinh (mpc_t rop, mpc_t op, mpc_rnd_t rnd) - Function: int mpc_acosh (mpc_t rop, mpc_t op, mpc_rnd_t rnd) - Function: int mpc_atanh (mpc_t rop, mpc_t op, mpc_rnd_t rnd) Set rop to the inverse hyperbolic sine, inverse hyperbolic cosine, inverse hyperbolic tangent of op, rounded according to rnd with the precision of rop. The branch cut of mpc_acosh is (-\infty, 1). -- Steve From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:12:01 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id DCCC81065676 for ; Sun, 12 Aug 2012 23:12:01 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id 49BCF8FC08 for ; Sun, 12 Aug 2012 23:12:01 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CNC1q8075905 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:12:01 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CNBsZC021784 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:11:54 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CNBs8N021783 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:11:54 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:11:54 +1000 Resent-Message-ID: <20120812231154.GP20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au 
[122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6MJaqO4006711 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 23 Jul 2012 05:36:52 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q6MJanqZ009529 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 23 Jul 2012 05:36:51 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q6MJaSsf070184; Sun, 22 Jul 2012 14:36:29 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <500C563D.9000605@missouri.edu> From: Stephen Montgomery-Smith Mail-Followup-To: freebsd-numerics@freebsd.org User-Agent: Mozilla/5.0 (X11; Linux i686; rv:14.0) Gecko/20120714 Thunderbird/14.0 MIME-Version: 1.0 To: Steve Kargl References: <20120720184114.B2790@besplex.bde.org> <50097128.6030405@missouri.edu> <20120721032448.X5744@besplex.bde.org> <5009BD6C.9050301@missouri.edu> <20120721123522.T877@besplex.bde.org> <500A2565.9090009@missouri.edu> <20120721181204.A1702@besplex.bde.org> <500B594D.1020305@missouri.edu> <20120722125300.P2246@besplex.bde.org> <500C1B1A.5070107@missouri.edu> <20120722172119.GA83243@troutmask.apl.washington.edu> In-Reply-To: <20120722172119.GA83243@troutmask.apl.washington.edu> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: Diane Bruce , John Baldwin , David Chisnall , Bruce Evans , Bruce Evans , David Schultz , Peter Jeremy , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:12:02 -0000 X-Original-Date: Sun, 22 Jul 2012 14:36:29 -0500 X-List-Received-Date: Sun, 12 Aug 2012 23:12:02 -0000 On 07/22/2012 12:21 PM, Steve Kargl wrote: > On Sun, Jul 22, 2012 at 10:24:10AM -0500, Stephen Montgomery-Smith wrote: >> >> mpfr is OK, but terribly kludgy. > > I almost choked on my coffee reading this statement. > >> Having it compute clog is easy. But >> having it compute casinh is going to be very painful. If I cannot use >> pari, I will probably use C++ code based around a floating point package >> called cln. It is used in GiNaC, which is a rather cool way of >> embedding symbolic expressions inside C++ code. They are both available >> in the ports. > > Have you looked at MPC? http://www.multiprecision.org/index.php?prog=mpc > From the MPC manual: > > 4.6 Branch Cuts And Special Values > > Some complex functions have branch cuts, across which the function is > discontinuous. In GNU MPC, the branch cuts chosen are the same as those > specified for the corresponding functions in the ISO C99 standard. > > Likewise, when evaluated at a point whose real or imaginary part is > either infinite or a NaN or a signed zero, a function returns the same > value as those specified for the corresponding function in the ISO C99 > standard. > > 5.9 Trigonometric Functions > > - Function: int mpc_asinh (mpc_t rop, mpc_t op, mpc_rnd_t rnd) > - Function: int mpc_acosh (mpc_t rop, mpc_t op, mpc_rnd_t rnd) > - Function: int mpc_atanh (mpc_t rop, mpc_t op, mpc_rnd_t rnd) > > Set rop to the inverse hyperbolic sine, inverse hyperbolic cosine, inverse > hyperbolic tangent of op, rounded according to rnd with the precision of > rop. The branch cut of mpc_acosh is (-\infty, 1). > Oops, my apologies. Reading the manuals and the literature has always been my biggest weakness. I have this unfortunate tendency to keep reinventing the wheel.
(It does slow you down, but you also understand things a lot better!) I will try out these mpc_asinh, etc functions. Incidentally, I did find a few edge cases when my program actually beat Mathematica! (I know I was correct, and not Mathematica, by looking at power series expansions.) It was things like I + 1e-200. So if they disagree, I might check to see if it is mpc or my program that is correct. The real difficulty with casinh and cacosh is not the computation of clog, but the calculation of z+csqrt(z*z+1)-1. It is very hard when z is close to I or -I. The paper by Hull et al really does an excellent job of handling the special cases. From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:12:13 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 8050A106566B for ; Sun, 12 Aug 2012 23:12:13 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id DE4068FC18 for ; Sun, 12 Aug 2012 23:12:12 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CNCCrw075913 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:12:12 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CNC66w021796 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:12:06 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CNC6Ne021795 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:12:06 +1000 (EST) (envelope-from peter) Resent-From: Peter 
Jeremy Resent-Date: Mon, 13 Aug 2012 09:12:06 +1000 Resent-Message-ID: <20120812231206.GR20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6MM7q1O007961 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 23 Jul 2012 08:07:53 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q6MM7mKB009916 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 23 Jul 2012 08:07:50 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q6MM7S9A081132; Sun, 22 Jul 2012 17:07:28 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <500C79A1.6080809@missouri.edu> From: Stephen Montgomery-Smith Mail-Followup-To: freebsd-numerics@freebsd.org User-Agent: Mozilla/5.0 (X11; Linux i686; rv:14.0) Gecko/20120714 Thunderbird/14.0 MIME-Version: 1.0 To: Bruce Evans References: <20120714120432.GA70706@server.rulingia.com> <20120717200931.U6624@besplex.bde.org> <5006D13D.2080702@missouri.edu> <20120718205625.GA409@troutmask.apl.washington.edu> <500725F2.7060603@missouri.edu> <20120719025345.GA1376@troutmask.apl.washington.edu> <50077987.1080307@missouri.edu> <20120719032706.GA1558@troutmask.apl.washington.edu> <5007826D.7060806@missouri.edu> <5007AD41.9070000@missouri.edu> <20120719205347.T2601@besplex.bde.org> <50084322.7020401@missouri.edu> <20120720035001.W4053@besplex.bde.org> <50085441.4090305@missouri.edu> <20120720162953.N2162@besplex.bde.org> <20120720184114.B2790@besplex.bde.org> <50097128.6030405@missouri.edu> <20120721032448.X5744@besplex.bde.org> <5009BD6C.9050301@missouri.edu> 
<20120721123522.T877@besplex.bde.org> <500A2565.9090009@missouri.edu> <20120721181204.A1702@besplex.bde.org> <500B594D.1020305@missouri.edu> <20120722125300.P2246@besplex.bde.org> <500C1B1A.5070107@missouri.edu> <20120723015912.K5029@besplex.bde.org> In-Reply-To: <20120723015912.K5029@besplex.bde.org> Content-Type: multipart/mixed; boundary="------------080906040809060000080001" X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: Diane Bruce , John Baldwin , David Chisnall , Bruce Evans , Steve Kargl , David Schultz , Peter Jeremy , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:12:13 -0000 X-Original-Date: Sun, 22 Jul 2012 17:07:29 -0500 X-List-Received-Date: Sun, 12 Aug 2012 23:12:13 -0000 This is a multi-part message in MIME format. --------------080906040809060000080001 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit So now that people pointed out that casinh is available in mpc, I no longer need the interface to pari. I used mpc with 300 bit precision. I tested out the casinh program. After a small tweak, the worst case ULP is about 2.5 or 3. Edge cases close to I do very well, with a ULP of about 0.5. I am very pleased with how well it performs. I looked at Peter Jeremy's code for catanh, and I notice that he hasn't yet worked to get optimal ULP, since he has concentrated on the handling of the edge cases. If Peter is OK with me butting in, I could try to work on getting optimal ULP for catanh. I think it will be easier to analyze than casinh, since no csqrts need to be involved. I anticipate that the hard case will be when z is close to the imaginary axis, and moderately large (like 1e-5 + 10*I).
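
[Editorial sketch] ULP figures like the ones quoted above can be estimated without an arbitrary-precision library along the following lines. This is not the test program used in this thread (that one compared against mpc at 300 bits). The helper name ulp_error and the choice of long double as the reference type are assumptions made for illustration, and long double is only a meaningful reference on platforms where it is wider than double.

```c
#include <float.h>
#include <math.h>

/*
 * Hypothetical helper: error of `computed', measured in units in the
 * last place (ULPs) of a double near the reference value.
 * Assumes ref is finite, nonzero and in the normal double range.
 */
static double
ulp_error(double computed, long double ref)
{
	int e;
	long double ulp;

	/* frexpl() writes ref as m * 2^e with 0.5 <= |m| < 1. */
	(void)frexpl(ref, &e);
	/* Doubles in [2^(e-1), 2^e) are spaced 2^(e - DBL_MANT_DIG) apart. */
	ulp = ldexpl(1.0L, e - DBL_MANT_DIG);
	return ((double)(fabsl((long double)computed - ref) / ulp));
}
```

A test loop would call ulp_error(result_under_test, reference) over many inputs and keep the maximum. A reference with only a few more bits than double can hide errors of its own, which is why a much wider reference such as mpc is the safer choice.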
--------------080906040809060000080001-- From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:12:28 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 68C5D106566B for ; Sun, 12 Aug 2012 23:12:28 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id 105748FC0A for ; Sun, 12 Aug 2012 23:12:27 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CNCRmC075919 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:12:28 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CNCL75021813 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:12:21 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CNCLMl021812 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:12:21 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:12:21 +1000 Resent-Message-ID: <20120812231221.GS20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org From: Peter Jeremy Mail-Followup-To: freebsd-numerics@freebsd.org To: Stephen Montgomery-Smith Message-ID: <20120722231300.GA8033@server.rulingia.com> References: <20120721032448.X5744@besplex.bde.org> <5009BD6C.9050301@missouri.edu> <20120721123522.T877@besplex.bde.org> <500A2565.9090009@missouri.edu> <20120721181204.A1702@besplex.bde.org> <500B594D.1020305@missouri.edu> <20120722125300.P2246@besplex.bde.org> 
<500C1B1A.5070107@missouri.edu> <20120723015912.K5029@besplex.bde.org> <500C79A1.6080809@missouri.edu> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="uAKRQypu60I7Lcqm" Content-Disposition: inline In-Reply-To: <500C79A1.6080809@missouri.edu> X-PGP-Key: http://www.rulingia.com/keys/peter.pgp User-Agent: Mutt/1.5.21 (2010-09-15) Cc: Diane Bruce , Bruce Evans , John Baldwin , David Chisnall , Bruce Evans , Steve Kargl , David Schultz , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:12:28 -0000 X-Original-Date: Mon, 23 Jul 2012 09:13:00 +1000 X-List-Received-Date: Sun, 12 Aug 2012 23:12:28 -0000 --uAKRQypu60I7Lcqm Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On 2012-Jul-22 17:07:29 -0500, Stephen Montgomery-Smith wrote: >I tested out the casinh program. After a small tweak, the worst case >ULP is about 2.5 or 3. Edge cases close to I do very well, with a ULP >of about 0.5. I am very pleased with how well it performs. That's excellent. I think the exception handling in clog() needs some work - in particular, input NaNs should be returned, rather than returning new default NaNs. >I looked at Peter Jeremy's code for catanh, and I notice that he hasn't >yet worked to get optimal ULP, since he has concentrated on the handling >of the edge cases. As I've previously mentioned, I believe handling the exception cases can be done completely independently of handling the "normal" cases and I was focussing on the former since it is just a (simple) matter of implementing the text in n1256 G.6.2.3.
>If Peter is OK with me butting in, I could try to work on getting >optimal ULP for catanh. I think it will be easier to analyze than >casinh, since no csqrts need to be involved. I anticipate that the hard >case will be when z is close to the imaginary axis, and moderately large >(like 1e-5 + 10*I). The "normal" cases are (algorithmically with '^' as exponentiation): catanh(z) = clog((1+z)/(1-z))/2 = (clog(1+z) - clog(1-z))/2 catanh(x+I*y) = log((y^2 + (1+x)^2)/(y^2 + (1-x)^2))/4 + I*atan2(y^2, 1-x^2-y^2)/2 None of these approaches behave cleanly when |z| is close to 1. If you have some insights, feel free to work on it. -- Peter Jeremy --uAKRQypu60I7Lcqm Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (FreeBSD) iEYEARECAAYFAlAMiPwACgkQ/opHv/APuIe48ACgpA3iF4khaKiaOrVgd6KWaWnn pg0AoL+hqCD6aZuaoPj85qSJZ1hX3bgX =L1Wx -----END PGP SIGNATURE----- --uAKRQypu60I7Lcqm-- From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:12:38 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 615921065675 for ; Sun, 12 Aug 2012 23:12:38 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id EE4AA8FC15 for ; Sun, 12 Aug 2012 23:12:37 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CNCbHf075922 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:12:37 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CNCVi8021821 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256
verify=NO) for ; Mon, 13 Aug 2012 09:12:31 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CNCVQe021820 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:12:31 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:12:31 +1000 Resent-Message-ID: <20120812231231.GT20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6N1Np6m009692 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 23 Jul 2012 11:23:52 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q6N1Nnx8010439 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 23 Jul 2012 11:23:51 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q6N1NPo6094254; Sun, 22 Jul 2012 20:23:27 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <500CA78E.2070302@missouri.edu> From: Stephen Montgomery-Smith Mail-Followup-To: freebsd-numerics@freebsd.org User-Agent: Mozilla/5.0 (X11; Linux i686; rv:14.0) Gecko/20120714 Thunderbird/14.0 MIME-Version: 1.0 To: Peter Jeremy References: <20120721032448.X5744@besplex.bde.org> <5009BD6C.9050301@missouri.edu> <20120721123522.T877@besplex.bde.org> <500A2565.9090009@missouri.edu> <20120721181204.A1702@besplex.bde.org> <500B594D.1020305@missouri.edu> <20120722125300.P2246@besplex.bde.org> <500C1B1A.5070107@missouri.edu> <20120723015912.K5029@besplex.bde.org> <500C79A1.6080809@missouri.edu> <20120722231300.GA8033@server.rulingia.com> In-Reply-To: 
<20120722231300.GA8033@server.rulingia.com> Content-Type: multipart/mixed; boundary="------------030507040607030000090202" X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: Diane Bruce , Bruce Evans , John Baldwin , David Chisnall , Bruce Evans , Steve Kargl , David Schultz , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:12:38 -0000 X-Original-Date: Sun, 22 Jul 2012 20:23:26 -0500 X-List-Received-Date: Sun, 12 Aug 2012 23:12:38 -0000 This is a multi-part message in MIME format. --------------030507040607030000090202 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit This is what I came up with for catanh. I seem to get a ULP less than 3. On 07/22/2012 06:13 PM, Peter Jeremy wrote: > The "normal" cases are (algorithmetically with '^' as exponentiation): > > catanh(z) = clog((1+z)/(1-z))/2 > = (clog(1+z) - clog(1-z))/2 > catanh(x+I*y) = log((y^2 + (1+x)^2)/(y^2 + (1-x)^2))/4 + I*atan2(y^2, 1-x^2-y^2)/2 > > None of these approaches behave cleanly when |z| is close to 1. If > you have some insights, feel free to work on it. 
> --------------030507040607030000090202-- From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:12:46 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 673C51065673 for ; Sun, 12 Aug 2012 23:12:46 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id C603A8FC12 for ; Sun, 12 Aug 2012 23:12:45 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CNCjHW075926 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:12:45 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CNCdaL021828 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:12:39 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CNCdfH021827 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:12:39 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:12:39 +1000 Resent-Message-ID: <20120812231239.GU20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6MFgWNg097070 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 23 Jul 2012 01:42:32 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by vps.rulingia.com (8.14.5/8.14.5) with 
ESMTP id q6MFgTwE006732 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 23 Jul 2012 01:42:31 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q6MFg8dJ055307; Sun, 22 Jul 2012 10:42:08 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <500C1F51.7050209@missouri.edu> From: Stephen Montgomery-Smith Mail-Followup-To: freebsd-numerics@freebsd.org User-Agent: Mozilla/5.0 (X11; Linux i686; rv:14.0) Gecko/20120714 Thunderbird/14.0 MIME-Version: 1.0 To: Bruce Evans References: <20120714120432.GA70706@server.rulingia.com> <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717200931.U6624@besplex.bde.org> <5006D13D.2080702@missouri.edu> <20120718205625.GA409@troutmask.apl.washington.edu> <500725F2.7060603@missouri.edu> <20120719025345.GA1376@troutmask.apl.washington.edu> <50077987.1080307@missouri.edu> <20120719032706.GA1558@troutmask.apl.washington.edu> <5007826D.7060806@missouri.edu> <5007AD41.9070000@missouri.edu> <20120719205347.T2601@besplex.bde.org> <50084322.7020401@missouri.edu> <20120720035001.W4053@besplex.bde.org> <50085441.4090305@missouri.edu> <20120720162953.N2162@besplex.bde.org> <20120720184114.B2790@besplex.bde.org> <50097128.6030405@missouri.edu> <20120721032448.X5744@besplex.bde.org> <5009BD6C.9050301@missouri.edu> <20120721123522.T877@besplex.bde.org> <500A2565.9090009@missouri.edu> <20120721181204.A1702@besplex.bde.org> <500B594D.1020305@missouri.edu> <20120722125300.P2246@besplex.bde.org> In-Reply-To: <20120722125300.P2246@besplex.bde.org> Content-Type: multipart/mixed; boundary="------------040203020507050000080401" X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: Diane Bruce , John Baldwin , David Chisnall , Bruce Evans , Steve Kargl , David Schultz , Peter Jeremy , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 
X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:12:46 -0000 X-Original-Date: Sun, 22 Jul 2012 10:42:09 -0500 X-List-Received-Date: Sun, 12 Aug 2012 23:12:46 -0000 This is a multi-part message in MIME format. --------------040203020507050000080401 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit On 07/22/2012 01:19 AM, Bruce Evans wrote: >> 2. I have thought more about the problem of computing clog(z) when >> |z| is close to 1. I now think it might even require precision that >> is 3 times better than double precision. >> It is possible that you could, by chance, pick x and y so that |z| = 1 >> + 1e30. (I wrote a test program, and this did > e-30 here and elsewhere >> actually happen a few times.) So when you compute x^2+y^2-1, if you >> do it using double double precision, you will get 2e30, but only the >> most significant digits will be accurate. The ULP will be about >> 1e15. You need triple double precision to get ULP's close to 1. > > Hmm, I'm not seeing such cases except when |x| or |y| is 1 and the other > is tiny. That was my mistake. I didn't have enough precision in mpfr to use it as a proper reference! Now I tested my code, and I seem to be getting ULP's consistently less than 2! I am attaching my clog code and my test code. I go in and manually change the test code to test around a different range of values. 
--------------040203020507050000080401-- From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:12:51 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 3C1E0106566C for ; Sun, 12 Aug 2012 23:12:51 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id C6C558FC17 for ; Sun, 12 Aug 2012 23:12:50 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CNCoLf075930 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:12:50 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CNCiCm021842 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:12:44 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CNCi07021841 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:12:44 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:12:44 +1000 Resent-Message-ID: <20120812231244.GV20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6L23Ji5062784 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Sat, 21 Jul 2012 12:03:19 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id 
q6L23Ge3095233 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Sat, 21 Jul 2012 12:03:18 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q6L22ssN093460; Fri, 20 Jul 2012 21:02:55 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <500A0DCF.4030707@missouri.edu> From: Stephen Montgomery-Smith Mail-Followup-To: freebsd-numerics@freebsd.org User-Agent: Mozilla/5.0 (X11; Linux i686; rv:14.0) Gecko/20120714 Thunderbird/14.0 MIME-Version: 1.0 To: Bruce Evans References: <20120714120432.GA70706@server.rulingia.com> <20120717084457.U3890@besplex.bde.org> <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717200931.U6624@besplex.bde.org> <5006D13D.2080702@missouri.edu> <20120718205625.GA409@troutmask.apl.washington.edu> <500725F2.7060603@missouri.edu> <20120719025345.GA1376@troutmask.apl.washington.edu> <50077987.1080307@missouri.edu> <20120719032706.GA1558@troutmask.apl.washington.edu> <5007826D.7060806@missouri.edu> <5007AD41.9070000@missouri.edu> <20120719205347.T2601@besplex.bde.org> <50084322.7020401@missouri.edu> <20120720035001.W4053@besplex.bde.org> <50085441.4090305@missouri.edu> <20120720162953.N2162@besplex.bde.org> <20120720184114.B2790@besplex.bde.org> <50097128.6030405@missouri.edu> <20120721032448.X5744@besplex.bde.org> In-Reply-To: <20120721032448.X5744@besplex.bde.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: Diane Bruce , John Baldwin , David Chisnall , Bruce Evans , Steve Kargl , David Schultz , Peter Jeremy , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:12:51 -0000 X-Original-Date: Fri, 20 Jul 2012 21:02:55 -0500 X-List-Received-Date: Sun, 12 Aug 2012 23:12:51 -0000 Hey guys, I just found this paper: Implementing complex elementary functions using exception handling, T. E. Hull, Thomas F. Fairgrieve, Ping-Tak Peter Tang ACM Transactions on Mathematical Software, Volume 20 Issue 2, June 1994 Pages 215-244 http://dl.acm.org/citation.cfm?doid=178365.178404 This includes an algorithm for clog. Now that I have discovered just how hard it really is, I am going to read this paper. There is also a Corrigenda to this paper - the digital copy seems to be missing. But I will try to retrieve the paper copy from my library. http://toms.acm.org/cgi/TOMSbibget.cgi?Anonymous:1994:C If anyone else wants a copy of these papers, I would be glad to share it. But I am using the University of Missouri subscription to ACM, and I don't want to abuse it by sharing the paper willy-nilly. 
From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:12:58 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 29AD4106564A for ; Sun, 12 Aug 2012 23:12:58 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id 8A0628FC12 for ; Sun, 12 Aug 2012 23:12:57 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CNCvtc075938 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:12:57 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CNCp7p021856 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:12:51 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CNCpIV021855 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:12:51 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:12:51 +1000 Resent-Message-ID: <20120812231251.GW20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6L2S8fk062956 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Sat, 21 Jul 2012 12:28:08 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q6L2S456095277 (version=TLSv1/SSLv3 
cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Sat, 21 Jul 2012 12:28:06 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q6L2RfBp095071; Fri, 20 Jul 2012 21:27:42 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <500A139D.2090803@missouri.edu> From: Stephen Montgomery-Smith Mail-Followup-To: freebsd-numerics@freebsd.org User-Agent: Mozilla/5.0 (X11; Linux i686; rv:14.0) Gecko/20120714 Thunderbird/14.0 MIME-Version: 1.0 To: Bruce Evans References: <20120714120432.GA70706@server.rulingia.com> <20120717084457.U3890@besplex.bde.org> <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717200931.U6624@besplex.bde.org> <5006D13D.2080702@missouri.edu> <20120718205625.GA409@troutmask.apl.washington.edu> <500725F2.7060603@missouri.edu> <20120719025345.GA1376@troutmask.apl.washington.edu> <50077987.1080307@missouri.edu> <20120719032706.GA1558@troutmask.apl.washington.edu> <5007826D.7060806@missouri.edu> <5007AD41.9070000@missouri.edu> <20120719205347.T2601@besplex.bde.org> <50084322.7020401@missouri.edu> <20120720035001.W4053@besplex.bde.org> <50085441.4090305@missouri.edu> <20120720162953.N2162@besplex.bde.org> <20120720184114.B2790@besplex.bde.org> <50097128.6030405@missouri.edu> <20120721032448.X5744@besplex.bde.org> <500A0DCF.4030707@missouri.edu> In-Reply-To: <500A0DCF.4030707@missouri.edu> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: Diane Bruce , John Baldwin , David Chisnall , Bruce Evans , Steve Kargl , David Schultz , Peter Jeremy , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:12:58 -0000 X-Original-Date: Fri, 20 Jul 2012 21:27:41 -0500 X-List-Received-Date: Sun, 12 Aug 2012 23:12:58 -0000 On 07/20/2012 09:02 PM, Stephen Montgomery-Smith wrote: > Hey guys, I just found this paper: > > Implementing complex elementary functions using exception handling, > T. E. Hull, Thomas F. Fairgrieve, Ping-Tak Peter Tang > ACM Transactions on Mathematical Software, Volume 20 Issue 2, June 1994 > Pages 215-244 > > http://dl.acm.org/citation.cfm?doid=178365.178404 > > This includes an algorithm for clog. Now that I have discovered just > how hard it really is, I am going to read this paper. > > There is also a Corrigenda to this paper - the digital copy seems to be > missing. But I will try to retrieve the paper copy from my library. > > http://toms.acm.org/cgi/TOMSbibget.cgi?Anonymous:1994:C > > If anyone else wants a copy of these papers, I would be glad to share > it. But I am using the University of Missouri subscription to ACM, and > I don't want to abuse it by sharing the paper willy-nilly. Ugh. The way they handle the real part of clog(z), log(hypot(x,y)), when hypot(x,y) is close to 1, is to do the calculation in double precision. So to do it properly, we need a double double precision arithmetic, which we don't have. They say it can be simulated, and I think I have seen quad-precision packages (I think it might even be included in FreeBSD ports). But I am not so sure if we want to introduce quad precision into base FreeBSD. The paper says that simulating double precision arithmetic using single arithmetic is slow, so I would think the same is true for quad-precision. Another thing - all their algorithms use exception handling. That is, you go through the steps of the calculation, and if the FPU gives an overflow or underflow error, the program catches it, and then it does something else special. 
So I guess the programs would have to save the state of the SIGFPE signal, set its own handler, and then restore the SIGFPE signal at the end. And then the program has to keep track of SIGFPE signals it really wanted to send outside of itself, and then trigger those. Does the inexact flag also raise the SIGFPE signal? I can see how to avoid using exception handling, and they mention this in their papers, but they say it isn't clean. From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:13:03 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 1F000106564A for ; Sun, 12 Aug 2012 23:13:03 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id A5D428FC0C for ; Sun, 12 Aug 2012 23:13:02 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CND2a6075941 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:13:02 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CNCujx021867 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:12:56 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CNCuCt021865 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:12:56 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:12:56 +1000 Resent-Message-ID: <20120812231256.GX20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org From: Peter Jeremy Mail-Followup-To: 
freebsd-numerics@freebsd.org To: Stephen Montgomery-Smith Message-ID: <20120721043052.GB73662@server.rulingia.com> References: <20120719205347.T2601@besplex.bde.org> <50084322.7020401@missouri.edu> <20120720035001.W4053@besplex.bde.org> <50085441.4090305@missouri.edu> <20120720162953.N2162@besplex.bde.org> <20120720184114.B2790@besplex.bde.org> <50097128.6030405@missouri.edu> <20120721032448.X5744@besplex.bde.org> <500A0DCF.4030707@missouri.edu> <500A139D.2090803@missouri.edu> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="MW5yreqqjyrRcusr" Content-Disposition: inline In-Reply-To: <500A139D.2090803@missouri.edu> X-PGP-Key: http://www.rulingia.com/keys/peter.pgp User-Agent: Mutt/1.5.21 (2010-09-15) Cc: Diane Bruce , Bruce Evans , John Baldwin , David Chisnall , Bruce Evans , Steve Kargl , David Schultz , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:13:03 -0000 X-Original-Date: Sat, 21 Jul 2012 14:30:52 +1000 X-List-Received-Date: Sun, 12 Aug 2012 23:13:03 -0000 --MW5yreqqjyrRcusr Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On 2012-Jul-20 21:27:41 -0500, Stephen Montgomery-Smith wrote: >The way they handle the real part of clog(z), log(hypot(x,y)), when >hypot(x,y) is close to 1, is to do the calculation in double precision. > So to do it properly, we need a double double precision arithmetic, >which we don't have. Actually, we do. r230363 includes both extended and quad long-double emulation code in lib/libc/softfloat. There's also ld128 code under lib/libc/sparc64/fpu.
>The paper says that simulating double precision arithmetic using single >arithmetic is slow, so I would think the same is true for quad-precision. It depends on how much of the arithmetic needs to be in more-than-double precision. Possibly, careful choice of partitioning would allow double precision to be used for most values with multi-precision only needed some of the time. I expect catanh() will run into similar problems evaluating clog((1 + z)/(1 - z)). Something like clog1p() would help when z is close to 1, as will using alternative expansions. >Does the inexact flag also raise the SIGFPE signal? It's under program control - see feenableexcept() and fedisableexcept(). -- Peter Jeremy --MW5yreqqjyrRcusr Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (FreeBSD) iEYEARECAAYFAlAKMHwACgkQ/opHv/APuIdDewCfdQfziJIblEBpobdLeISWPnRV OUcAoMDn0liv1DgjH4J8NmH2xnXSPcAe =NIRR -----END PGP SIGNATURE----- --MW5yreqqjyrRcusr-- From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:13:10 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 18F6F106566B for ; Sun, 12 Aug 2012 23:13:10 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id A8EF88FC15 for ; Sun, 12 Aug 2012 23:13:09 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CND9it075948 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:13:09 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CND2RV021880 (version=TLSv1/SSLv3
cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:13:03 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CND2gf021879 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:13:02 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:13:02 +1000 Resent-Message-ID: <20120812231302.GY20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6L5Vq1d064426 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Sat, 21 Jul 2012 15:31:52 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q6L5Vnxp095671 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Sat, 21 Jul 2012 15:31:52 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q6L5VTZr021090; Sat, 21 Jul 2012 00:31:30 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <500A3EB1.9040806@missouri.edu> From: Stephen Montgomery-Smith Mail-Followup-To: freebsd-numerics@freebsd.org User-Agent: Mozilla/5.0 (X11; Linux i686; rv:14.0) Gecko/20120714 Thunderbird/14.0 MIME-Version: 1.0 To: Peter Jeremy References: <20120719205347.T2601@besplex.bde.org> <50084322.7020401@missouri.edu> <20120720035001.W4053@besplex.bde.org> <50085441.4090305@missouri.edu> <20120720162953.N2162@besplex.bde.org> <20120720184114.B2790@besplex.bde.org> <50097128.6030405@missouri.edu> <20120721032448.X5744@besplex.bde.org> <500A0DCF.4030707@missouri.edu> <500A139D.2090803@missouri.edu> <20120721043052.GB73662@server.rulingia.com> 
In-Reply-To: <20120721043052.GB73662@server.rulingia.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: Diane Bruce , Bruce Evans , John Baldwin , David Chisnall , Bruce Evans , Steve Kargl , David Schultz , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:13:10 -0000 X-Original-Date: Sat, 21 Jul 2012 00:31:29 -0500 X-List-Received-Date: Sun, 12 Aug 2012 23:13:10 -0000 On 07/20/2012 11:30 PM, Peter Jeremy wrote: > On 2012-Jul-20 21:27:41 -0500, Stephen Montgomery-Smith wrote: >> The way they handle the real part of clog(z), log(hypot(x,y)), when >> hypot(x,y) is close to 1, is to do the calculation in double precision. >> So to do it properly, we need a double double precision arithmetic, >> which we don't have. > > Actually, we do. r230363 includes both extended and quad long-double > emulation code in lib/libc/softfloat. There's also ld128 code under > lib/libc/sparc64/fpu. Does the double-double precision arithmetic include sqrt? If that is the case, I might be able to make short work of casinh. Looking at the Hull paper, one hard part is computing hypot(x-1,y) + hypot(x+1,y) - 2 that is |z-1| + |z+1| - 2 to great accuracy. And after thinking about it for a while, you realize that when z is close to the interval [-1,1], getting this calculation done with any accuracy is really hard. I think if I could do this calculation in double-double precision, then I would get all the accuracy I need. 
From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:13:23 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 41CA81065689 for ; Sun, 12 Aug 2012 23:13:23 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id CA7E68FC19 for ; Sun, 12 Aug 2012 23:13:22 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CNDMI2075956 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:13:22 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CNDGeX021899 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:13:16 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CNDGeI021898 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:13:16 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:13:16 +1000 Resent-Message-ID: <20120812231316.GB20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6JHBSWg015915 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Fri, 20 Jul 2012 03:11:28 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q6JHBPnP081658 (version=TLSv1/SSLv3 
cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Fri, 20 Jul 2012 03:11:27 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q6JHB6IC047868; Thu, 19 Jul 2012 12:11:06 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <50083FAA.109@missouri.edu> From: Stephen Montgomery-Smith Mail-Followup-To: freebsd-numerics@freebsd.org User-Agent: Mozilla/5.0 (X11; Linux i686; rv:13.0) Gecko/20120615 Thunderbird/13.0.1 MIME-Version: 1.0 To: Bruce Evans References: <20120714120432.GA70706@server.rulingia.com> <20120717084457.U3890@besplex.bde.org> <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717200931.U6624@besplex.bde.org> <5006D13D.2080702@missouri.edu> <20120718205625.GA409@troutmask.apl.washington.edu> <500725F2.7060603@missouri.edu> <20120719025345.GA1376@troutmask.apl.washington.edu> <50077987.1080307@missouri.edu> <20120719032706.GA1558@troutmask.apl.washington.edu> <5007826D.7060806@missouri.edu> <20120719164458.G1927@besplex.bde.org> In-Reply-To: <20120719164458.G1927@besplex.bde.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: Diane Bruce , John Baldwin , David Chisnall , Bruce Evans , Steve Kargl , David Schultz , Peter Jeremy , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:13:23 -0000 X-Original-Date: Thu, 19 Jul 2012 12:11:06 -0500 X-List-Received-Date: Sun, 12 Aug 2012 23:13:23 -0000 On 07/19/2012 02:41 AM, Bruce Evans wrote: > It's OK for development. Not so OK for testing. 
My tests cover NaNs too, > and try not to have special knowledge of exceptional cases, so if the > NaN case takes many times longer than the usual case it will slow down > the tests significantly. OK, I'll go back to the click code I found in the csqrt function. However, my personal experience is that when a numerical computation has NaNs in it, the program slows down to a snail's pace. This is with FreeBSD. This is actually quite annoying. A program that takes 30 seconds suddenly takes 10 minutes, and when you look at the output it is all NaNs. All that time for nothing! I will look in fdlibm as you suggest. From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:13:27 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 7E634106567B for ; Sun, 12 Aug 2012 23:13:27 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id DAF958FC16 for ; Sun, 12 Aug 2012 23:13:26 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CNDQE5075960 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:13:26 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CNDKUG021909 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:13:20 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CNDKjH021908 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:13:20 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug
2012 09:13:20 +1000 Resent-Message-ID: <20120812231320.GC20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6J1SQFH007828 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Thu, 19 Jul 2012 11:28:27 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q6J1SNLB078820 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Thu, 19 Jul 2012 11:28:25 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q6J1S1OD071141; Wed, 18 Jul 2012 20:28:01 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <500762A1.5020601@missouri.edu> From: Stephen Montgomery-Smith Mail-Followup-To: freebsd-numerics@freebsd.org User-Agent: Mozilla/5.0 (X11; Linux i686; rv:13.0) Gecko/20120615 Thunderbird/13.0.1 MIME-Version: 1.0 To: Steve Kargl References: <201207130818.38535.jhb@freebsd.org> <9EB2DA4F-19D7-4BA5-8811-D9451CB1D907@theravensnest.org> <20120713155805.GC81965@zim.MIT.EDU> <20120714120432.GA70706@server.rulingia.com> <20120717084457.U3890@besplex.bde.org> <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717200931.U6624@besplex.bde.org> <5006D13D.2080702@missouri.edu> <20120718205625.GA409@troutmask.apl.washington.edu> In-Reply-To: <20120718205625.GA409@troutmask.apl.washington.edu> Content-Type: multipart/mixed; boundary="------------050505070502080403090500" X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: Diane Bruce , John Baldwin , David Chisnall , Bruce Evans , Bruce Evans , David Schultz , Peter Jeremy , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 
X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:13:27 -0000 X-Original-Date: Wed, 18 Jul 2012 20:28:01 -0500 X-List-Received-Date: Sun, 12 Aug 2012 23:13:27 -0000 This is a multi-part message in MIME format. --------------050505070502080403090500 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Here is my attempt at the clog function. I also include the test code I used to check the edge cases. Feel free to critique and suggest changes, both the logic and the style. I copied some of the constructions from the csqrt function. I'll get started on measuring the ULP. --------------050505070502080403090500-- From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:13:54 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id F39C7106564A for ; Sun, 12 Aug 2012 23:13:53 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id 77D548FC1B for ; Sun, 12 Aug 2012 23:13:53 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CNDrPP075983 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:13:53 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CNDlmw021958 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:13:47 +1000 (EST) (envelope-from 
peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CNDlNN021957 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:13:47 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:13:47 +1000 Resent-Message-ID: <20120812231347.GF20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6JH6je5015859 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Fri, 20 Jul 2012 03:06:46 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q6JH6g7K081475 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Fri, 20 Jul 2012 03:06:45 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q6JH6B9x047550; Thu, 19 Jul 2012 12:06:12 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <50083E83.9090404@missouri.edu> From: Stephen Montgomery-Smith Mail-Followup-To: freebsd-numerics@freebsd.org User-Agent: Mozilla/5.0 (X11; Linux i686; rv:13.0) Gecko/20120615 Thunderbird/13.0.1 MIME-Version: 1.0 To: Bruce Evans References: <20120529045612.GB4445@server.rulingia.com> <20120711223247.GA9964@troutmask.apl.washington.edu> <20120713114100.GB83006@server.rulingia.com> <201207130818.38535.jhb@freebsd.org> <9EB2DA4F-19D7-4BA5-8811-D9451CB1D907@theravensnest.org> <20120713155805.GC81965@zim.MIT.EDU> <20120714120432.GA70706@server.rulingia.com> <20120717084457.U3890@besplex.bde.org> <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717200931.U6624@besplex.bde.org> <5006D13D.2080702@missouri.edu> 
<20120719144432.N1596@besplex.bde.org> In-Reply-To: <20120719144432.N1596@besplex.bde.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: Diane Bruce , John Baldwin , David Chisnall , Bruce Evans , Steve Kargl , David Schultz , Peter Jeremy , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:13:54 -0000 X-Original-Date: Thu, 19 Jul 2012 12:06:11 -0500 X-List-Received-Date: Sun, 12 Aug 2012 23:13:54 -0000

On 07/19/2012 01:37 AM, Bruce Evans wrote:
> On Wed, 18 Jul 2012, Stephen Montgomery-Smith wrote:
>
>> I went on a long road trip yesterday, so I didn't get any code
>> written, but I did have a lot of thoughts about clog and casinh.
>>
>> First, the naive formula (here z=x+I*y)
>> clog(z) = cpack(log(hypot(x,y)),atan2(x,y))
>> is going to work in a lot more edge cases than one might expect. This
>> is because hypot and atan2, especially atan2, already do a rather good
>> job getting the edge cases right. I am thinking in particular of when
>> x or y are 0 or -0, or one of them is infinity or -infinity.
>
> Right, clog is deceptively simple. This is because it decomposes perfectly
> into 2 real functions of 2 real variables and both of these functions are
> standard and already implemented almost as well as possible. ISTR das
> saying that it had a complicated case, but I don't see even one. atan2()
> is supposed to handle all combinations of +-0. Now I remember a potential
> problem. Complex functions should have only poles and zeros, with
> projective infinity and "projective zero" (= inverse of projective
> infinity). Real functions can and do have affine infinities and zeros
> (+-Inf and +-0), with more detailed special cases. It's just impossible
> to have useful, detailed special cases for all the ways of approaching
> complex (projective) infinity and 0.
> I think Kahan wanted projective infinity in IEEE7xx in ~1980.
> Intel 8087 had both projective infinity and affine infinities, but
> projective infinity didn't make it into the first IEEE7xx, and
> hardly anyone understood it and it was eventually dropped from
> Intel FPUs (I think it was in 80287; then in i486 it was reduced
> to a bit in the control word that can never be cleared (the bit is
> to set affine infinities); then in SSE the bit went away too).
> However, C99 tries too hard to make complex functions reduce to real
> functions when everything is purely real or purely imaginary. So most
> of the special cases for +-0 and +-Inf affect complex functions (for
> other directions of approaching 0 and infinity, not much is specified
> but you should try to be as continuous as possible, where continuity
> has delicate unclear meanings since it is related to discontinuous
> sign functions). Hopefully, the specification of imag(clog()) is
> that it has the same sign behaviour as atan2(), so you can just use
> atan2(). The sign conventions for both are arbitrary, but they
> shouldn't be gratuitously different. You still have to check that
> they aren't non-gratuitously different, because different conventions
> became established.

I checked. Actually the sign conventions are not that arbitrary. But as
a mathematician I would say they are a bit useless, e.g.
atan2(infinity,infinity) = pi/4 = 45 degrees
How do you know that the two infinities are the same? One could be
double the other.

If it had been up to me, there would have been finite numbers, and nan.
And none of this -0.

> I had forgotten that pari doesn't support -0 at all (AFAIK). I certainly
> had to change a sign to match the pari result for 0+y*I, but it was
> the sign of y. Your original code seems to have y where it should have x:

Oops.
> I don't know what happens with zeros of complex inverse trig functions.
> I think they don't have many (like log()), but their real and imaginary
> parts do, and they are too general for accurate behaviour of the real
> and imaginary parts relative to themselves to fall out.

casinh(z) is zero only when z=0, and near that point I could use a
Taylor series (but a lot of terms would be needed because the Taylor
series converges quite slowly).

I can now see that the separate cases of the real part and imaginary
part of casinh being zero are going to be hard.

I'll probably end up reading the paper Jeremy suggested, and
implementing that. But I always prefer self discovery first.

From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:14:06 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 49857106564A for ; Sun, 12 Aug 2012 23:14:06 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id D63378FC0C for ; Sun, 12 Aug 2012 23:14:05 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CNE55F075993 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:14:05 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CNDxPm021976 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:13:59 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CNDxQM021975 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:13:59 +1000 
(EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:13:59 +1000 Resent-Message-ID: <20120812231359.GH20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6K4HUOR024976 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Fri, 20 Jul 2012 14:17:30 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q6K4HSws085357 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Fri, 20 Jul 2012 14:17:30 +1000 (EST) (envelope-from stephen@missouri.edu) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q6K4H690091312; Thu, 19 Jul 2012 23:17:06 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <5008DBC2.4040304@missouri.edu> From: Stephen Montgomery-Smith Mail-Followup-To: freebsd-numerics@freebsd.org User-Agent: Mozilla/5.0 (X11; Linux i686; rv:14.0) Gecko/20120714 Thunderbird/14.0 MIME-Version: 1.0 To: Bruce Evans References: <20120529045612.GB4445@server.rulingia.com> <20120711223247.GA9964@troutmask.apl.washington.edu> <20120713114100.GB83006@server.rulingia.com> <201207130818.38535.jhb@freebsd.org> <9EB2DA4F-19D7-4BA5-8811-D9451CB1D907@theravensnest.org> <20120713155805.GC81965@zim.MIT.EDU> <20120714120432.GA70706@server.rulingia.com> <20120717084457.U3890@besplex.bde.org> <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717200931.U6624@besplex.bde.org> <5006D13D.2080702@missouri.edu> <20120719144432.N1596@besplex.bde.org> <50083E83.9090404@missouri.edu> <20120720120802.F1061@besplex.bde.org> In-Reply-To: <20120720120802.F1061@besplex.bde.org> Content-Type: text/plain; charset=ISO-8859-1; 
format=flowed Content-Transfer-Encoding: 7bit Cc: Diane Bruce , John Baldwin , David Chisnall , Bruce Evans , Steve Kargl , David Schultz , Peter Jeremy , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:14:06 -0000 X-Original-Date: Thu, 19 Jul 2012 23:17:06 -0500 X-List-Received-Date: Sun, 12 Aug 2012 23:14:06 -0000

On 07/19/2012 09:19 PM, Bruce Evans wrote:
> On Thu, 19 Jul 2012, Stephen Montgomery-Smith wrote:
>
>> I can now see that the separate cases of the real part and imaginary
>> parts of casinh being zero is going to be hard.
>
> I won't ask for that and will measure errors relative to the absolute value
> of the result.

But the algorithm in the paper by Hull et al. (the paper recommended by
Jeremy) manages to do this very effectively. So I will abandon my
algorithm, and use Hull et al.'s algorithms.

The real part of casinh(z) is zero only if z=I*y, |y|<=1, and the
imaginary part of casinh(z) is zero only if z is real. This is much
easier to quantify in floating point terms than the condition for
clog(z) to be pure imaginary (|z|=1).

Once you realize this, you see that it makes sense to compute casinh(z)
by considering the real and imaginary parts separately. And this is
exactly what Hull et al. do.
From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:14:23 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D6F9D106566B for ; Sun, 12 Aug 2012 23:14:22 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id 6A8FC8FC1A for ; Sun, 12 Aug 2012 23:14:22 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CNEMOD075996 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:14:22 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CNEFvY021985 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:14:16 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CNEFaf021984 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:14:15 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:14:15 +1000 Resent-Message-ID: <20120812231415.GI20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org From: Peter Jeremy Mail-Followup-To: freebsd-numerics@freebsd.org To: David Schultz , Warner Losh , Steve Kargl , Stephen Montgomery-Smith Message-ID: <20120713120239.GA86153@server.rulingia.com> References: <20120708124047.GA44061@zim.MIT.EDU> <210816F0-7ED7-4481-ABFF-C94A700A3EA0@bsdimp.com> <20120708233624.GA53462@troutmask.apl.washington.edu> <4FFBF16D.2030007@gwdg.de> <2A1DE516-ABB4-49D7-8C3D-2C4DA2D9FCF5@bsdimp.com> <4FFC412B.4090202@gwdg.de> 
<20120710151115.GA56950@zim.MIT.EDU> <4FFC5E5D.8000502@gwdg.de> <20120710225801.GB58778@zim.MIT.EDU> <20120711005506.GA88249@server.rulingia.com> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="sm4nu43k4a2Rpi4c" Content-Disposition: inline In-Reply-To: <20120711005506.GA88249@server.rulingia.com> X-PGP-Key: http://www.rulingia.com/keys/peter.pgp User-Agent: Mutt/1.5.21 (2010-09-15) Cc: Bruce Evans Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:14:23 -0000 X-Original-Date: Fri, 13 Jul 2012 22:02:39 +1000 X-List-Received-Date: Sun, 12 Aug 2012 23:14:23 -0000 --sm4nu43k4a2Rpi4c Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable

Hi all,

On 2012-Jul-13 21:41:00 +1000, Peter Jeremy wrote:
>else's toes. As I said previously, I believe the existing wiki page
>could be improved to form a central co-ordinating point to show
>what activity is (or isn't) occurring.

I'm willing to start by cleaning up http://wiki.freebsd.org/MissingMathStuff
if one of you would like to give PeterJeremy write access to it.

As a start, I was thinking of a table that listed each outstanding
function, together with:
- a URL for code/patch to implement (& test) it
- PR covering it
- contact name for person working on it
- details of commits to each branch
- Any other notes.

(If any of you have, or know of, uncommitted patches, I'd appreciate details).

I would also like to include details of relevant references that will be
of use to anyone working on these functions.
Off the top of my head (and from previous postings in this thread): http://dlmf.nist.gov http://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html http://www.cs.berkeley.edu/~wkahan/ Is it worth writing up a paragraph or two to go into the forthcoming quarterly status report that at least says that we realize there is a gap and we would be happy for any assistance in filling it? --=20 Peter Jeremy --sm4nu43k4a2Rpi4c Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (FreeBSD) iEYEARECAAYFAlAADl8ACgkQ/opHv/APuIdrEwCfR0vWnCjx8E5mWCfXvNnFy8YI wxMAn055dclfaNvQ8hSh1hkp2pHlw3ld =lIHs -----END PGP SIGNATURE----- --sm4nu43k4a2Rpi4c-- From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 22:58:04 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 9C384106566B for ; Sun, 12 Aug 2012 22:58:04 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id EE1AA8FC18 for ; Sun, 12 Aug 2012 22:58:03 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CMw3Hw075516 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 08:58:03 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CMvuk2020909 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 08:57:56 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CMvuQ9020908 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 08:57:56 +1000 (EST) 
(envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 08:57:56 +1000 Resent-Message-ID: <20120812225756.GK20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6I41xco090771 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Wed, 18 Jul 2012 14:01:59 +1000 (EST) (envelope-from brde@optusnet.com.au) Received: from mail05.syd.optusnet.com.au (mail05.syd.optusnet.com.au [211.29.132.186]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q6I41xkX071239 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Wed, 18 Jul 2012 14:01:59 +1000 (EST) (envelope-from brde@optusnet.com.au) Received: from c122-106-171-246.carlnfd1.nsw.optusnet.com.au (c122-106-171-246.carlnfd1.nsw.optusnet.com.au [122.106.171.246]) by mail05.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id q6I41gtH021603 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Wed, 18 Jul 2012 14:01:43 +1000 From: Bruce Evans Mail-Followup-To: freebsd-numerics@freebsd.org X-X-Sender: bde@besplex.bde.org To: Peter Jeremy In-Reply-To: <20120718001337.GA87817@server.rulingia.com> Message-ID: <20120718123627.D1575@besplex.bde.org> References: <20120713155805.GC81965@zim.MIT.EDU> <20120714120432.GA70706@server.rulingia.com> <20120717084457.U3890@besplex.bde.org> <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717040118.GA86840@troutmask.apl.washington.edu> <20120717042125.GF66913@server.rulingia.com> <20120717043848.GB87001@troutmask.apl.washington.edu> <20120717225328.GA86902@server.rulingia.com> <20120717232740.GA95026@troutmask.apl.washington.edu> <20120718001337.GA87817@server.rulingia.com> MIME-Version: 1.0 Content-Type: MULTIPART/MIXED; BOUNDARY="0-1667853581-1342584102=:1575" X-Mailman-Approved-At: Sun, 12 Aug 2012 23:55:59 +0000 Cc: Diane Bruce , Steve Kargl , John 
Baldwin , David Chisnall , Stephen Montgomery-Smith , Bruce Evans , Bruce Evans , David Schultz , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 22:58:04 -0000 X-Original-Date: Wed, 18 Jul 2012 14:01:42 +1000 (EST) X-List-Received-Date: Sun, 12 Aug 2012 22:58:04 -0000 This message is in MIME format. The first part should be readable text, while the remaining parts are likely unreadable without MIME-aware tools. --0-1667853581-1342584102=:1575 Content-Type: TEXT/PLAIN; charset=X-UNKNOWN; format=flowed Content-Transfer-Encoding: QUOTED-PRINTABLE

On Wed, 18 Jul 2012, Peter Jeremy wrote:

> On 2012-Jul-17 16:27:40 -0700, Steve Kargl wrote:
>> I won't have time to go over the code in detail until
>> this weekend, but a quick peek showed some issues. The
>> first is style. Although fdlibm has a rather interesting
>> coding style, new code should use KNF.
>
> I hope that was only the function declaration lines. I think
> the rest is KNF.

From style(9):

% /*
%  * Multi-line comments look like this. Make them real sentences. Fill
%  * them so they look like real paragraphs.
%  */

(beware that man(1) has mangled the indentation from 0 to 5 spaces. I
added my usual quoting for code of "% ".)

From the new file:

% /*
% ** Calculate complex arc tangent using the identity:
% ** catan(z) = -i catanh(iz)
% */

Multi-line comments don't look like this.

Another style point visible in this comment is how to write 'i' and
multiplication. Multiplication by juxtaposition (iz) doesn't work near
C code with long identifiers which might be named iz. It probably
requires all variable names to be 1 letter in a special font for this use.
I tried to use "I z" consistently in comments in s_ccosh*.c and to get
everyone to follow this convention, but there are already some
inconsistencies, and I now wonder if "z I" is better. The pari
presentation uses "*" and puts "I" last, and uses spaces for "+" but
not for "*" (e.g., "1 + 2*I").

% double complex
% catanh(double complex z)
% {
% 	double zr, zi;
% 	int cr, ci;
%
% 	zr = creal(z);
% 	cr = fpclassify(zr);
% 	zi = cimag(z);
% 	ci = fpclassify(zi);

The standard classification macros are good for developing things, but
they are very slow. All (?) committed complex functions use hard-coded
bit tests. These are almost as easy to write as the classification
macros. Copy them from s_ccosh.c. Copy variable names from s_ccosh.c
too.

%
% 	/*
% 	 * catanh(+0 + i0) returns +0 + i0.
% 	 * catanh(+0 + iNaN) returns +0 + iNaN.
% 	 */

This looks like the description in C99. s_ccosh.c uses something like:

% 	/*
% 	 * catanh(+0 + I 0) = +0 + I 0.
% 	 * catanh(+0 + I NaN) = +0 + I d(NaN).
% 	 */

d(NaN) documents that the NaN returned is not necessarily the same as
the NaN in the arg. It should be the original NaN with default
conversions. The default conversion should be to only quieten signaling
NaNs. Most arches have a quiet bit and the conversion sets this. Other
conversions are very MD and related to bugs like giving different
results depending on the precision used to pass args. d(NaN) is quite
different from dNaN. The latter is a fixed default NaN that normally
results from any invalid operation on non-NaNs.

This is arcane and I probably got it wrong in many cases. My hope was
that someday all of these comments could be turned into meta-info that
is used to generate test vectors and assertions and maybe man pages.
They don't belong in the code. But to generate test vectors and
assertions, they need to be very formal and correct.
For man pages, I think I prefer to hard-code the documentation but test
that it agrees with the meta-info.

% 	if (cr == FP_NAN) {
% 		/*
% 		 * catanh(NaN + iInf) returns ±0 + iπ/2
% 		 * the sign of the real part of the result is not
% 		 * specified by the standard so return +0.
% 		 */

The UTF-8 is similar to that in C99, where it is used for the "+-" and
"infinity" symbols. It messes up n869.txt too (C and POSIX working group
translations to text are poor. IIRC, "+-" gets mangled to "+", and
"infinity" gets mangled to "0"). Why Inf for the arg and not for the
result?

Here are my current fixes for committed versions of complex functions.
They are mostly to fix comments and to remove redundant classifications.
This is far from complete in getting all the functions written in a
consistent style. They have some fixes for missing or wrong NaN
conversions. Most of the bugs for NaN conversions were found by diffing
source files and noticing style differences that were actually code bugs.

% Index: s_ccoshf.c
% ===================================================================
% RCS file: /home/ncvs/src/lib/msun/src/s_ccoshf.c,v
% retrieving revision 1.2
% diff -u -2 -r1.2 s_ccoshf.c
% --- s_ccoshf.c	21 Oct 2011 06:29:32 -0000	1.2
% +++ s_ccoshf.c	25 Oct 2011 14:04:11 -0000
% @@ -26,5 +26,5 @@
%
%  /*
% - * Hyperbolic cosine of a complex argument. See s_ccosh.c for details.
% + * Hyperbolic cosine of a float complex argument. See s_ccosh.c for details.
%   */
%
% @@ -63,5 +63,5 @@
%  		if (ix < 0x42b17218) {
%  			/* x < 88.7: expf(|x|) won't overflow */
% -			h = expf(fabsf(x)) * 0.5f;
% +			h = expf(fabsf(x)) * 0.5F;
%  			return (cpackf(h * cosf(y), copysignf(h, x) * sinf(y)));
%  		} else if (ix < 0x4340b1e7) {
% @@ -76,20 +76,28 @@
%  	}
%
% -	if (ix == 0 && iy >= 0x7f800000)
% +	if (ix == 0)			/* && iy >= 0x7f800000 */
%  		return (cpackf(y - y, copysignf(0, x * (y - y))));
%
% -	if (iy == 0 && ix >= 0x7f800000) {
% -		if ((hx & 0x7fffff) == 0)
% +	if (iy == 0) {			/* && ix >= 0x7f800000 */
% +		if (ix == 0x7f800000)
%  			return (cpackf(x * x, copysignf(0, x) * y));
% +		/*
% +		 * This does a lot of work to get the same sign as the NaN
% +		 * sinhf(x = NaN) * (y = +-0).
% +		 *
% +		 * In s_ccosh.c, we document that the sign is unspecified,
% +		 * but don't document our choice, and d(NaN) is possibly
% +		 * not parenthesized correctly.
% +		 */

Float versions are supposed to look as much like double versions as
possible, except routine and/or comments should be left out of the float
version. IIRC, I wrote the large comment here because I originally
wanted to limit this set of changes to mainly this file.

%  		return (cpackf(x * x, copysignf(0, (x + x) * y)));
%  	}
%
% -	if (ix < 0x7f800000 && iy >= 0x7f800000)
% +	if (ix < 0x7f800000)		/* && iy >= 0x7f800000 */
%  		return (cpackf(y - y, x * (y - y)));
%
% -	if (ix >= 0x7f800000 && (hx & 0x7fffff) == 0) {
% +	if (ix == 0x7f800000) {
%  		if (iy >= 0x7f800000)
% -			return (cpackf(x * x, x * (y - y)));
% -		return (cpackf((x * x) * cosf(y), x * sinf(y)));
% +			return (cpackf(INFINITY, x * (y - y)));
% +		return (cpackf(INFINITY * cosf(y), x * sinf(y)));

The handling of infinities was wrong and/or unnecessarily complicated.
In general, we use the expression (x * x) to turn both +Inf and -Inf
into +Inf, and also to generate d(NaN) or dNaN if x is invalid, but if
we know that the result is always +Inf then we can just use INFINITY.
Similarly, we use the expression (x + x) if we want to preserve the sign
of +-Inf but generate NaNs. IIRC, there is a fix for a sign error
somewhere in these patches, with the bug caused by using the wrong
expression.

%  	}
%
% Index: s_csinh.c
% ===================================================================
% RCS file: /home/ncvs/src/lib/msun/src/s_csinh.c,v
% retrieving revision 1.2
% diff -u -2 -r1.2 s_csinh.c
% --- s_csinh.c	21 Oct 2011 06:29:32 -0000	1.2
% +++ s_csinh.c	3 Jan 2012 06:20:18 -0000
% @@ -26,4 +26,11 @@
%
%  /*
% + * XXX TODO:
% + * Change x +-I y to x + I (+-y) or vice versa? We currently use the
% + * former former for args and the latter for results.
% + * s/the invalid floating-point exception/FE_INVALID/g
% + */

Oops, it was this and not putting "I" last that I want to change. "I"
in the middle goes better with omitting "*". It serves as a delimiter
for the real and imaginary parts. But when the "+-" is attached to "I",
it is not attached to the imaginary part.

% +
% +/*
%   * Hyperbolic sine of a complex argument z = x + i y.
%   *
% @@ -64,5 +71,5 @@
%  		if ((iy | ly) == 0)
%  			return (cpack(sinh(x), y));
% -		if (ix < 0x40360000)	/* small x: normal case */
% +		if (ix < 0x40360000)	/* |x| < 22: normal case */
%  			return (cpack(sinh(x) * cos(y), cosh(x) * sin(y)));
%
% @@ -84,14 +91,14 @@
%
%  	/*
% -	 * sinh(+-0 +- I Inf) = sign(d(+-0, dNaN))0 + I dNaN.
% -	 * The sign of 0 in the result is unspecified. Choice = normally
% -	 * the same as dNaN. Raise the invalid floating-point exception.
% -	 *
% -	 * sinh(+-0 +- I NaN) = sign(d(+-0, NaN))0 + I d(NaN).
% -	 * The sign of 0 in the result is unspecified. Choice = normally
% -	 * the same as d(NaN).
% +	 * sinh(+-0 +- I Inf) = +-0 + I dNaN.
% +	 * The sign of 0 in the result is unspecified. Choice = same sign
% +	 * as the argument. Raise the invalid floating-point exception.
% +	 *
% +	 * sinh(+-0 +- I NaN) = +-0 + I d(NaN).
% +	 * The sign of 0 in the result is unspecified. Choice = same sign
% +	 * as the argument.
%  	 */
% -	if ((ix | lx) == 0 && iy >= 0x7ff00000)
% -		return (cpack(copysign(0, x * (y - y)), y - y));
% +	if ((ix | lx) == 0)		/* && iy >= 0x7ff00000 */
% +		return (cpack(x, y - y));
%
%  	/*

Since the sign of the 0 is unspecified, don't use a complicated rule to
get a particular one.

% @@ -100,9 +107,6 @@
%  	 * sinh(NaN +- I 0) = d(NaN) + I +-0.
%  	 */
% -	if ((iy | ly) == 0 && ix >= 0x7ff00000) {
% -		if (((hx & 0xfffff) | lx) == 0)
% -			return (cpack(x, y));
% -		return (cpack(x, copysign(0, y)));
% -	}
% +	if ((iy | ly) == 0)		/* && ix >= 0x7ff00000 */
% +		return (cpack(x + x, y));
%
%  	/*

This was missing an operation on x, so signaling NaNs were not properly
quieted. I probably missed this bug in early testing because I mostly
tested on i386, and on i386 signaling NaNs in < long double precision
are quieted just by loading them.

This was unnecessarily complicated.

% @@ -114,21 +118,21 @@
%  	 * nonzero x. Choice = don't raise (except for signaling NaNs).
%  	 */
% -	if (ix < 0x7ff00000 && iy >= 0x7ff00000)
% -		return (cpack(y - y, x * (y - y)));
% +	if (ix < 0x7ff00000)		/* && iy >= 0x7ff00000 */
% +		return (cpack(y - y, y - y));
%
%  	/*
%  	 * sinh(+-Inf + I NaN) = +-Inf + I d(NaN).
% -	 * The sign of Inf in the result is unspecified. Choice = normally
% -	 * the same as d(NaN).
% +	 * The sign of Inf in the result is unspecified. Choice = same sign
% +	 * as the argument.
% 	 *
% 	 * sinh(+-Inf +- I Inf) = +Inf + I dNaN.
% -	 * The sign of Inf in the result is unspecified. Choice = always +.
% -	 * Raise the invalid floating-point exception.
% +	 * The sign of Inf in the result is unspecified. Choice = same sign
% +	 * as the argument. Raise the invalid floating-point exception.
% 	 *
% 	 * sinh(+-Inf + I y) = +-Inf cos(y) + I Inf sin(y)
% 	 */
% -	if (ix >= 0x7ff00000 && ((hx & 0xfffff) | lx) == 0) {
% +	if (ix == 0x7ff00000 && lx == 0) {
% 		if (iy >= 0x7ff00000)
% -			return (cpack(x * x, x * (y - y)));
% +			return (cpack(x, y - y));
% 		return (cpack(x * cos(y), INFINITY * sin(y)));
% 	}
% @@ -145,5 +149,5 @@
% 	 * nonzero y. Choice = don't raise (except for signaling NaNs).
% 	 */
% -	return (cpack((x * x) * (y - y), (x + x) * (y - y)));
% +	return (cpack((x + x) * (y - y), (x * x) * (y - y)));
% }
%
% @@ -152,5 +156,5 @@
% {
%
% -	/* csin(z) = -I * csinh(I * z) */
% +	/* csin(z) = -I * csinh(I * z). */
% 	z = csinh(cpack(-cimag(z), creal(z)));
% 	return (cpack(cimag(z), -creal(z)));
% Index: s_csinhf.c
% ===================================================================
% RCS file: /home/ncvs/src/lib/msun/src/s_csinhf.c,v
% retrieving revision 1.2
% diff -u -2 -r1.2 s_csinhf.c
% --- s_csinhf.c	21 Oct 2011 06:29:32 -0000	1.2
% +++ s_csinhf.c	3 Jan 2012 06:20:30 -0000
% @@ -26,5 +26,5 @@
%
% /*
% - * Hyperbolic sine of a complex argument z. See s_csinh.c for details.
% + * Float version of csinh(). See s_csinh.c for details.
% */
%
% @@ -57,5 +57,5 @@
% 		if (iy == 0)
% 			return (cpackf(sinhf(x), y));
% -		if (ix < 0x41100000)	/* small x: normal case */
% +		if (ix < 0x41100000)	/* |x| < 9: normal case */
% 			return (cpackf(sinhf(x) * cosf(y), coshf(x) * sinf(y)));
%
% @@ -63,5 +63,5 @@
% 		if (ix < 0x42b17218) {
% 			/* x < 88.7: expf(|x|) won't overflow */
% -			h = expf(fabsf(x)) * 0.5f;
% +			h = expf(fabsf(x)) * 0.5F;
% 			return (cpackf(copysignf(h, x) * cosf(y), h * sinf(y)));
% 		} else if (ix < 0x4340b1e7) {
% @@ -76,23 +76,20 @@
% 	}
%
% -	if (ix == 0 && iy >= 0x7f800000)
% -		return (cpackf(copysignf(0, x * (y - y)), y - y));
% +	if (ix == 0)			/* && iy >= 0x7f800000 */
% +		return (cpackf(x, y - y));
%
% -	if (iy == 0 && ix >= 0x7f800000) {
% -		if ((hx & 0x7fffff) == 0)
% -			return (cpackf(x, y));
% -		return (cpackf(x, copysignf(0, y)));
% -	}
% +	if (iy == 0)			/* && ix >= 0x7f800000 */
% +		return (cpackf(x + x , y));
%
% -	if (ix < 0x7f800000 && iy >= 0x7f800000)
% -		return (cpackf(y - y, x * (y - y)));
% +	if (ix < 0x7f800000)		/* && iy >= 0x7f800000 */
% +		return (cpackf(y - y, y - y));
%
% -	if (ix >= 0x7f800000 && (hx & 0x7fffff) == 0) {
% +	if (ix == 0x7f800000) {
% 		if (iy >= 0x7f800000)
% -			return (cpackf(x * x, x * (y - y)));
% +			return (cpackf(x, y - y));
% 		return (cpackf(x * cosf(y), INFINITY * sinf(y)));
% 	}
%
% -	return (cpackf((x * x) * (y - y), (x + x) * (y - y)));
% +	return (cpackf((x + x) * (y - y), (x * x) * (y - y)));

The last change (swapping "*"s and "+"s) might only be needed to be
logically correct. The sign behaviour for sinh() is quite different
than for cosh(), and too much code was copied from s_ccosh*.c to create
s_csinh*.c.
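
To make the even/odd distinction concrete, here is a minimal sketch of
the two idioms applied to Inf and NaN arguments. It assumes IEEE 754
doubles and a compiler that is not in a fast-math mode. The function
names are made up for illustration and do not appear in msun.

```c
#include <assert.h>
#include <math.h>

/* Even idiom, as for cosh(): +-Inf -> +Inf, NaN -> NaN. */
double
even_inf_nan(double x)
{
	return (x * x);
}

/* Odd idiom, as for sinh(): +-Inf keeps its sign, NaN -> NaN. */
double
odd_inf_nan(double x)
{
	return (x + x);
}

/* Hypothetical self-check showing what each idiom returns. */
void
check_parity_idioms(void)
{
	assert(even_inf_nan(-INFINITY) == INFINITY);
	assert(odd_inf_nan(-INFINITY) == -INFINITY);
	assert(isnan(even_inf_nan(NAN)));
	assert(isnan(odd_inf_nan(NAN)));
}
```

Using the even idiom where the odd one is wanted loses the sign of
-Inf, which is the kind of sign error that a copied expression can
introduce.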
% }
%
% Index: s_csqrt.c
% ===================================================================
% RCS file: /home/ncvs/src/lib/msun/src/s_csqrt.c,v
% retrieving revision 1.4
% diff -u -2 -r1.4 s_csqrt.c
% --- s_csqrt.c	8 Aug 2008 00:15:16 -0000	1.4
% +++ s_csqrt.c	25 Oct 2011 14:49:27 -0000
% @@ -64,5 +64,5 @@
% 	if (isnan(a)) {
% 		t = (b - b) / (b - b);	/* raise invalid if b is not a NaN */
% -		return (cpack(a, t));	/* return NaN + NaN i */
% +		return (cpack(a + a, t));	/* return NaN + NaN i */
% 	}
% 	if (isinf(a)) {

All the csqrt() functions were missing NaN quieting. Their comment
style is still quite different.

% Index: s_csqrtf.c
% ===================================================================
% RCS file: /home/ncvs/src/lib/msun/src/s_csqrtf.c,v
% retrieving revision 1.3
% diff -u -2 -r1.3 s_csqrtf.c
% --- s_csqrtf.c	8 Aug 2008 00:15:16 -0000	1.3
% +++ s_csqrtf.c	25 Oct 2011 14:49:51 -0000
% @@ -55,5 +55,5 @@
% 	if (isnan(a)) {
% 		t = (b - b) / (b - b);	/* raise invalid if b is not a NaN */
% -		return (cpackf(a, t));	/* return NaN + NaN i */
% +		return (cpackf(a + a, t));	/* return NaN + NaN i */
% 	}
% 	if (isinf(a)) {
% Index: s_csqrtl.c
% ===================================================================
% RCS file: /home/ncvs/src/lib/msun/src/s_csqrtl.c,v
% retrieving revision 1.2
% diff -u -2 -r1.2 s_csqrtl.c
% --- s_csqrtl.c	8 Aug 2008 00:15:16 -0000	1.2
% +++ s_csqrtl.c	25 Oct 2011 14:50:01 -0000
% @@ -64,5 +64,5 @@
% 	if (isnan(a)) {
% 		t = (b - b) / (b - b);	/* raise invalid if b is not a NaN */
% -		return (cpackl(a, t));	/* return NaN + NaN i */
% +		return (cpackl(a + a, t));	/* return NaN + NaN i */
% 	}
% 	if (isinf(a)) {
% Index: s_ctanh.c
% ===================================================================
% RCS file: /home/ncvs/src/lib/msun/src/s_ctanh.c,v
% retrieving revision 1.2
% diff -u -2 -r1.2 s_ctanh.c
% --- s_ctanh.c	21 Oct 2011 06:30:16 -0000	1.2
% +++ s_ctanh.c	25 Oct 2011 14:30:18 -0000
% @@ -86,4 +86,15 @@
%
% 	/*
% +	 * XXX this is missing the dNaN/d(NaN) notation, which tells us the
% +	 * following:
% +	 * dNaN is a default NaN unrelated to any NaN args
% +	 * d(NaN) is a unary conversion (usually quieting) of the arg `NaN'
% +	 *
% +	 * XXX everything is missing:
% +	 * d(NaN1, NaN2) and d(NaN, y)
% +	 * which should be used for binary conversions.
% +	 *
% +	 * XXX this misspells I as i.
% +	 *
% 	 * ctanh(NaN + i 0) = NaN + i 0
% 	 *
% @@ -103,5 +114,5 @@
% 	if (ix >= 0x7ff00000) {
% 		if ((ix & 0xfffff) | lx)	/* x is NaN */
% -			return (cpack(x, (y == 0 ? y : x * y)));
% +			return (cpack(x + x, y == 0 ? y : x * y));
% 		SET_HIGH_WORD(x, hx - 0x40000000);	/* x = copysign(1, x) */
% 		return (cpack(x, copysign(0, isinf(y) ? y : sin(y) * cos(y))));
% Index: s_ctanhf.c
% ===================================================================
% RCS file: /home/ncvs/src/lib/msun/src/s_ctanhf.c,v
% retrieving revision 1.2
% diff -u -2 -r1.2 s_ctanhf.c
% --- s_ctanhf.c	21 Oct 2011 06:30:16 -0000	1.2
% +++ s_ctanhf.c	25 Oct 2011 14:30:57 -0000
% @@ -52,5 +52,5 @@
% 	if (ix >= 0x7f800000) {
% 		if (ix & 0x7fffff)
% -			return (cpackf(x, (y == 0 ? y : x * y)));
% +			return (cpackf(x + x, y == 0 ? y : x * y));
% 		SET_FLOAT_WORD(x, hx - 0x40000000);
% 		return (cpackf(x,

s_ctanh*.c is also missing NaN quieting.

I only got my tests in good enough shape to find the NaN quieting bugs
for complex functions a little after the above were committed. The
tests are still painful to configure for complex and long double
functions. Recently I started using ones written in pari for long
double functions. They run thousands of times slower than the C tests
but work to higher precisions.
One for logl looks like this:

% \p 100
% \r ../lib/ftoa.gp
% \r ../lib/ftoe.gp
% \r ../lib/ftof.gp
% \r ../lib/roundn.gp
%
% show(x, y, z) =
% {
% 	local(libm_v, pari_v, relerr, rlibm_v, rpari_v, rx, ry, rz, ulps);
%
% 	rx = roundn(x, prec);
% 	ry = roundn(y, prec);
% 	rz = roundn(z, prec);
% 	libm_v = ry + rz;
% 	pari_v = log(rx);
% 	rlibm_v = roundn(libm_v, prec);
% 	rpari_v = roundn(pari_v, prec);
% 	if (rlibm_v != rpari_v ||
% 	    roundn(libm_v, prec + 7) != roundn(pari_v, prec + 7),
% 		print("rx = " ftoe(rx, 36));
% 		print("rx = " ftoa(rx, prec));
% 		print("ry = " ftoa(ry, prec));
% 		print("rz = " ftoa(rz, prec));
% 		print("libm log(x) = " ftoa(libm_v, prec + 16));
% 		print("rlibm log(x) = " ftoa(rlibm_v, prec));
% 		print("pari log(x) = " ftoa(pari_v, prec + 16));
% 		print("rpari log(x) = " ftoa(rpari_v, prec));
% 		relerr = log(abs(libm_v / pari_v - 1)) / log(2);
% 		print("relerr = 2**" ftof(relerr, 5));
% 		ulps = abs(rlibm_v - pari_v) /
% 		    2^(ceil(log(abs(pari_v)) / log(2)) - prec);
% 		print("rnderr = " ftof(ulps, 5) " ulps");
% 	);
% }
%
% prec = 113;
% \r gin

This takes an input file named gin that looks like this:

% x = 0.998031648000081380208333333333333327; y = -0.00196843129057562491416520767945158578; z = -1.86045996676022698958009120338776959e-06; show(x, y, z);
% x = 0.998031618197758992513020833333333327; y = -0.00196846120931333443641252213257658578; z = -1.86040232903924733902992614348864124e-06; show(x, y, z);
% [... thousands or millions of lines.
Takes too long to do the billions of cases routinely done in lower
precisions by C tests]

It produces output like this:

% rx = 9.98039664824803670247395833333333327e-1
% rx = 0x1feff0e1111111111111111111111.0p-113
% ry = -0x100f38b08ee99e27e6493479dddef.0p-121
% rz = -0x1f7958665a30c56debcb8d67bfc00.0p-132
% libm log(x) = -0x101327db9bb4e4409406adeb8ad6e8000.0p-137
% rlibm log(x) = -0x101327db9bb4e4409406adeb8ad6e.0p-121
% pari log(x) = -0x101327db9bb4e4409406adeb8ad6e802e.0p-137
% rpari log(x) = -0x101327db9bb4e4409406adeb8ad6f.0p-121
% relerr = 2**-122.49
% rnderr = 0.50070 ulps

Bruce
--0-1667853581-1342584102=:1575--

From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 22:58:44 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 64AB4106564A for ; Sun, 12 Aug 2012 22:58:44 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id EBA7E8FC0A for ; Sun, 12 Aug 2012 22:58:43 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CMwhhO075541 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 08:58:43 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CMwbH0020966 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 08:58:37 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CMwb4A020965 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 08:58:37 +1000 (EST) (envelope-from peter)
Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 08:58:37 +1000 Resent-Message-ID: <20120812225837.GP20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6K6Sl37026185 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Fri, 20 Jul 2012 16:28:48 +1000 (EST) (envelope-from brde@optusnet.com.au) Received: from mail35.syd.optusnet.com.au (mail35.syd.optusnet.com.au [211.29.133.51]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q6K6Slip085664 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Fri, 20 Jul 2012 16:28:47 +1000 (EST) (envelope-from brde@optusnet.com.au) Received: from c122-106-171-246.carlnfd1.nsw.optusnet.com.au (c122-106-171-246.carlnfd1.nsw.optusnet.com.au [122.106.171.246]) by mail35.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id q6K6SK0h002631 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Fri, 20 Jul 2012 16:28:22 +1000 From: Bruce Evans Mail-Followup-To: freebsd-numerics@freebsd.org X-X-Sender: bde@besplex.bde.org To: Steve Kargl In-Reply-To: <20120720043004.GA7404@troutmask.apl.washington.edu> Message-ID: <20120720162633.B2162@besplex.bde.org> References: <20120717040118.GA86840@troutmask.apl.washington.edu> <20120717042125.GF66913@server.rulingia.com> <20120717043848.GB87001@troutmask.apl.washington.edu> <20120717225328.GA86902@server.rulingia.com> <20120717232740.GA95026@troutmask.apl.washington.edu> <20120718001337.GA87817@server.rulingia.com> <20120718123627.D1575@besplex.bde.org> <20120719213944.GA21199@server.rulingia.com> <20120719234425.GA6280@troutmask.apl.washington.edu> <20120720130309.P814@besplex.bde.org> <20120720043004.GA7404@troutmask.apl.washington.edu> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Mailman-Approved-At: Sun, 12 Aug 2012 23:55:59 +0000 Cc: Diane Bruce , John 
Baldwin , David Chisnall , Stephen Montgomery-Smith , Bruce Evans , Bruce Evans , David Schultz , Peter Jeremy , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 22:58:44 -0000 X-Original-Date: Fri, 20 Jul 2012 16:28:20 +1000 (EST) X-List-Received-Date: Sun, 12 Aug 2012 22:58:44 -0000

On Thu, 19 Jul 2012, Steve Kargl wrote:

> On Fri, Jul 20, 2012 at 02:12:20PM +1000, Bruce Evans wrote:
>> On Thu, 19 Jul 2012, Steve Kargl wrote:
>>
>>> I collected some of the float and double into a cheat sheet.
>>>
>>> Idioms used in libm with float type:
>>>
>>> int32_t xsb;
>>> u_int32_t hx;
>>>
>>> GET_FLOAT_WORD(hx, x);
>>>
>>> /* Get the sign bit of x */
>>> xsb = (hx >> 31) & 1;
>>
>> Getting it without shifting it (hx & 0x80000000) is more efficient and common.
>> You don't need to shift it in this example.
>
> I collected these from msun/src, when I was trying to understand
> the magic numbers. I suppose someone should audit the code for
> consistency. :-)
>
> laptop:kargl[219] grep " (hx>>31)&1" *c
> e_exp.c: xsb = (hx>>31)&1; /* sign bit of x */
> e_expf.c: xsb = (hx>>31)&1; /* sign bit of x */

The full example of e_exp*.c uses xsb as an index in a table later
(not a very good method, but perhaps better than a branch).
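
The difference between the shifted and unshifted forms can be sketched
as follows. This is only an illustration, assuming float is IEEE 754
binary32: float_word() is a hypothetical stand-in for GET_FLOAT_WORD(),
and the two-entry table is made up to show why a 0/1 index can be
useful.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for GET_FLOAT_WORD(): fetch the bits of a float. */
uint32_t
float_word(float x)
{
	uint32_t hx;

	memcpy(&hx, &x, sizeof(hx));
	return (hx);
}

/* Shifted form: yields 0 or 1, so it can index a 2-entry table. */
int
sign_index(float x)
{
	return ((float_word(x) >> 31) & 1);
}

/* Unshifted form: nonzero iff the sign bit is set; enough for a test. */
uint32_t
sign_mask(float x)
{
	return (float_word(x) & 0x80000000U);
}

void
check_sign_idioms(void)
{
	static const float half[2] = { 0.5F, -0.5F };

	assert(half[sign_index(-3.0F)] == -0.5F);	/* table lookup */
	assert(half[sign_index(3.0F)] == 0.5F);
	assert(sign_mask(-0.0F) != 0);		/* -0 has the sign bit set */
	assert(sign_mask(0.0F) == 0);
}
```

The shift is only needed when the value itself is used, as in the
table lookup. For a plain "is x negative" test the masked word is
sufficient.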
Bruce From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:00:29 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 74F66106566B for ; Sun, 12 Aug 2012 23:00:29 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id EDCB78FC0A for ; Sun, 12 Aug 2012 23:00:28 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN0S7k075568 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:00:28 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN0MQf021030 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:00:22 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CN0MXc021029 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:00:22 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:00:22 +1000 Resent-Message-ID: <20120812230022.GS20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6L5lQpu072752 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Sat, 21 Jul 2012 15:47:27 +1000 (EST) (envelope-from brde@optusnet.com.au) Received: from mail06.syd.optusnet.com.au (mail06.syd.optusnet.com.au [211.29.132.187]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q6L5lQqG095713 (version=TLSv1/SSLv3 
cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Sat, 21 Jul 2012 15:47:26 +1000 (EST) (envelope-from brde@optusnet.com.au) Received: from c122-106-171-246.carlnfd1.nsw.optusnet.com.au (c122-106-171-246.carlnfd1.nsw.optusnet.com.au [122.106.171.246]) by mail06.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id q6L5l8KX007586 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Sat, 21 Jul 2012 15:47:09 +1000 From: Bruce Evans Mail-Followup-To: freebsd-numerics@freebsd.org X-X-Sender: bde@besplex.bde.org To: Peter Jeremy In-Reply-To: <20120721003103.GA73662@server.rulingia.com> Message-ID: <20120721133600.R877@besplex.bde.org> References: <20120717084457.U3890@besplex.bde.org> <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717040118.GA86840@troutmask.apl.washington.edu> <20120717042125.GF66913@server.rulingia.com> <20120717043848.GB87001@troutmask.apl.washington.edu> <20120717225328.GA86902@server.rulingia.com> <20120717232740.GA95026@troutmask.apl.washington.edu> <20120718001337.GA87817@server.rulingia.com> <20120718123627.D1575@besplex.bde.org> <20120721003103.GA73662@server.rulingia.com> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Mailman-Approved-At: Sun, 12 Aug 2012 23:55:59 +0000 Cc: Diane Bruce , Bruce Evans , John Baldwin , David Chisnall , Stephen Montgomery-Smith , Bruce Evans , Steve Kargl , David Schultz , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:00:29 -0000 X-Original-Date: Sat, 21 Jul 2012 15:47:08 +1000 (EST) X-List-Received-Date: Sun, 12 Aug 2012 23:00:29 -0000 On Sat, 21 Jul 2012, Peter Jeremy wrote: > Hi Bruce or das@ or Steve, > > I have a question on the following code from s_ccosh.c: > % /* > % * cosh(NaN + I NaN) = d(NaN) + I d(NaN). > % * > % * cosh(NaN +- I Inf) = d(NaN) + I d(NaN). > % * Optionally raises the invalid floating-point exception. > % * Choice = raise. > % * > % * cosh(NaN + I y) = d(NaN) + I d(NaN). > % * Optionally raises the invalid floating-point exception for finite > % * nonzero y. Choice = don't raise (except for signaling NaNs). > % */ > % return (cpack((x * x) * (y - y), (x + x) * (y - y))); > > x is always NaN so the real part presumably just needs to be quietened > before returning - ie (x + x) would seem to be sufficient. Why does > the code use ((x * x) * (y - y))? The more complex expression is to handle more cases with the same code (mainly so as to avoid extra tests and branches for the classification. For exceptional cases, we don't care much if the expressions in the return statements are slower because they are more complex so as to be general, or if they are simpler and faster but larger because the classification is finer). We could try to avoid extra tests and branches for the usual cases by first doing a coarse classification and then a fine classification of the exceptional sub-cases. This would probably be efficient (if the CPU has reasonable branch prediction), but it will give larger code. Such expressions are complicated mainly to get the signs and NaN mixing right in all cases. Quoting this line again: > % return (cpack((x * x) * (y - y), (x + x) * (y - y))); > y has no restriction on its value so an arithmetic operation with x is > a good way to convert it to a NaN. Wouldn't (y + x) be sufficient? Remember that cos[h]() is even and sin[h]() is odd. 
(x * x) is to get an even sign for cos[h]() and (x + x) is to get an
odd sign for sin[h](). After undoing this magic, the expression
becomes:

> % return (cpack(cosh(x) * (y - y), sinh(x) * (y - y)));

and it remains to explain the (y - y) terms.

(It would probably work to use this expression. cosh() and sinh()
should get the signs and NaNs right. We optimize away these calls
because we can. This depends on knowing that x is Inf or NaN. Oops,
here x is only NaN, so there is more to explain:
- in other return statements, x can be either Inf or NaN. We copy the
  magic expressions for simplicity.
- signs are not very important for NaNs, but I like to make them
  consistent so that my test programs don't spew errors for unimportant
  inconsistencies (it is easier to make the inconsistencies not occur
  than to automatically decide if they are related to bugs. das@
  doesn't like spending any time on this detail).
- using the magic expressions gives the same behaviour for NaNs that
  calling the functions would. For example, cosh(x) naturally wants to
  square x. It may or may not do this explicitly for NaNs, but it
  should for the same reasons that we should (more so), and in fact it
  does: it returns x*x for both NaNs and Infs. This gets the sign
  right for Infs, and quietens NaNs, and gives a sign for the NaN
  result in the most natural way (whatever the hardware does for x*x)
- Here we have the option of quietening NaNs using some other
  expression, perhaps x+y (since we have another arg), or x+x (since we
  don't care about the sign of NaNs). But this might give gratuitously
  different results, depending on what the hardware does.
- x86 hardware mostly doesn't touch the sign of NaNs when they are
  operated on. The main exceptions are that fabs() clears the sign bit
  and negation may toggle the sign bit, depending on how negation is
  implemented by the compiler (i387 fchs toggles the sign, but
  compilers may use a subtraction operation, which doesn't).
  SSE gives annoying differences for NaNs, but IIRC these don't involve
  the sign bit for non-mixing operations.
- the quiet bit can give annoying differences. NaNs could be mixed
  using the expression x+y (if you prefer signs preserved) or x*y (if
  you prefer signs multiplied). The result should be some function of
  the inputs. IIRC, IEEE7xx specifies this, but doesn't specify the
  details. Most hardware does something like looking at the bits of
  the 2 NaNs and returning the bits of the "largest" one. The main
  differences are whether the sign and the quiet bits are to be
  included in the decision, and if so, how they affect the ordering.
  There are annoying differences from them sometimes affecting the
  ordering, depending on the hardware and the operand size. The above
  operates first purely on each of x and y, in order to quieten them
  before mixing:

    x*y            might use the quiet bit in the ordering decision
    (x*x)*(y-y)    kills the quiet bit before mixing, so that it cannot
                   affect the ordering decision

  Again, the second expression is what happens naturally if you write
  cosh(x)*sinh(y).
)

Back to explaining the (y - y) terms:

> % return (cpack(cosh(x) * (y - y), sinh(x) * (y - y)));

These are to optimize sin(y) and cos(y). sin() is odd and cos() is
even, but parity considerations no longer apply for y = +-Inf because
the value is now NaN, so its sign is meaningless. The sign of this NaN
is also unspecified, and we take advantage of this by letting it be
whatever the hardware gives for (y - y). We also want to be compatible
with what the function call would do, and fdlibm's sin() and cos() in
fact do exactly this, for the usual technical reasons. Now y can be
NaN, +-Inf or finite, and (y - y) handles all cases. In sin() and
cos(), this expression handles the NaN and +-Inf cases, so we are
consistent.
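
As a small illustration of how (y - y) covers all three classes of y,
here is a sketch that assumes IEEE 754 doubles and no fast-math
optimizations. The function names are hypothetical, and the check only
looks at whether the result is a NaN, not at its sign or payload, since
those depend on the hardware.

```c
#include <assert.h>
#include <math.h>

/* 0 for finite y; NaN for +-Inf (Inf - Inf is invalid) and for NaN. */
double
y_minus_y(double y)
{
	return (y - y);
}

/* The real-part expression from the quoted return statement. */
double
magic_real(double x, double y)
{
	return ((x * x) * (y - y));
}

void
check_y_minus_y(void)
{
	assert(y_minus_y(1.5) == 0.0);
	assert(isnan(y_minus_y(INFINITY)));
	assert(isnan(y_minus_y(-INFINITY)));
	assert(isnan(y_minus_y(NAN)));
	assert(isnan(magic_real(NAN, 1.5)));	/* NaN * 0 is still NaN */
	assert(isnan(magic_real(NAN, INFINITY)));
}
```

When x is NaN the product is NaN for every y, so a single expression
serves the finite, infinite and NaN cases without further tests.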
> My understanding is that:
> - Addition is generally faster than multiplication

Apart from the technical (parity) reasons, we don't really care about
efficiency in these exceptional cases, though we are doing perhaps
excessive optimizations to avoid the function calls. Another
organization of the function might start by evaluating the 4
sub-functions in all cases, so that the results can be used in all
cases later. IIRC, this is not done mainly because it isn't clear that
it doesn't give spurious exceptions. It also gives efficiency in a
natural way for special but non-exceptional cases like pure real and
pure imaginary.

> - Signs are irrelevant for NaN so merging the sign of x doesn't matter.
> - NaN + NaN returns the (quietened?) left-hand NaN

See above.

> - Inf + NaN returns the (quietened?) right-hand NaN

I forget if IEEE7xx requires this, but some C functions near this one
are required to treat a NaN as an indeterminate value which always
combines with Inf to produce Inf. The sign of a NaN could reasonably
matter here:

    +Inf + +NaN = +Inf since the indeterminate value of +NaN cannot be -Inf
    +Inf + -NaN = NaN since the indeterminate value of -NaN can be -Inf

> - finite + NaN returns the (quietened?) right-hand NaN
>
> Also, whilst things like ((x + x) * (y - y)) are reasonably efficient

Another point is that using consistent expressions allows common code
for the common parts (like x*x) if that is a good (space) optimization.

> on x86, few (if any) RISC architectures support exceptional conditions
> in hardware. My understanding is that SPARC would trap back into the
> userland handler (lib/libc/sparc64/fpu) on each operation unless both
> arguments and the result are normalised numbers. Explicitly fiddling
> with the FPU state would seem faster than multiple traps.

Here it is more the size of the magic expressions that gives slowness.
Almost all of them either start with a NaN or create one.
Even on x86, there is annoying slowness for overflow and underflow in a MD way. Exceptional cases for exp*() run much slower on i386(core2) than on i386(athlon-any(?)) or amd64(core2) or amd64(athlon), because instructions (if not their results) that overflow and underflow are handled slowly on core2(i387) but at full speed on all the other combinations including core2(SSE). Bruce From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:01:45 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id DE8131065675 for ; Sun, 12 Aug 2012 23:01:45 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id 625F28FC12 for ; Sun, 12 Aug 2012 23:01:45 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN1jtK075586 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:01:45 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN1cxZ021075 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:01:39 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CN1cLZ021074 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:01:38 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:01:38 +1000 Resent-Message-ID: <20120812230138.GW20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by 
server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6N59PIu011623 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 23 Jul 2012 15:09:25 +1000 (EST) (envelope-from brde@optusnet.com.au) Received: from mail04.syd.optusnet.com.au (mail04.syd.optusnet.com.au [211.29.132.185]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q6N59OLQ011009 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 23 Jul 2012 15:09:25 +1000 (EST) (envelope-from brde@optusnet.com.au) Received: from c122-106-171-246.carlnfd1.nsw.optusnet.com.au (c122-106-171-246.carlnfd1.nsw.optusnet.com.au [122.106.171.246]) by mail04.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id q6N594gK000640 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Mon, 23 Jul 2012 15:09:05 +1000 From: Bruce Evans Mail-Followup-To: freebsd-numerics@freebsd.org X-X-Sender: bde@besplex.bde.org To: Peter Jeremy In-Reply-To: <20120722233134.GB8033@server.rulingia.com> Message-ID: <20120723143145.T1353@besplex.bde.org> References: <20120717042125.GF66913@server.rulingia.com> <20120717043848.GB87001@troutmask.apl.washington.edu> <20120717225328.GA86902@server.rulingia.com> <20120717232740.GA95026@troutmask.apl.washington.edu> <20120718001337.GA87817@server.rulingia.com> <20120718123627.D1575@besplex.bde.org> <20120722121219.GC73662@server.rulingia.com> <20120722220031.GA7791@server.rulingia.com> <20120722221614.GB53450@zim.MIT.EDU> <20120722231056.GA84338@troutmask.apl.washington.edu> <20120722233134.GB8033@server.rulingia.com> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Mailman-Approved-At: Sun, 12 Aug 2012 23:55:59 +0000 Cc: Diane Bruce , Steve Kargl , John Baldwin , David Chisnall , Stephen Montgomery-Smith , Bruce Evans , Bruce Evans , David Schultz , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id:
"Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:01:46 -0000 X-Original-Date: Mon, 23 Jul 2012 15:09:04 +1000 (EST) X-List-Received-Date: Sun, 12 Aug 2012 23:01:46 -0000 On Mon, 23 Jul 2012, Peter Jeremy wrote: > On 2012-Jul-22 16:10:56 -0700, Steve Kargl wrote: >> The above isn't necessarily true. The Fortran standards from >> 2003 and 2008, very care about NaN. Under certain conditions, >> if one has something like >> >> x = sin(NaN) >> >> in Fortran, then the returned NaN must be the one in the function >> call. > > Even if it was a SNaN? My understanding is that SNaN should be > quietened if they are used in any further floating point operations. This may depend on the language specification, but it can be hard to even pass signaling NaNs without causing an exception. (The following interesting behaviour occurs on x86: - On i387, loading a float or double involves a conversion to long double. This conversion is considered to be an operation and it generates an exception and quietens signaling NaNs. - On i386, the ABI requires FP args to be passed on the stack. Compilers may or may not load float and double args as part of putting them on the stack. If they do, then by the previous point, functions never see float or double signaling NaN args, but functions always see long double signaling NaN args. Then if a buggy function returns x when it should return x+x, the bug is not noticed for the float and double cases but it is noticed (if tested for) in the long double case. - On at least Athlon-xp and Athlon64 before Phenom, compilers should always load at least double args as part of putting them on the stack (or other memory), since copying them through integer registers is very slow (it causes store-to-load mismatches). Compilers don't understand this well, but newer gcc is better and goes through the FPU more (going through SSE is good too).
Thus the previous points are affected by CFLAGS and compiler opt/pessimizations. - On SSE, no loads involve an implicit conversion and loads of signaling NaNs never trigger exception handling. - On amd64, SSE is the default for floats and doubles, so passing signaling NaNs never triggers the signal, although the ABI now requires the first few float and double args to be loaded into SSE registers for passing. - On i386, clang uses SSE too much, and thus triggers signaling NaNs less than gcc. ) >> Having libm, do >> >> if (x == NaN) >> return (x + x); >> >> does/may not return the correct NaN. It usually does return the correct NaN. There is no other reasonably portable way to quieten signaling NaNs. (x + x) works well for NaNs and Infs, but if the value might also be finite, I haven't found anything better than adding 0.0L to trigger the signaling NaNs without corrupting finite values. I changed places like e_hypot.c by adding 0 in the same precision as x (this should be written as plain 0, but I used 0.0 for e_hypot.c and 0.0F or (float)0.0 in e_hypotf.c) to give more consistent NaN mixing (convert before mixing). This still gave inconsistencies for long doubles, on x86 due to the above messes and maybe on sparc64 too. Adding 0.0L causes x to be promoted to long double, giving the consistent i387 behaviour for its registers. This is free (makes no difference) on i386, but on amd64 it gives consistency by forcing (das would say) gratuitous use of the i387, and on other arches it gives whatever behaviour long doubles give. > I presume you mean > if (isnan(x)) > return (x + x); Nah, in fdlibm, NaNs are classified using magic bit tests on bits that are kept handy in integer registers for other reasons :-). Using mainly bits is a significant optimization, but it can be better to mix bit operations with FP operations even for test operations, to keep more ALUs streaming. > Do you have a test case that shows that?
As far as I can tell, all > the FPUs I have access to will return a quietened variant of the > input NaN in this case (ie, only the signalling bit is altered). I test amd64, i386, ia64 and sparc64 and haven't noticed this failing, though I've noticed a few hundred billion other failures for NaNs. Bruce From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:01:53 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 75D41106564A for ; Sun, 12 Aug 2012 23:01:53 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id D6AAB8FC08 for ; Sun, 12 Aug 2012 23:01:52 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN1qQC075591 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:01:52 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN1kUZ021087 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:01:46 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CN1kLL021085 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:01:46 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:01:46 +1000 Resent-Message-ID: <20120812230146.GX20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6N4UK8u011309 (version=TLSv1/SSLv3 
cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 23 Jul 2012 14:30:21 +1000 (EST) (envelope-from brde@optusnet.com.au) Received: from mail36.syd.optusnet.com.au (mail36.syd.optusnet.com.au [211.29.133.76]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q6N4UKhd010905 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 23 Jul 2012 14:30:20 +1000 (EST) (envelope-from brde@optusnet.com.au) Received: from c122-106-171-246.carlnfd1.nsw.optusnet.com.au (c122-106-171-246.carlnfd1.nsw.optusnet.com.au [122.106.171.246]) by mail36.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id q6N4U1QB004152 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Mon, 23 Jul 2012 14:30:01 +1000 From: Bruce Evans Mail-Followup-To: freebsd-numerics@freebsd.org X-X-Sender: bde@besplex.bde.org To: Peter Jeremy In-Reply-To: <20120722220031.GA7791@server.rulingia.com> Message-ID: <20120723141319.P1189@besplex.bde.org> References: <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717040118.GA86840@troutmask.apl.washington.edu> <20120717042125.GF66913@server.rulingia.com> <20120717043848.GB87001@troutmask.apl.washington.edu> <20120717225328.GA86902@server.rulingia.com> <20120717232740.GA95026@troutmask.apl.washington.edu> <20120718001337.GA87817@server.rulingia.com> <20120718123627.D1575@besplex.bde.org> <20120722121219.GC73662@server.rulingia.com> <20120722220031.GA7791@server.rulingia.com> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Mailman-Approved-At: Sun, 12 Aug 2012 23:55:59 +0000 Cc: Diane Bruce , Bruce Evans , John Baldwin , David Chisnall , Stephen Montgomery-Smith , Bruce Evans , Steve Kargl , David Schultz , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:01:53 -0000 X-Original-Date: Mon, 23 Jul 2012 14:30:01 +1000 (EST) X-List-Received-Date: Sun, 12 Aug 2012 23:01:53 -0000 On Mon, 23 Jul 2012, Peter Jeremy wrote: > On 2012-Jul-22 22:12:19 +1000, Peter Jeremy wrote: >> I have simplified the default (NaN + I*NaN) return from catanh() to >> the minimum to ensure that both real & imaginary parts return as NaN. >> I've been doing some experiments on mixing NaNs using x87, SSE, SPARC64 >> and ARM (last on Linux) and have come to the conclusion that there is >> no standard behaviour: Given x & y as NaNs, (x+y) can return either >> x or y, possibly with the sign bit from the other operand, depending >> on the FPU. > > I've tried running my exception test program on Solaris/SPARC using > SunStudio and it gives different results to FreeBSD/sparc64 in some > cases so it looks like the FreeBSD/sparc64 exception handling code > is also buggy. It is certainly MD, but I think it should be fixed within an arch. It should not vary with CFLAGS depending on optimizations and which register set is selected, as happens on x86 due to the differences between x87 and SSE and the compiler's choice of the register set. For sparc64, 6 months ago before sparc64 switched from the old NetBSD(?)-contributed emulation to soft-float, the emulation was just broken, and I had to change parts of the library involving NaNs to get consistent behaviour (fortunately, the bugs seem to be all in user space). The behaviour varied with -mhard-quad-float. This option is not the default for sparc64 because no known sparc64 implementation implements it in hardware. The hardware only implements the opcodes, and traps to emulate them, while with soft-float the emulation uses similar code (but was more broken for NaNs) without traps. The traps just slow things down. I used -mhard-quad-float a bit anyway because it is easier to debug.
In the disassembly it gives nice opcodes while soft-float gives large code for libcalls, and gdb makes a mess of both stepping over the libcalls if you don't want to see them (gdb steps into inline functions when you don't want this) and of displaying them when you do want to see them (display of register variables in inline functions is broken on most arches, and the environment for the sparc64 libcall and trap code for emulation is especially challenging). > And, when the base gcc tries to shortcut floating point expressions > and execute them at compile time, it also gets exception handling > wrong in several cases (it'll correctly detect that a constant > expression evaluates to Inf or NaN but, in many cases, the NaN it > calculates is different to the x87 or SSE evaluation of the same > arguments). Possibly invalid optimizations, but I've had good results from evaluating 1.0/0 and 0.0/0 at compile time. gcc actually warns about these when I really want these to be evaluated without side effects (exceptions).
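A minimal sketch of the compile-time versus run-time distinction above, assuming a compiler that accepts 0.0 / 0.0 and 1.0 / 0 as static initializers (gcc may warn, as noted). The names folded_nan, folded_inf, double_bits and runtime_div are made up for illustration. Both NaNs are NaNs, but nothing guarantees that their sign and payload bits agree, which is the discrepancy Peter describes:

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Folded by the compiler: a static initializer must be a constant. */
static const double folded_nan = 0.0 / 0.0;
static const double folded_inf = 1.0 / 0;

/* Return the raw IEEE-754 bit pattern of a double. */
static uint64_t
double_bits(double x)
{
	uint64_t bits;

	memcpy(&bits, &x, sizeof(bits));
	return (bits);
}

/* Evaluate num / den on the FPU; volatile blocks constant folding. */
static double
runtime_div(double num, double den)
{
	volatile double n = num, d = den;

	return (n / d);
}
```

Comparing double_bits(folded_nan) with double_bits(runtime_div(0.0, 0.0)) on a given machine shows whether the compiler's NaN matches the FPU's. The answer is machine- and compiler-dependent, so no expected value is claimed here.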
Bruce From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:02:15 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 88B20106566C for ; Sun, 12 Aug 2012 23:02:15 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id 0DFF58FC12 for ; Sun, 12 Aug 2012 23:02:14 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN2Ejo075601 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:02:15 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN28Xn021111 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:02:08 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CN284p021110 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:02:08 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:02:08 +1000 Resent-Message-ID: <20120812230208.GA20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6O1QuWR064160 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Tue, 24 Jul 2012 11:26:56 +1000 (EST) (envelope-from brde@optusnet.com.au) Received: from mail30.syd.optusnet.com.au (mail30.syd.optusnet.com.au [211.29.133.193]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q6O1QtZW017601 (version=TLSv1/SSLv3 
cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Tue, 24 Jul 2012 11:26:56 +1000 (EST) (envelope-from brde@optusnet.com.au) Received: from c122-106-171-246.carlnfd1.nsw.optusnet.com.au (c122-106-171-246.carlnfd1.nsw.optusnet.com.au [122.106.171.246]) by mail30.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id q6O1QWLP003837 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Tue, 24 Jul 2012 11:26:33 +1000 From: Bruce Evans Mail-Followup-To: freebsd-numerics@freebsd.org X-X-Sender: bde@besplex.bde.org To: Stephen Montgomery-Smith In-Reply-To: <500D6182.8010003@missouri.edu> Message-ID: <20120724100014.I934@besplex.bde.org> References: <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717040118.GA86840@troutmask.apl.washington.edu> <20120717042125.GF66913@server.rulingia.com> <20120717043848.GB87001@troutmask.apl.washington.edu> <20120717225328.GA86902@server.rulingia.com> <20120717232740.GA95026@troutmask.apl.washington.edu> <20120718001337.GA87817@server.rulingia.com> <20120718123627.D1575@besplex.bde.org> <20120722121219.GC73662@server.rulingia.com> <20120722220031.GA7791@server.rulingia.com> <20120723141319.P1189@besplex.bde.org> <500CD98E.9080103@missouri.edu> <500D6182.8010003@missouri.edu> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Mailman-Approved-At: Sun, 12 Aug 2012 23:55:59 +0000 Cc: Diane Bruce , Bruce Evans , John Baldwin , David Chisnall , Bruce Evans , Steve Kargl , David Schultz , Peter Jeremy , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:02:15 -0000 X-Original-Date: Tue, 24 Jul 2012 11:26:32 +1000 (EST) X-List-Received-Date: Sun, 12 Aug 2012 23:02:15 -0000 On Mon, 23 Jul 2012, Stephen Montgomery-Smith wrote: > On 07/22/2012 11:56 PM, Stephen Montgomery-Smith wrote: >> This is the work I have done to produce casinh, casin, cacos and cacosh. >> The latter two took me a lot more time than I expected. It took me a >> lot of time to try to find the correct branches. > > Once I have cacos, cacosh turns out to be much easier than I thought: > > double complex > cacosh(double complex z) > { > complex double w; > > w = cacos(z); > if (signbit(cimag(w)) == 0) > return cpack(cimag(w),-creal(w)); > else > return cpack(-cimag(w),creal(w)); > } It's easy to fix all (?) the style and efficiency and exceptional case bugs in such a small example: - always use 'double complex', not 'complex double' (although this is backwards compared with the function names -- they have a c prefix and an [fl] suffix -- it is what C99 uses) - always put spaces after commas in parameter lists - in new functions, always put silly parentheses around return values, to conform to KNF. This is done for most or all complex functions that have been committed (s_ccosh.c, etc). - in old functions, never put silly parentheses around return values, since this is not fdlibm style. (Similarly for some spaces.) - always use fdlibm classification methods, using bits. Never use signbit(). signbit() is currently only used by csqrt(), which sets a bad example. It might be a compiler builtin, but it is apparently not, so it is a slow function call (takes about half as long as a whole function for medium-weight functions like cosf()). This doesn't matter much here, but it is easier to practice using the fdlibm classification methods in a simple context. - consider using copysign() instead of the '-' operator.
Now copysign() is the slow extern function for sign handling while '-' is a fast builtin. But it isn't clear that '-' is correct. It probably isn't, since s_ccosh.c uses copysign() a lot. I think the '-' operator works right on +-0, so s_ccosh.c is only using it to get the signs right for NaNs (the same as would happen using a simple formula). All the complex functions are required to have certain reflection properties, and I like the reflection to apply to the sign bit of NaNs too. Otherwise the sign of NaN results is unpredictable. das wouldn't care about this of course. What actually happens for the '-' operator applied to a NaN is very MD. i387 has a negation operator (fchs). It and the i387 absolute value operator (fabs) are about the only operators that can change the sign of a NaN. gcc generates an fchs for 'return (-x)'. Thus it is expected that -x toggles the sign of a NaN on i386. However, gcc also does the invalid optimization of rewriting 'return (x + (-x))' to 'return (x - x)'. i387 also has a reverse subtraction operator which may cause problems. Implementing -x as (0 - x) would be invalid for +-0 too, and the reverse subtraction operator makes this bug more likely. Even addition is not commutative for NaNs on some arches where the NaN mixing rules for x+y are bad. (Of course x+y != y+x for all NaNs in FP, but the bad hardware makes it different in bits too.) fabs and fchs also fail to trigger or quieten signaling NaNs. This is consistent with some dubious optimizations in software. Even fdlibm fabs*() just clears the sign bit and doesn't try to trigger signaling NaNs. SSE doesn't have absolute value or negation operators for FP, and these seem to always be implemented by setting and toggling bits, so the behaviour is consistent. There have been the following developments for i387: - gcc-3.3 implements fabs(x) and -x using builtin bit toggling that doesn't use the i387, even with -mi386 -O0.
- gcc-4.2 knows that direct hacking on FP like this is bad, or at least that it doesn't understand FP, so it uses the hardware. It now loads the values and uses hardware fabs and fchs on them. Now it is the hardware's fault for not triggering signaling NaNs. This accidentally fixes the case of floats and doubles, since signaling NaNs are triggered by the conversion on loading them. The optimization might go the other way, and change most of the copysign()s in s_ccosh.c to negations. When I worked on ccosh(), I didn't want to know the details and first just sprinkled the copysign() calls until the results were consistent with alternative implementations. Later I checked that some of them were consistent with reflection principles. Now even more than you want to know about NaNs: yesterday I checked whether soft-float on sparc64 had fixed some bugs with NaNs, so that I could delete my code that does extra work to not see these bugs. Unfortunately, the largest bugs are still there: signaling NaNs are never triggered on sparc64 with the default of -mno-hard-quad-float. The emulation just doesn't emulate, so it fails to trigger them in cases not like fabs() where everyone except the buggy emulation agrees that they should trigger. The non-triggering is complete: they aren't quietened, and they don't generate FE_INVALID. Compiling with -mhard-quad-float fixes this. But no one uses this, since it is about 4 times slower. Even without this, long doubles are about 1000 times slower on sparc64 than on x86 (only 300 times slower in cycle counts, but x86 has faster CPU clocks). 
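A small sketch of the sign-handling points above, under the assumption of IEEE-754 doubles in the default round-to-nearest mode. The helper names (double_to_bits, bits_to_double, sign_bit, negate_bits, negate_by_subtraction) are invented for this illustration and are not the fdlibm macros. It contrasts negation by toggling the sign bit, as SSE code does, with the invalid (0 - x) rewrite, which loses the sign of zero:

```c
#include <stdint.h>
#include <string.h>

/* Return the raw IEEE-754 bit pattern of a double. */
static uint64_t
double_to_bits(double x)
{
	uint64_t bits;

	memcpy(&bits, &x, sizeof(bits));
	return (bits);
}

/* Build a double from a raw bit pattern. */
static double
bits_to_double(uint64_t bits)
{
	double x;

	memcpy(&x, &bits, sizeof(x));
	return (x);
}

/* Return 1 if the sign bit is set; works for NaNs and -0 alike. */
static int
sign_bit(double x)
{

	return ((int)(double_to_bits(x) >> 63));
}

/* Negate by toggling the sign bit only; no FP operation is involved. */
static double
negate_bits(double x)
{

	return (bits_to_double(double_to_bits(x) ^ (UINT64_C(1) << 63)));
}

/* The invalid rewrite of -x as (0 - x); volatile keeps it at run time. */
static double
negate_by_subtraction(double x)
{
	volatile double v = x;

	return (0.0 - v);
}
```

With these, negate_bits(0.0) is -0 while negate_by_subtraction(0.0) is +0, and negate_bits() flips the sign of a NaN without touching its payload. Since it never performs an FP operation it also cannot trigger or quieten a signaling NaN, which matches the fabs/fchs behaviour described above.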
Bruce From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:02:54 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 9F3E9106564A for ; Sun, 12 Aug 2012 23:02:54 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id 24D778FC0A for ; Sun, 12 Aug 2012 23:02:53 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN2rbb075616 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:02:54 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN2lhO021143 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:02:47 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CN2lsU021142 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:02:47 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:02:47 +1000 Resent-Message-ID: <20120812230247.GD20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6O3I8IZ065193 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Tue, 24 Jul 2012 13:18:08 +1000 (EST) (envelope-from brde@optusnet.com.au) Received: from mail04.syd.optusnet.com.au (mail04.syd.optusnet.com.au [211.29.132.185]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q6O3I8K2017903 (version=TLSv1/SSLv3 
cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Tue, 24 Jul 2012 13:18:08 +1000 (EST) (envelope-from brde@optusnet.com.au) Received: from c122-106-171-246.carlnfd1.nsw.optusnet.com.au (c122-106-171-246.carlnfd1.nsw.optusnet.com.au [122.106.171.246]) by mail04.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id q6O3HqKj021486 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Tue, 24 Jul 2012 13:17:53 +1000 From: Bruce Evans Mail-Followup-To: freebsd-numerics@freebsd.org X-X-Sender: bde@besplex.bde.org To: Stephen Montgomery-Smith In-Reply-To: <500DAD41.5030104@missouri.edu> Message-ID: <20120724113214.G934@besplex.bde.org> References: <20120717084457.U3890@besplex.bde.org> <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717040118.GA86840@troutmask.apl.washington.edu> <20120717042125.GF66913@server.rulingia.com> <20120717043848.GB87001@troutmask.apl.washington.edu> <20120717225328.GA86902@server.rulingia.com> <20120717232740.GA95026@troutmask.apl.washington.edu> <20120718001337.GA87817@server.rulingia.com> <20120718123627.D1575@besplex.bde.org> <20120722121219.GC73662@server.rulingia.com> <500DAD41.5030104@missouri.edu> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Mailman-Approved-At: Sun, 12 Aug 2012 23:55:59 +0000 Cc: Diane Bruce , Bruce Evans , John Baldwin , David Chisnall , Bruce Evans , Steve Kargl , David Schultz , Peter Jeremy , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:02:54 -0000 X-Original-Date: Tue, 24 Jul 2012 13:17:52 +1000 (EST) X-List-Received-Date: Sun, 12 Aug 2012 23:02:54 -0000 On Mon, 23 Jul 2012, Stephen Montgomery-Smith wrote: > I just realized that catan(z) = reverse( catanh(reverse (z))), just like > casin relates to casinh (remember reverse(x+I*y) = y+I*x). This is a > consequence of catan and catanh being odd functions, as well as the standard > relation catan(z) = -I*catanh(I*z). > > So I would modify Peter's code by taking out the minus signs. > > Maybe it would make a difference if the answers involved -0. Quite likely. Note that keeping the real and imaginary parts separate is giving perfect reflection across some axes. Reflecting 0 is for other axes and it doesn't fall out so accidentally. > I feel that I am done with these functions for now. I tried to change my > comments to conform to the style given to me by Bruce. However spacing > inside mathematical expressions is something where I am inconsistent. Run it through indent to find all the inconsistencies. > The functions still need a lot of work to handle -0, infs and NaNs correctly. > I will leave that to you guys, because you seem so much better at it than me. > I still don't understand why the proper test is "if (x!=x) return(x+x)" > rather than "if (isnan(x)) return(NAN)". First, signaling NaNs are required to signal, in a strong way: (IEEE doesn't specify complex functions of course, but it requires this for all the functions and most of the operations that it specifies. The main exceptions (in IEEE-854-1987) are: - copying a signaling NaN without changing its precision. Whether this triggers is implementation-defined. So i387 conforms by triggering for changing the precision on loads iff the load does a non-null change to long double precision - the sign of a NaN is not specified. 
Triggering for signaling NaNs is still required "for every operation listed in Section 5". These operations don't seem to include absolute values or negation. Thus the i387 fabs and fchs conform because they are extensions, but they should never be used!?, since any use doesn't conform to the spirit of the standard.) Every "operation" involving a signaling NaN or invalid operation shall, if no "trap" occurs and a floating point result is to be delivered, deliver "a" quiet NaN as its result. "if (isnan(x)) return(NAN)" does extra work to fail to conform to this. First it uses the slow isnan() classification to be sure that no "trap" occurs. (Whether (x != x) triggers signaling NaNs, and whether the triggering involves a trap or just setting FE_INVALID is delicate. This was broken in the original i8087. Actually, on checking the details, I found that it was for quiet NaNs that it was broken. (x != x) should and does set FE_INVALID for signaling NaNs on all x87. This traps iff it is unmasked. Comparison doesn't deliver an FP result, so quietening of x is irrelevant. So (x != x) works right for signaling NaNs on all x87. The i8087 bug was that its comparison operators all set FE_INVALID, etc for quiet NaNs too. Later x87 have comparison operators that work right on all types of NaNs, and compilers use them. Thus you can write isnan() as (x != x) on at least x87 without getting a spurious FE_INVALID for quiet NaNs. I think this is unportable, so isnan() should be used if it is necessary to avoid such FE_INVALID. But here we want FE_INVALID, and we don't care if we get it from both the comparison and (x+x).) Second, returning x wouldn't conform, since the returned x might not be quiet if the original x isn't. Returning the `NAN' does conform, provided NAN is quiet, since the standard allows "a" (that is, any) quiet NaN to be returned. BTW, C has very bad names for NaNs and Infs. `NaN' should be spelled with its 'a' in lower case, as in the standard.
Here the NaN returned should be spelled qNaN or as my dNaN (default NaN) to emphasize that it is a quiet default NaN. The standard then recommends which NaN "should" be returned, and it isn't the C default NaN: Every operation [...] shall [...] if a floating point result is to be delivered, [when this result is specified to be "a" quiet NaN,] then this NaN _should_ be _one_ of its input NaNs. Note that precision conversions might be unable to deliver the same NaN. This is only a "should", but returning the default C NaN fails to conform to at least its spirit, by making no attempt to return the original NaN. Returning the original one is especially simple when there is only one arg. This arg just needs to be quietened and undergo any necessary conversions. (x+x) does this (modulo excessive conversions due to C's promotion rules). When the arg is a signaling NaN, it is impossible to return exactly it, so returning an unrelated default would be more reasonable, but doing this would take more code. It is best to depend on the hardware doing the right thing for (x+x) on signaling NaN x. The hardware is specified to give a quiet result and would need much the same complications as us to return a default instead of x slightly modified for signaling NaNs only. This also specifies NaN mixing: when there are _two_ args, _one_ of them _should_ be returned. Unfortunately, which one is not specified. But hardware operations should pick one, and we can do about as well as the hardware by mixing (= choosing) using (x+y).
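The "if (x != x) return (x + x)" idiom can be sketched as below. This is only an illustration under stated assumptions: IEEE-754 doubles, and the x86/ARM/SPARC convention that the top mantissa bit (bit 51) is the quiet bit, which some other architectures invert. propagate_nan, is_quiet_nan and the bit helpers are hypothetical names, not libm functions:

```c
#include <stdint.h>
#include <string.h>

#define	EXP_MASK	UINT64_C(0x7ff0000000000000)
#define	QUIET_BIT	(UINT64_C(1) << 51)	/* assumed quiet-bit position */

/* Return the raw IEEE-754 bit pattern of a double. */
static uint64_t
double_to_bits(double x)
{
	uint64_t bits;

	memcpy(&bits, &x, sizeof(bits));
	return (bits);
}

/* Build a double from a raw bit pattern. */
static double
bits_to_double(uint64_t bits)
{
	double x;

	memcpy(&x, &bits, sizeof(x));
	return (x);
}

/* Return 1 if x is a NaN with the (assumed) quiet bit set. */
static int
is_quiet_nan(double x)
{
	uint64_t bits = double_to_bits(x);

	return ((bits & EXP_MASK) == EXP_MASK && (bits & QUIET_BIT) != 0);
}

/*
 * NaN prologue of a hypothetical one-argument function: a NaN argument
 * is returned after passing through an FP operation, so the hardware
 * quietens it (and raises FE_INVALID if it was signaling).  Non-NaN
 * arguments are returned unchanged here.
 */
static double
propagate_nan(double x)
{
	volatile double v = x;	/* keep the arithmetic at run time */

	if (v != v)
		return (v + v);
	return (x);
}
```

Feeding it bits_to_double(UINT64_C(0x7ff0000000000123)), a signaling NaN under the stated assumption, should give back a quiet NaN. Whether the payload 0x123 survives is up to the hardware, since the standard only says the result "should" be one of the input NaNs.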
Bruce From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:06:23 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id D47A8106566B for ; Sun, 12 Aug 2012 23:06:23 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id 40EB38FC1B for ; Sun, 12 Aug 2012 23:06:22 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN6M9j075707 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:06:22 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN6GOp021352 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:06:16 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CN6G42021351 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:06:16 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:06:16 +1000 Resent-Message-ID: <20120812230616.GX20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6HBanFE074433 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Tue, 17 Jul 2012 21:36:49 +1000 (EST) (envelope-from brde@optusnet.com.au) Received: from mail05.syd.optusnet.com.au (mail05.syd.optusnet.com.au [211.29.132.186]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q6HBanHI067069 (version=TLSv1/SSLv3 
cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Tue, 17 Jul 2012 21:36:49 +1000 (EST) (envelope-from brde@optusnet.com.au) Received: from c122-106-171-246.carlnfd1.nsw.optusnet.com.au (c122-106-171-246.carlnfd1.nsw.optusnet.com.au [122.106.171.246]) by mail05.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id q6HBaRkk002705 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Tue, 17 Jul 2012 21:36:28 +1000 From: Bruce Evans Mail-Followup-To: freebsd-numerics@freebsd.org X-X-Sender: bde@besplex.bde.org To: Bruce Evans In-Reply-To: <20120717200931.U6624@besplex.bde.org> Message-ID: <20120717211625.E6848@besplex.bde.org> References: <20120529045612.GB4445@server.rulingia.com> <20120711223247.GA9964@troutmask.apl.washington.edu> <20120713114100.GB83006@server.rulingia.com> <201207130818.38535.jhb@freebsd.org> <9EB2DA4F-19D7-4BA5-8811-D9451CB1D907@theravensnest.org> <20120713155805.GC81965@zim.MIT.EDU> <20120714120432.GA70706@server.rulingia.com> <20120717084457.U3890@besplex.bde.org> <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717200931.U6624@besplex.bde.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Mailman-Approved-At: Sun, 12 Aug 2012 23:55:59 +0000 Cc: Diane Bruce , John Baldwin , David Chisnall , Stephen Montgomery-Smith , Bruce Evans , Steve Kargl , David Schultz , Peter Jeremy , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:06:23 -0000 X-Original-Date: Tue, 17 Jul 2012 21:36:27 +1000 (EST) X-List-Received-Date: Sun, 12 Aug 2012 23:06:23 -0000

On Tue, 17 Jul 2012, Bruce Evans wrote:

> On Mon, 16 Jul 2012, Stephen Montgomery-Smith wrote:
>
>> On 07/16/2012 06:37 PM, Stephen Montgomery-Smith wrote:
>>> ...
>>> We might get lucky, and find that the definitions of csqrt and clog in
>>> the C99 standard are already set up so that the naive formulas for
>>> cacosh, etc, just work. But whether they do or whether they don't, I
>>> think I can do it. (As a first guess, I think that catanh and casinh
>>> will work "out of the box" but cacosh is going to take a bit more work.)
>
> See below what happened for naive formulas for ccosh.

I forgot the below before. The following is ccosh() using mainly the naive formula.

% /*
%  * Hyperbolic cosine of a double complex argument.
%  *
%  * Most exceptional values are automatically correctly handled by the
%  * standard formula. See n1124.pdf for details.
%  */
%
% #include <complex.h>
% #include <math.h>
%
% #include "math_private.h"
%
% double complex
% ccosh1(double complex z)
% {
% 	double x, y;
%
% 	x = creal(z);
% 	y = cimag(z);
%
% 	/*
% 	 * This is subtler than it looks. We handle the cases x == 0 and
% 	 * y == 0 directly not for efficiency, but to avoid multiplications
% 	 * that don't work like we need. In these cases, the result must
% 	 * be almost pure real or a pure imaginary, except it has type
% 	 * float complex and its impure part may be -0. Multiplication of
% 	 * +-0 by an infinity or a NaN would give a NaN for the impure part,
% 	 * and would raise an unwanted exception.
% 	 *
% 	 * We depend on cos(y) and sin(y) never being precisely +-0 except
% 	 * when y is +-0 to prevent getting NaNs from other cases of
% 	 * +-Inf*+-0. This is true in infinite precision (since pi is
% 	 * irrational), and checking shows that it is also true after
% 	 * rounding to float precision.
% 	 */
% 	if (x == 0 && !isfinite(y))
% 		return (cpack(y - y, copysign(0, x * (y - y))));
% 	if (y == 0)
% 		return (cpack(cosh(x), isnan(x) ? copysign(0, (x + x) * y) :
% 		    copysign(0, x) * y));
% 	if (isinf(x) && !isfinite(y))
% 		return (cpack(x * x, x * (y - y)));
% 	if (fabs(x) > 710 && fabs(x) < 1455) {
% 		z = __ldexp_cexp(cpack(fabs(x), y), -1);
% 		return (cpack(creal(z), cimag(z) * copysign(1, x)));
% 	}
% 	return (cpack(cosh(x) * cos(y), sinh(x) * sin(y)));
% }

This was compared with the committed version and changed minimally to have the same behaviour for exceptional args. (Actually, more the reverse -- try the naive version and see what behaviour falls out of it for exceptional args. Then change both to match the behaviour specified in C99. This version was changed minimally to keep it as short as possible and the committed version was changed maximally to have at least a comment for all the exceptional cases.)

One problem with this is that it is hard to see even what the exceptional cases are. Another is that the classification macros are very slow and only slightly easier to use than the bit-based classifications in the committed version (the latter could probably be macro-ized to make them almost as easy to use as the classification macros).

I added the __ldexp_cexp() code later to maintain the binary compatibility of this with the committed version. So it's not using the naive formula for large |x|.
Bruce From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:09:20 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 24B68106566C for ; Sun, 12 Aug 2012 23:09:20 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id B22A18FC0C for ; Sun, 12 Aug 2012 23:09:19 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN9Jhb075779 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:09:19 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN9DXw021519 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:09:13 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CN9DlJ021518 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:09:13 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:09:13 +1000 Resent-Message-ID: <20120812230913.GL20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6JBGW5W011534 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Thu, 19 Jul 2012 21:16:32 +1000 (EST) (envelope-from brde@optusnet.com.au) Received: from mail17.syd.optusnet.com.au (mail17.syd.optusnet.com.au [211.29.132.198]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q6JBGWZi080367 (version=TLSv1/SSLv3 
cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Thu, 19 Jul 2012 21:16:32 +1000 (EST) (envelope-from brde@optusnet.com.au) Received: from c122-106-171-246.carlnfd1.nsw.optusnet.com.au (c122-106-171-246.carlnfd1.nsw.optusnet.com.au [122.106.171.246]) by mail17.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id q6JBG8fE019274 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Thu, 19 Jul 2012 21:16:10 +1000 From: Bruce Evans Mail-Followup-To: freebsd-numerics@freebsd.org X-X-Sender: bde@besplex.bde.org To: Stephen Montgomery-Smith In-Reply-To: <5007AD41.9070000@missouri.edu> Message-ID: <20120719205347.T2601@besplex.bde.org> References: <20120714120432.GA70706@server.rulingia.com> <20120717084457.U3890@besplex.bde.org> <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717200931.U6624@besplex.bde.org> <5006D13D.2080702@missouri.edu> <20120718205625.GA409@troutmask.apl.washington.edu> <500725F2.7060603@missouri.edu> <20120719025345.GA1376@troutmask.apl.washington.edu> <50077987.1080307@missouri.edu> <20120719032706.GA1558@troutmask.apl.washington.edu> <5007826D.7060806@missouri.edu> <5007AD41.9070000@missouri.edu> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Mailman-Approved-At: Sun, 12 Aug 2012 23:55:59 +0000 Cc: Diane Bruce , Steve Kargl , John Baldwin , David Chisnall , Bruce Evans , Bruce Evans , David Schultz , Peter Jeremy , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:09:20 -0000 X-Original-Date: Thu, 19 Jul 2012 21:16:08 +1000 (EST) X-List-Received-Date: Sun, 12 Aug 2012 23:09:20 -0000 On Thu, 19 Jul 2012, Stephen Montgomery-Smith wrote: > I did a ULP test on clog. The test code is attached. 
(Not the cleanest
> code, I know, but it does the job.) It needs the mpfr and unuran ports
> installed.
>
> To my shock, I found that under certain circumstances, the ULP in the real
> part was huge. The problem is when hypot(x,y) is close to 1, because then
> the real part of clog is close to zero. I was seeing ULPs in the thousands.

Better than GULPs in the thousands :-). This is not the problem that I first thought it might be.

> I struggled to find a solution, and now I think I have the ULP down to about
> 2. I am going to work on it more tomorrow to see if I can get ULP down even
> further.

I peeked at the Apple code (complex.c). It is not very sophisticated, but uses log1p() and doesn't use hypot(). I don't know if this is inherently more accurate or is just more efficient.

Apple code has a more sophisticated casinh() based on Kahan's old papers about branch cuts and nothing's sign bit.

I don't know if Apple's copyright inhibits us using complex.c. Its ugliness prevents me using it. We didn't use it for cexp() or ccosh(), and our code is better.
Bruce From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:09:26 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 51AAC106564A for ; Sun, 12 Aug 2012 23:09:26 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id DE9018FC16 for ; Sun, 12 Aug 2012 23:09:25 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN9PL9075784 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:09:25 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN9JGR021531 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:09:19 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CN9Jcv021530 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:09:19 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:09:19 +1000 Resent-Message-ID: <20120812230919.GN20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6JICldO017633 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Fri, 20 Jul 2012 04:12:47 +1000 (EST) (envelope-from brde@optusnet.com.au) Received: from mail17.syd.optusnet.com.au (mail17.syd.optusnet.com.au [211.29.132.198]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q6JIClaP081936 (version=TLSv1/SSLv3 
cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Fri, 20 Jul 2012 04:12:47 +1000 (EST) (envelope-from brde@optusnet.com.au) Received: from c122-106-171-246.carlnfd1.nsw.optusnet.com.au (c122-106-171-246.carlnfd1.nsw.optusnet.com.au [122.106.171.246]) by mail17.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id q6JICPAb003129 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Fri, 20 Jul 2012 04:12:33 +1000 From: Bruce Evans Mail-Followup-To: freebsd-numerics@freebsd.org X-X-Sender: bde@besplex.bde.org To: Stephen Montgomery-Smith In-Reply-To: <50084322.7020401@missouri.edu> Message-ID: <20120720035001.W4053@besplex.bde.org> References: <20120714120432.GA70706@server.rulingia.com> <20120717084457.U3890@besplex.bde.org> <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717200931.U6624@besplex.bde.org> <5006D13D.2080702@missouri.edu> <20120718205625.GA409@troutmask.apl.washington.edu> <500725F2.7060603@missouri.edu> <20120719025345.GA1376@troutmask.apl.washington.edu> <50077987.1080307@missouri.edu> <20120719032706.GA1558@troutmask.apl.washington.edu> <5007826D.7060806@missouri.edu> <5007AD41.9070000@missouri.edu> <20120719205347.T2601@besplex.bde.org> <50084322.7020401@missouri.edu> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Mailman-Approved-At: Sun, 12 Aug 2012 23:55:59 +0000 Cc: Diane Bruce , Bruce Evans , John Baldwin , David Chisnall , Bruce Evans , Steve Kargl , David Schultz , Peter Jeremy , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:09:26 -0000 X-Original-Date: Fri, 20 Jul 2012 04:12:25 +1000 (EST) X-List-Received-Date: Sun, 12 Aug 2012 23:09:26 -0000

On Thu, 19 Jul 2012, Stephen Montgomery-Smith wrote:

> I have the ULP down to about 1.2 now. I don't see how I can do better,
> because I have to invoke log functions twice, and probably each one has a ULP
> of about 0.6.
>
> Also I decided to use 1/2 log(x*x+y*y) when x and y are not too large.

That's close to Apple complex.c clog(). Once you don't use hypot(), it is clearly best to use log1p():

    log(sqrt(x*x + y*y)) = log(|x|) + 1/2 log(1 + (y*y)/(x*x))
                         = log(|x|) + 1/2 log1p((y*y)/(x*x))

where |x| >= |y| so that log1p()'s arg is as small as possible.

> I am really rather proud of how I got around the large ULP when hypot(x,y) is
> close to 1. I would be glad if any of you could look at the code when you
> get a chance.

Will look more closely later. I see that you already use log1p() and much more. Apple clog() uses not so much more, mainly by depending on extra precision in hardware. The above also avoids overflow and use of hypot() for all finite x and y, but is probably too simple.
Bruce From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:09:59 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 27BDE1065674 for ; Sun, 12 Aug 2012 23:09:59 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id A1EB98FC16 for ; Sun, 12 Aug 2012 23:09:58 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN9waE075800 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:09:58 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN9qYR021565 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:09:52 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CN9qaS021564 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:09:52 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:09:52 +1000 Resent-Message-ID: <20120812230952.GQ20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6K7xrkM027009 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Fri, 20 Jul 2012 17:59:53 +1000 (EST) (envelope-from brde@optusnet.com.au) Received: from mail05.syd.optusnet.com.au (mail05.syd.optusnet.com.au [211.29.132.186]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q6K7xrK5085896 (version=TLSv1/SSLv3 
cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Fri, 20 Jul 2012 17:59:53 +1000 (EST) (envelope-from brde@optusnet.com.au) Received: from c122-106-171-246.carlnfd1.nsw.optusnet.com.au (c122-106-171-246.carlnfd1.nsw.optusnet.com.au [122.106.171.246]) by mail05.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id q6K7xboh024498 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Fri, 20 Jul 2012 17:59:38 +1000 From: Bruce Evans Mail-Followup-To: freebsd-numerics@freebsd.org X-X-Sender: bde@besplex.bde.org To: Stephen Montgomery-Smith In-Reply-To: <50085441.4090305@missouri.edu> Message-ID: <20120720162953.N2162@besplex.bde.org> References: <20120714120432.GA70706@server.rulingia.com> <20120717084457.U3890@besplex.bde.org> <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717200931.U6624@besplex.bde.org> <5006D13D.2080702@missouri.edu> <20120718205625.GA409@troutmask.apl.washington.edu> <500725F2.7060603@missouri.edu> <20120719025345.GA1376@troutmask.apl.washington.edu> <50077987.1080307@missouri.edu> <20120719032706.GA1558@troutmask.apl.washington.edu> <5007826D.7060806@missouri.edu> <5007AD41.9070000@missouri.edu> <20120719205347.T2601@besplex.bde.org> <50084322.7020401@missouri.edu> <20120720035001.W4053@besplex.bde.org> <50085441.4090305@missouri.edu> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Mailman-Approved-At: Sun, 12 Aug 2012 23:55:59 +0000 Cc: Diane Bruce , Bruce Evans , John Baldwin , David Chisnall , Bruce Evans , Steve Kargl , David Schultz , Peter Jeremy , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:09:59 -0000 X-Original-Date: Fri, 20 Jul 2012 17:59:37 +1000 (EST) X-List-Received-Date: Sun, 12 Aug 2012 23:09:59 -0000

On Thu, 19 Jul 2012, Stephen Montgomery-Smith wrote:

First a bit from a previous reply:

bde:
>> Will look more closely later. I see that you already use log1p() and much
>> more. Apple clog() uses not so much more, mainly by depending on extra
>> precision in hardware. The above also avoids overflow and use of hypot()
>> for all finite x and y, but is probably too simple.

Now looked more closely. I put the real part of it in my test framework as loghypot() and tested this against alternative versions like ordinary hypot().

stephen:
> The Apple solution has a problem. The two invocations of log might
> produce results that are nearly identical, but with opposite signs.
> Think about x = y = 1/sqrt(2).

Simple log(hypot()) has huge errors, but the Apple version seemed to be generally accurate and was much more accurate near |z| = 1. This pointed to some bugs in your version. These show up before any cancelation bugs in the Apple version.

> I think their solution merely avoids the overflow/underflow problem, and was
> not meant to address the problem I worked on. However, their solution will
> fail if x=1e100 and y=1e-100.
>
> This caused me to realize that my solution failed to account for underflow,
> so here is my next iteration.

Tested this version. First I reformatted it all (the comment indentation was furthest from KNF). indent(1) did a good job and I only reformatted a couple of long lines manually:

% #include
% #include
% #include
% #include
%
% #include "math_private.h"
%
% /*
%  * gcc doesn't implement complex multiplication or division correctly, so we
%  * need to handle infinities specially. We turn on this pragma to notify
%  * conforming c99 compilers that the fast-but-incorrect code that gcc
%  * generates is acceptable, since the special cases have already been
%  * handled.
%  */
% #pragma STDC CX_LIMITED_RANGE ON
%
% double complex
% clog(double complex z)
% {
% 	double x, y, h, t1, t2, t3;
% 	double x0, y0, x1, y1;
%
% 	x = creal(z);
% 	y = cimag(z);
%
% #define NANMIX_APPLE_CLOG_COMPAT 1
% 	/* Handle special cases when x or y is NAN. */
% 	if (isnan(x)) {
% 		if (isinf(y))
% 			return (cpack(INFINITY, NAN));
% 		else {
% #if NANMIX_HYPOTF_COMPAT
% 			y = fabs(y + 0);
% 			t1 = (y - y) / (y - y);	/* Raise invalid flag if y is
% 						 * not NaN */
% 			t1 = fabs(x + 0) + t1;	/* Mix NaN(s). */
% #elif NANMIX_APPLE_CLOG_COMPAT
% 			/* No actual mixing. */
% 			return (cpack(x, copysign(x, y)));
% #else
% 			t1 = (y - y) / (y - y);	/* Raise invalid flag if y is
% 						 * not NaN */
% #endif
% 			return (cpack(t1, t1));
% 		}
% 	} else if (isnan(y)) {
% 		if (isinf(x))
% 			return (cpack(INFINITY, NAN));
% 		else {
% #ifdef NANMIX_HYPOTF_COMPAT
% 			x = fabs(x + 0);
% 			t1 = (x - x) / (x - x);	/* Raise invalid flag if x is
% 						 * not NaN */
% 			t1 = t1 + fabs(y + 0);	/* Mix NaN(s). */
% #elif NANMIX_APPLE_CLOG_COMPAT
% 			/* No actual mixing. */
% 			return (cpack(y, y));
% #else
% 			t1 = (x - x) / (x - x);	/* Raise invalid flag if x is
% 						 * not NaN */
% #endif
% 			return (cpack(t1, t1));
% 		}

Here I added squillions of ifdefs so that the results come out the same for NaNs as with log(hypot()) and with apple clog(). This is not very important, but my tests report all differences and it is hard to see important differences when there are many for NaNs.

In general, the result of an operation with 1 NaN arg should be the same NaN quieted, and the result of an operation with 2 NaN args should combine both args in a similar way to a hardware operation. For hypot(x, y), the result should be the same as from the naive formula sqrt(x*x+y*y) executed by the hardware.
The real part of clog() should be the result of log() on this (according to the simpler rules for 1 NaN). Unfortunately, fdlibm hypot() historically first replaces x and y by |x| and |y|. fabs() clears the sign bit even for NaNs, and the result is quite different from that given by the naive formula. The above adds some fabs()'s and other tweaks for consistency with hypot(). I don't try hard to make the imaginary part consistent with atan2(). It would be simpler to just call hypot() for NaN cases. Apple clog() mostly doesn't mix pairs of NaNs. It mostly doesn't even try to apply the default conversions to a single NaN. The above adds another set of tweaks for consistency with it. % } else if (isfinite(x) && isfinite(y) && % (fabs(x) > 1e308 || fabs(y) > 1e308)) % /* % * To avoid unnecessary overflow, if x or y are very large, % * divide x and y by M_E, and then add 1 to the logarithm. % * This depends on M_E being larger than sqrt(2). % */ % return (cpack(log(hypot(x / M_E, y / M_E) + 1), atan2(y, x))); % else if (fabs(x) < 1e-50 || fabs(y) < 1e-50 || % fabs(x) > 1e50 || fabs(y) > 1e50) % /* % * Because atan2 and hypot conform to C99, this also covers % * all the edge cases when x or y are 0 or infinite. % */ % return (cpack(log(hypot(x, y)), atan2(y, x))); % else { % /* We don't need to worry about overflow in x*x+y*y. */ % h = x * x + y * y; % if (h < 0.1 || h > 10) % return (cpack(log(h) / 2, atan2(y, x))); % /* Take extra care if h is moderately close to 1 */ % else { % /* % * x0 and y0 are good approximations to x and y, but % * have their bits trimmed so that double precision % * floating point is capable of calculating x0*x0 + % * y0*y0 - 1 exactly. % */ % x0 = ldexp(floor(ldexp(x, 24)), -24); % x1 = x - x0; % y0 = ldexp(floor(ldexp(y, 24)), -24); % y1 = y - y0; This fails for example when x is near 1 and y is tiny. Say x == 1 and y == +-1e-19. 
The result for the real part should be about y**2/2 (almost exactly for y this tiny I think), but the above makes the following messes, since the ldexp()s don't actually trim to 24 bits:
- for tiny y > 0, y times 2**24 is < 1 and floor() gives 0; y0 is 0 and y1 is y. The error has been moved to a worse place.
- for tiny y < 0, y times 2**24 is > -1 and floor() gives -1; y0 is -2**-24, which is much larger in magnitude than y, and y1 is about as large to compensate. The error has been increased as well as moved.

Since y is tiny, the final result was just 0 in all cases that I looked at. Apple clog() handles this case by using the y**2/2 approximation. The result of 0 instead of y**2/2 is about 9 GULPS of error, even when everything is reduced to float precision (multiply this by 2**29 for the error in double precision. I don't remember the SI prefixes that high).

I tried simple fixes, but they made the total number of errors larger.

ldexp() is slow, and the following are standard ways of trimming low bits much more efficiently and correctly:
- fdlibm mostly uses GET/EXTRACT to extract the bits, then masks low bits, then uses SET/INSERT to put the changed bits in an FP variable. (The naming differences are historical bugs.)
- for double precision, most hi+lo decompositions split the double almost in half, with 26 hi bits and 27 lo bits. But often you only need 24 hi bits. Then it is usually faster to assign the double to a float to get 24 hi bits. This requires that the double be small enough to be almost representable as a float:

      x0 = (float)x;
      x1 = x - x0;

  Here x0 can be float or the same type as x, or even different. On i386, there are compiler bugs which may require using STRICT_ASSIGN() here (at least if x is an expression). The compiler bugs are best avoided by working throughout with variables of type double_t (for double functions).
- a more general method is to add and subtract a magic number like 0x1p40 or 0x1.8p40:

      x0 = x + 0x1p40 - 0x1p40;
      x1 = x - x0;

  This is usually faster, but more fragile. The magic 40 trims to 24 (+-1?) bits, provided the evaluation precision is 64 bits and there are no compiler bugs. On i386, the evaluation precision is variable but usually 53, and libm uses code like the above with magic for 53 in a couple of places.

In my attempted simple fix, I used the float cast method. This failed because y = +-1e-19 is too tiny for anything good to happen. The actual y was already exactly representable in 24 bits, so y0 was y and y1 was 0...

% 			/* Notice that mathematically, h = t1*(1+t3). */
% 			t1 = x0 * x0 + y0 * y0;

... t1 = 1 + 1e-38 rounds to 1, and there is no extra precision.

I tried logl(hypotl()). The extra precision provided by long doubles was very far from making much difference. hypot()'s methods don't seem to apply, since for tiny y hypot(1, y) ~= 1 + y**2/2 ~= 1, so the case of tiny y is especially easy for hypot().

% 			t2 = 2 * x0 * x1 + x1 * x1 + 2 * y0 * y1 + y1 * y1;
% 			t3 = t2 / t1;
% 			return (cpack((log1p(t1 - 1) + log1p(t3)) / 2,
% 			    atan2(y, x)));

Isn't log1p(t1 - 1) no different from log(t1)?

% 		}
% 	}
% }

I was going to leave all the fixes to you, but now want to try the approximation for tiny y :-).
Bruce From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:10:05 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 450E71065672 for ; Sun, 12 Aug 2012 23:10:05 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id BA1B58FC15 for ; Sun, 12 Aug 2012 23:10:04 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CNA4LA075803 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:10:04 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN9whB021573 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:09:58 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CN9wj8021572 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:09:58 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:09:58 +1000 Resent-Message-ID: <20120812230958.GR20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6K9JVu5027846 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Fri, 20 Jul 2012 19:19:31 +1000 (EST) (envelope-from brde@optusnet.com.au) Received: from mail14.syd.optusnet.com.au (mail14.syd.optusnet.com.au [211.29.132.195]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q6K9JTo7086076 (version=TLSv1/SSLv3 
cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Fri, 20 Jul 2012 19:19:29 +1000 (EST) (envelope-from brde@optusnet.com.au) Received: from c122-106-171-246.carlnfd1.nsw.optusnet.com.au (c122-106-171-246.carlnfd1.nsw.optusnet.com.au [122.106.171.246]) by mail14.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id q6K9JFpB006604 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Fri, 20 Jul 2012 19:19:16 +1000 From: Bruce Evans Mail-Followup-To: freebsd-numerics@freebsd.org X-X-Sender: bde@besplex.bde.org To: Bruce Evans In-Reply-To: <20120720162953.N2162@besplex.bde.org> Message-ID: <20120720184114.B2790@besplex.bde.org> References: <20120714120432.GA70706@server.rulingia.com> <20120717084457.U3890@besplex.bde.org> <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717200931.U6624@besplex.bde.org> <5006D13D.2080702@missouri.edu> <20120718205625.GA409@troutmask.apl.washington.edu> <500725F2.7060603@missouri.edu> <20120719025345.GA1376@troutmask.apl.washington.edu> <50077987.1080307@missouri.edu> <20120719032706.GA1558@troutmask.apl.washington.edu> <5007826D.7060806@missouri.edu> <5007AD41.9070000@missouri.edu> <20120719205347.T2601@besplex.bde.org> <50084322.7020401@missouri.edu> <20120720035001.W4053@besplex.bde.org> <50085441.4090305@missouri.edu> <20120720162953.N2162@besplex.bde.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Mailman-Approved-At: Sun, 12 Aug 2012 23:55:59 +0000 Cc: Diane Bruce , John Baldwin , David Chisnall , Stephen Montgomery-Smith , Bruce Evans , Steve Kargl , David Schultz , Peter Jeremy , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:10:05 -0000 X-Original-Date: Fri, 20 Jul 2012 19:19:15 +1000 (EST) X-List-Received-Date: Sun, 12 Aug 2012 23:10:05 -0000

On Fri, 20 Jul 2012, Bruce Evans wrote:

> I was going to leave all the fixes to you, but now want to try the
> approximation for tiny y :-).

This version now beats Apple clog() for accuracy. The maximum difference observed is down from the exa-ulp range to 16 ulps with all of the 4-16 ulp differences checked against pari being inaccuracies in Apple clog(). 2**28 cases were tested, but most of the errors found look like this:

% x = 0x3fe0000000000000 0.5
% y = 0x3fec000000000000 0.875
% loghypota(x, y) = 0x3ff7fe054587e01ea000 0x3f7fc0a8b0fc03d4 0.00775209326798261336156
% loghypot(x, y) = 0x3ff7fe054587e01f2000 0x3f7fc0a8b0fc03e4 0.00775209326798262723934
% err = +0x8000 16.00000
% pari log(0.5+0.875*I):
% 0.007752093267982627075427023021 + 1.051650212548373667459867312*I

so my tests barely cover fractions like 1/8 and the coverage may be too limited. New errors kept turning up as I expanded the coverage.

% #include
% #include
% #include
% #include
%
% #include "math_private.h"
%
% /*
%  * gcc doesn't implement complex multiplication or division correctly, so we
%  * need to handle infinities specially. We turn on this pragma to notify
%  * conforming c99 compilers that the fast-but-incorrect code that gcc
%  * generates is acceptable, since the special cases have already been
%  * handled.
%  */
% #pragma STDC CX_LIMITED_RANGE ON
%
% double complex
% clog(double complex z)
% {
% 	double x, y, h, t1, t2, t3;
% 	double x0, y0, x1, y1;
%
% 	x = creal(z);
% 	y = cimag(z);
%
% #define NANMIX_APPLE_CLOG_COMPAT 1
% 	/* Handle special cases when x or y is NAN.
*/ % if (isnan(x)) { % if (isinf(y)) % return (cpack(INFINITY, NAN)); % else { % #if NANMIX_HYPOTF_COMPAT % y = fabs(y + 0); % t1 = (y - y) / (y - y); /* Raise invalid flag if y is % * not NaN */ % t1 = fabs(x + 0) + t1; /* Mix NaN(s). */ % #elif NANMIX_APPLE_CLOG_COMPAT % /* No actual mixing. */ % return (cpack(x, copysign(x, y))); % #else % t1 = (y - y) / (y - y); /* Raise invalid flag if y is % * not NaN */ % #endif % return (cpack(t1, t1)); % } % } else if (isnan(y)) { % if (isinf(x)) % return (cpack(INFINITY, NAN)); % else { % #ifdef NANMIX_HYPOTF_COMPAT % x = fabs(x + 0); % t1 = (x - x) / (x - x); /* Raise invalid flag if x is % * not NaN */ % t1 = t1 + fabs(y + 0); /* Mix NaN(s). */ % #elif NANMIX_APPLE_CLOG_COMPAT % /* No actual mixing. */ % return (cpack(y, y)); % #else % t1 = (x - x) / (x - x); /* Raise invalid flag if x is % * not NaN */ % #endif % return (cpack(t1, t1)); % } % } else if (isfinite(x) && isfinite(y) && % (fabs(x) > 1e308 || fabs(y) > 1e308)) % /* % * To avoid unnecessary overflow, if x or y are very large, % * divide x and y by M_E, and then add 1 to the logarithm. % * This depends on M_E being larger than sqrt(2). % */ % return (cpack(log(hypot(x / M_E, y / M_E)) + 1, atan2(y, x))); Important fix here. 1 was added to the log() arg instead of to log(). % else if (fabs(x) < 1e-50 && fabs(y) != 1 || % fabs(y) < 1e-50 && fabs(x) != 1 || % fabs(x) > 1e50 || fabs(y) > 1e50) The special case for |x| == 1 and |y| == 1 was defeated by returning for it here. % /* % * Because atan2 and hypot conform to C99, this also covers % * all the edge cases when x or y are 0 or infinite. % */ % return (cpack(log(hypot(x, y)), atan2(y, x))); % else { % /* We don't need to worry about overflow in x*x+y*y. 
*/ % h = x * x + y * y; % if (h < 0.1 || h > 10) % return (cpack(log(h) / 2, atan2(y, x))); % /* Take extra care if h is moderately close to 1 */ % else { % #if 1 % if (fabs(x) == 1) % return (cpack(log1p(y * y) / 2, % atan2(y, x))); % if (fabs(y) == 1) % return (cpack(log1p(x * x) / 2, % atan2(y, x))); % #endif Special case. It seems too special, but Apple clog() doesn't do any more, and this with the other special cases is enough to beat Apple clog(). % /* % * x0 and y0 are good approximations to x and y, but % * have their bits trimmed so that double precision % * floating point is capable of calculating x0*x0 + % * y0*y0 - 1 exactly. % */ The only way for x*x + y*y to be _very_ near 1 in infinite precision is for |x| or |y| to be 1 (I think). Other cases are bounded away from 1, and if you are lucky the bound is fairly far from 1 so that sloppier approximations work OK. Mathematicians should determine the bound exactly using continued fractions or something like they do for approximations to N*Pi/2. This becomes especially interesting in high precisions where you can't hope to get near the worst case by random testing. % x0 = ldexp(floor(ldexp(x, 24)), -24); % x1 = x - x0; % y0 = ldexp(floor(ldexp(y, 24)), -24); % y1 = y - y0; This has a chance of working iff the bound away from 1 is something like 2**-24. Otherwise, multiplying by 2**24 and flooring a positive value will just produce 0. 2**-24 seems much too small a bound. My test coverage is not wide enough to hit many bad cases. % /* Notice that mathematically, h = t1*(1+t3). 
*/ % t1 = x0 * x0 + y0 * y0; % t2 = 2 * x0 * x1 + x1 * x1 + 2 * y0 * y1 + y1 * y1; % t3 = t2 / t1; % return (cpack((log1p(t1 - 1) + log1p(t3)) / 2, % atan2(y, x))); % } % } % } Bruce From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:10:13 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 2E1E010656B7 for ; Sun, 12 Aug 2012 23:10:13 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id 4D6338FC17 for ; Sun, 12 Aug 2012 23:10:12 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CNACvT075811 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:10:12 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CNA5vq021599 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:10:06 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CNA5Yg021598 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:10:05 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:10:05 +1000 Resent-Message-ID: <20120812231005.GT20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6KGPYTX032431 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Sat, 21 Jul 2012 02:25:34 +1000 (EST) (envelope-from brde@optusnet.com.au) Received: 
from mail03.syd.optusnet.com.au (mail03.syd.optusnet.com.au [211.29.132.184]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q6KGPYnS087133 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Sat, 21 Jul 2012 02:25:34 +1000 (EST) (envelope-from brde@optusnet.com.au) Received: from c122-106-171-246.carlnfd1.nsw.optusnet.com.au (c122-106-171-246.carlnfd1.nsw.optusnet.com.au [122.106.171.246]) by mail03.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id q6KGPHnj024560 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Sat, 21 Jul 2012 02:25:19 +1000 From: Bruce Evans Mail-Followup-To: freebsd-numerics@freebsd.org X-X-Sender: bde@besplex.bde.org To: Stephen Montgomery-Smith In-Reply-To: <50095CDE.4050507@missouri.edu> Message-ID: <20120721011112.D5008@besplex.bde.org> References: <20120714120432.GA70706@server.rulingia.com> <20120717084457.U3890@besplex.bde.org> <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717200931.U6624@besplex.bde.org> <5006D13D.2080702@missouri.edu> <20120718205625.GA409@troutmask.apl.washington.edu> <500725F2.7060603@missouri.edu> <20120719025345.GA1376@troutmask.apl.washington.edu> <50077987.1080307@missouri.edu> <20120719032706.GA1558@troutmask.apl.washington.edu> <5007826D.7060806@missouri.edu> <5007AD41.9070000@missouri.edu> <20120719205347.T2601@besplex.bde.org> <50084322.7020401@missouri.edu> <20120720035001.W4053@besplex.bde.org> <50085441.4090305@missouri.edu> <20120720162953.N2162@besplex.bde.org> <20120720184114.B2790@besplex.bde.org> <50095CDE.4050507@missouri.edu> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Mailman-Approved-At: Sun, 12 Aug 2012 23:55:59 +0000 Cc: Diane Bruce , Bruce Evans , John Baldwin , David Chisnall , Bruce Evans , Steve Kargl , David Schultz , Peter Jeremy , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: 
list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:10:13 -0000 X-Original-Date: Sat, 21 Jul 2012 02:25:16 +1000 (EST) X-List-Received-Date: Sun, 12 Aug 2012 23:10:13 -0000 On Fri, 20 Jul 2012, Stephen Montgomery-Smith wrote: > I worked quite a bit on my clog function last night. Could you look at this > one? Will look later. Another set of fixes for the old one now. The maximum error now seems to be about 1 ulp. I prefer it inline, but can duplicate it. % } else if (isfinite(x) && isfinite(y) && % (fabs(x) > 1e308 || fabs(y) > 1e308)) % /* % * To avoid unnecessary overflow, if x or y are very large, % * divide x and y by M_E, and then add 1 to the logarithm. % * This depends on M_E being larger than sqrt(2). % */ % return (cpack(log(hypot(x / M_E, y / M_E)) + 1, atan2(y, x))); Add 1 in the right place here. % else if (fabs(x) < 1e-50 && fabs(y) != 1 || % fabs(y) < 1e-50 && fabs(x) != 1 || % fabs(x) > 1e50 || fabs(y) > 1e50) Old modifications for |x| == 1 and |y| == 1. This needs to be expanded a bit, since cases near 1 are now handled too. % /* % * Because atan2 and hypot conform to C99, this also covers % * all the edge cases when x or y are 0 or infinite. % */ % return (cpack(log(hypot(x, y)), atan2(y, x))); % else { % /* We don't need to worry about overflow in x*x+y*y. */ % h = x * x + y * y; It is also good to avoid underflow. Underflow used to be avoided by using hypot() if x or y is tiny, but the above modification lets these cases through to here when |x| or |y| is 1. The square then underflows for values below about 1e-172. Except on i386, extra exponent range may prevent underflow, depending on optimization. To be fixed. % if (h < 0.1 || h > 10) % return (cpack(log(h) * 0.5, atan2(y, x))); Start replacing divisions by multiplications. 
% /* Take extra care if h is moderately close to 1 */ % else { % double ox = x; % double oy = y; I want to take absolute values and swap x and y so that 0 <= y <= x, so preserve the original values here. % x = fabs(x); % y = fabs(y); % if (x < y) { % x0 = x; % x = y; % y = x0; % } Normalize. % /* % * x0 and y0 are good approximations to x and y, but % * have their bits trimmed so that double precision % * floating point is capable of calculating x0*x0 + % * y0*y0 - 1 exactly. % */ It wasn't exact, even after fixing the hi+lo decompositions of x and y. Only the products were. The following fixes both. % x0 = (float)x; % x1 = x - x0; % y0 = (float)y; % y1 = y - y0; A good way to do the hi+lo decompositions. % if (y0 < 1e-50) { % t2 = y / x; % return (cpack(logf(x) + t2 * (t2 * 0.5), % atan2(oy, ox))); % } When y is tiny, this is necessary to handle the case x == 1. It also handles many nearby cases. Note that when y is tiny, x is fairly large, since h is fairly large. Thus t2 is also tiny. The formula is sloppy when x is not exactly 1, but works OK. t2*t2/2 must be parenthesized as above to avoid spurious underflow when t2 ~= sqrt(smallest denormal). Spurious underflow gives very large relative errors (I forget if they were ~8 ulps, ~436 ulps, or mega-ulps...). One of the main reasons for this special case is to avoid this spurious underflow. I developed this mainly for the float precision case on i386 and saw confusing behaviour when i386's extra precision sometimes avoided the underflow. % /* Notice that mathematically, h = t1*(1+t3). */ % t1 = x0 * x0; /* Exact. */ % t2 = y0 * y0; /* Exact. */ % STRICT_ASSIGN(double, t3, t1 + t2); % t2 = (t1 - t3) + t2; % t1 = t3; /* Now t1+t2 is hi+lo for x0*x0+y0*y0.*/ t1+t2 is not necessarily exact, and we must do multi-precision addition of it to get enough accuracy. Even this is not accurate enough when the final t3 is tiny. % t2 += 2 * x0 * x1 + x1 * x1 + 2 * y0 * y1 + y1 * y1; Add the rest of the low terms to t2. 
As before, except now t2 usually starts nonzero to hold the low part of x0*x0+y0*y0. % t3 = t2 / t1; t2 is tiny relative to t1 (about 2**53 times smaller), so the accuracy of this is unimportant, except when t1 is near 1, in which case log() of it reduces it to near 0 and the error becomes dominated by that of t3. The special case for tiny y above is supposed to avoid reaching here in such cases, but the classification above is sloppy so it isn't clear that it does. We want to avoid reaching here unless t1 is a few tens or hundreds or thousands of ulps away from 1. % return (cpack((log(t1) + log1p(t3)) * 0.5, % atan2(oy, ox))); Suppose that t1 is exactly 1 and we reach here. t3 is only guaranteed to be less than about 2**-53. Even when it is 2**-205 times larger than that, we have a large error for it. One bad case is when it underflows. log1p(t3) * 0.5 gives a grouping of terms which allows t3 to underflow to 0. We can't put the factor of 0.5 inside log1p() without using a special case as above. Plain log(t1) is the same as log1p(t1 - 1) here. 
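The renormalization idiom used in the code above (STRICT_ASSIGN(double, t3, t1 + t2); t2 = (t1 - t3) + t2; t1 = t3;) can be exercised on its own. A minimal sketch, assuming round-to-nearest and |a| >= |b| on entry. The name fast_two_sum is hypothetical, and a volatile temporary is used here as a portable stand-in for STRICT_ASSIGN():

```c
/*
 * Renormalize a + b into hi + lo, where hi is the rounded sum and lo is
 * the rounding error of that addition.  Requires |a| >= |b| (or a == 0);
 * under that condition hi + lo equals a + b exactly.
 */
static void
fast_two_sum(double a, double b, double *hi, double *lo)
{
	volatile double w = a + b;	/* force rounding to double */

	*lo = (a - w) + b;		/* what the rounding discarded */
	*hi = w;
}
```

For example, fast_two_sum(1e16, 1.0, &hi, &lo) leaves hi = 1e16 and lo = 1.0: the 1 that a plain addition would drop survives in lo.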
% } % } % } Bruce From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:10:24 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 2AD9B1065678 for ; Sun, 12 Aug 2012 23:10:24 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id B36FF8FC0A for ; Sun, 12 Aug 2012 23:10:23 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CNANtk075819 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:10:23 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CNAGUL021612 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:10:17 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CNAG5H021611 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:10:16 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:10:16 +1000 Resent-Message-ID: <20120812231016.GV20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6L2Y9Qg063003 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Sat, 21 Jul 2012 12:34:10 +1000 (EST) (envelope-from brde@optusnet.com.au) Received: from mail11.syd.optusnet.com.au (mail11.syd.optusnet.com.au [211.29.132.192]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q6L2Y9fR095299 (version=TLSv1/SSLv3 
cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Sat, 21 Jul 2012 12:34:09 +1000 (EST) (envelope-from brde@optusnet.com.au) Received: from c122-106-171-246.carlnfd1.nsw.optusnet.com.au (c122-106-171-246.carlnfd1.nsw.optusnet.com.au [122.106.171.246]) by mail11.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id q6L2XrP4003178 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Sat, 21 Jul 2012 12:33:54 +1000 From: Bruce Evans Mail-Followup-To: freebsd-numerics@freebsd.org X-X-Sender: bde@besplex.bde.org To: Stephen Montgomery-Smith In-Reply-To: <5009BB46.3050001@missouri.edu> Message-ID: <20120721122309.R856@besplex.bde.org> References: <20120714120432.GA70706@server.rulingia.com> <20120717084457.U3890@besplex.bde.org> <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717200931.U6624@besplex.bde.org> <5006D13D.2080702@missouri.edu> <20120718205625.GA409@troutmask.apl.washington.edu> <500725F2.7060603@missouri.edu> <20120719025345.GA1376@troutmask.apl.washington.edu> <50077987.1080307@missouri.edu> <20120719032706.GA1558@troutmask.apl.washington.edu> <5007826D.7060806@missouri.edu> <5007AD41.9070000@missouri.edu> <20120719205347.T2601@besplex.bde.org> <50084322.7020401@missouri.edu> <20120720035001.W4053@besplex.bde.org> <50085441.4090305@missouri.edu> <20120720162953.N2162@besplex.bde.org> <20120720184114.B2790@besplex.bde.org> <50095CDE.4050507@missouri.edu> <20120721011112.D5008@besplex.bde.org> <5009BB46.3050001@missouri.edu> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Mailman-Approved-At: Sun, 12 Aug 2012 23:56:00 +0000 Cc: Diane Bruce , Bruce Evans , John Baldwin , David Chisnall , Bruce Evans , Steve Kargl , David Schultz , Peter Jeremy , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:10:24 -0000 X-Original-Date: Sat, 21 Jul 2012 12:33:53 +1000 (EST) X-List-Received-Date: Sun, 12 Aug 2012 23:10:24 -0000 On Fri, 20 Jul 2012, Stephen Montgomery-Smith wrote: > On 07/20/2012 11:25 AM, Bruce Evans wrote: >> On Fri, 20 Jul 2012, Stephen Montgomery-Smith wrote: > > >> % x0 = (float)x; >> % x1 = x - x0; >> % y0 = (float)y; >> % y1 = y - y0; >> >> A good way to do the hi+lo decompositions. > > That was the way I tried first. But it didn't work for me! > > But I see you changed things further down, so that is probably why it works > for you. I didn't understand what was happening before, but think I can explain it now: - the above gives correct hi+lo decompositions. Both hi and lo are usually nonzero. The code below didn't really understand hi+lo decompositions, and often increased the final error (relative to naive code). - your code often gives null but backwards hi+lo decompositions, with hi = 0 and lo = full value. The code below didn't really understand hi+lo decompositions. But when hi = 0, it is especially easy to add and multiply it exactly, so the final error isn't increased so often. 
Bruce From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:10:32 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D0A771065672 for ; Sun, 12 Aug 2012 23:10:32 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id 2BFEF8FC14 for ; Sun, 12 Aug 2012 23:10:32 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CNAVFS075825 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:10:31 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CNAPjM021630 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:10:25 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CNAP5G021627 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:10:25 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:10:25 +1000 Resent-Message-ID: <20120812231025.GX20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org X-source-folder: /home/peter/mail/mosh Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6MJTvfv006628 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 23 Jul 2012 05:29:57 +1000 (EST) (envelope-from brde@optusnet.com.au) Received: from mail05.syd.optusnet.com.au (mail05.syd.optusnet.com.au [211.29.132.186]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id 
q6MJTvis009491 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 23 Jul 2012 05:29:57 +1000 (EST) (envelope-from brde@optusnet.com.au) Received: from c122-106-171-246.carlnfd1.nsw.optusnet.com.au (c122-106-171-246.carlnfd1.nsw.optusnet.com.au [122.106.171.246]) by mail05.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id q6MJTaub002417 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Mon, 23 Jul 2012 05:29:37 +1000 From: Bruce Evans Mail-Followup-To: freebsd-numerics@freebsd.org X-X-Sender: bde@besplex.bde.org To: Stephen Montgomery-Smith In-Reply-To: <50095CDE.4050507@missouri.edu> Message-ID: <20120723044308.X6145@besplex.bde.org> References: <20120714120432.GA70706@server.rulingia.com> <20120717084457.U3890@besplex.bde.org> <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717200931.U6624@besplex.bde.org> <5006D13D.2080702@missouri.edu> <20120718205625.GA409@troutmask.apl.washington.edu> <500725F2.7060603@missouri.edu> <20120719025345.GA1376@troutmask.apl.washington.edu> <50077987.1080307@missouri.edu> <20120719032706.GA1558@troutmask.apl.washington.edu> <5007826D.7060806@missouri.edu> <5007AD41.9070000@missouri.edu> <20120719205347.T2601@besplex.bde.org> <50084322.7020401@missouri.edu> <20120720035001.W4053@besplex.bde.org> <50085441.4090305@missouri.edu> <20120720162953.N2162@besplex.bde.org> <20120720184114.B2790@besplex.bde.org> <50095CDE.4050507@missouri.edu> MIME-Version: 1.0 Content-Type: MULTIPART/MIXED; BOUNDARY="0-1467877092-1342985376=:6145" X-Mailman-Approved-At: Sun, 12 Aug 2012 23:56:00 +0000 Cc: Diane Bruce , Bruce Evans , John Baldwin , David Chisnall , Bruce Evans , Steve Kargl , David Schultz , Peter Jeremy , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:10:32 -0000 X-Original-Date: Mon, 23 Jul 2012 05:29:36 +1000 (EST) X-List-Received-Date: Sun, 12 Aug 2012 23:10:32 -0000 This message is in MIME format. The first part should be readable text, while the remaining parts are likely unreadable without MIME-aware tools. --0-1467877092-1342985376=:6145 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Replying again to this... On Fri, 20 Jul 2012, Stephen Montgomery-Smith wrote: > I worked quite a bit on my clog function last night. Could you look at this > one? I ended up deleting most of your changes in this one. > The trick for when hypot(x,y) is close to 1 can only work so well, and you > are testing it out of its range of applicability. But for the special case x > is close to 1, and y is close to zero, I have a much better formula. I can > produce three other formula so that I can handle |x| close to 1, y small, and > |y| close to 1 and x small. > > After fixing it up, could you send it back as an attachment? That will make > it easier for me to put it back into my system, and work more on it. It will be painful for everyone to understand and merge. This time as an attachment as well as (partly) inline with commentary. 
% #include <complex.h> % #include <float.h> % #include <math.h> % % #include "math_private.h" % % static inline void % xonorm(double *a, double *b) % { % double w; % % if (fabs(*a) < fabs(*b)) { % w = *a; % *a = *b; % *b = w; % } % STRICT_ASSIGN(double, w, *a + *b); % *b = (*a - w) + *b; % *a = w; % } % % #define xnorm(a, b) xonorm(&(a), &(b)) % % #define xspadd(a, b, c) do { \ % double __tmp; \ % \ % __tmp = (c); \ % xonorm(&__tmp, &(a)); \ % (b) += (a); \ % (a) = __tmp; \ % } while (0) % % static inline void % xonormf(float *a, float *b) % { % float w; % % if (fabsf(*a) < fabsf(*b)) { % w = *a; % *a = *b; % *b = w; % } % STRICT_ASSIGN(float, w, *a + *b); % *b = (*a - w) + *b; % *a = w; % } % % #define xnormf(a, b) xonormf(&(a), &(b)) % % #define xspaddf(a, b, c) do { \ % float __tmp; \ % \ % __tmp = (c); \ % xonormf(&__tmp, &(a)); \ % (b) += (a); \ % (a) = __tmp; \ % } while (0) Above are my standard extra-precision macros from my math_private.h, cut down for use here and named with an x. Then expanded and pessimized to swap the args. Optimal callers ensure that they don't need swapping. See s_fma.c for a fuller algorithm that doesn't need swapping but does more operations. I started to copy that, but s_fma.c doesn't seem to have anything as convenient as xspaddf(). Further expanded and pessimized to do STRICT_ASSIGN(). Optimal callers use float_t and double_t so that STRICT_ASSIGN() is unnecessary. Compiler bugs break algorithms like the above on i386 unless float_t and double_t, or STRICT_ASSIGN() are used. Fixing the bugs would give the same slowness as STRICT_ASSIGN(), but globally by doing it for all assignments, so even I now consider these bugs to be features and C standards to be broken for not allowing them. I first discussed fixing them with gcc maintainers over 20 years ago. % % double complex % clog(double complex z) % { % double x, y, h, t1, t2, t3; % double ax, ay, x0, y0, x1, y1; % % x = creal(z); % y = cimag(z); % % /* Handle NaNs using the general formula to mix them right. 
*/ % if (x != x || y != y) % return (cpack(log(hypot(x, y)), atan2(y, x))); I replaced all my messy ifdefs for this by the function call. Also changes isnan() to a not-so-magic test. Though isnan() is about the only FP classification macro that I trust the compiler for. % % ax = fabs(x); % ay = fabs(y); % if (ax < ay) { % t1 = ax; % ax = ay; % ay = t1; % } I got tired of repeating fabs()'s, and need to know which arg is larger. % % /* % * To avoid unnecessary overflow, if x or y are very large, divide x % * and y by M_E, and then add 1 to the logarithm. This depends on % * M_E being larger than sqrt(2). % * % * XXX bugs to fix: % * - underflow if one of x or y is tiny. e_hypot.c avoids this % * problem, and optimizes for the case that the ratio of the % * args is very large, by returning the absolute value of % * the largest arg in this case. % * - not very accurate. Could divide by 2 and add log(2) in extra % * precision. A general scaling step that divides by 2**k and % * adds k*log(2) in extra precision might be good for reducing % * the range so that we don't have to worry about overflow or % * underflow in the general steps. This needs the previous step % * of eliminating large ratios of args so that the args can be % * scaled on the same scale. % * - s/are/is/ in comment. % */ % if (ax > 1e308) % return (cpack(log(hypot(x / M_E, y / M_E)) + 1, atan2(y, x))); No need to avoid infinities here. No need to test y now that we know ax is largest. % % if (ax == 1) { % if (ay < 1e-150) % return (cpack((ay * 0.5) * ay, atan2(y, x))); % return (cpack(log1p(ay * ay) * 0.5, atan2(y, x))); % } Special case mainly for when (ay * ay) is rounded down to the smallest denormal. % % /* % * Because atan2 and hypot conform to C99, this also covers all the % * edge cases when x or y are 0 or infinite. % */ % if (ax < 1e-50 || ay < 1e-50 || ax > 1e50 || ay > 1e50) % return (cpack(log(hypot(x, y)), atan2(y, x))); Not quite right. 
It is the ratio that matters more than these magic magnitudes. % % /* We don't need to worry about overflow in x*x+y*y. */ % % /* % * Take extra care so that ULP of real part is small if h is % * moderately close to 1. If one only cares about the relative error % * of the whole result (real and imaginary part taken together), this % * algorithm is overkill. % * % * This algorithm does a rather good job if |h-1| >= 1e-5. The only % * algorithm that I can think of that would work for any h close to % * one would require hypot(x,y) being computed using double double % * precision precision (i.e. double as many bits in the mantissa as % * double precision). % * % * x0 and y0 are good approximations to x and y, but have their bits % * trimmed so that double precision floating point is capable of % * calculating x0*x0 + y0*y0 - 1 exactly. % */ Comments not all updated. This one especially out of date. % x0 = ax; % SET_LOW_WORD(x0, 0); % x1 = ax - x0; % y0 = ay; % SET_LOW_WORD(y0, 0); % y1 = ay - y0; Sloppy decomposition with only 21 bits in hi part. Since we are short of bits, we shouldn't burn 5 like this for efficiency. In float precision, all the multiplications are exact since 24 splits exactly. % /* Notice that mathematically, h = t1*(1+t3). */ % #if 0 Old version. Still drops bits unnecessarily, although I added several full hi/lo decomposition steps to it. % t1 = x0 * x0; /* Exact. */ % t2 = y0 * y0; /* Exact. */ Comments not quite right. All of the multiplications are as exact as possible. They would all be exact if we could split 53 in half, and did so. 
% STRICT_ASSIGN(double, t3, t1 + t2); % t2 = (t1 - t3) + t2; % t1 = t3; /* Now t1+t2 is hi+lo for x0*x0+y0*y0.*/ % t2 += 2 * x0 * x1; % STRICT_ASSIGN(double, t3, t1 + t2); % t2 = (t1 - t3) + t2; % t1 = t3; % t2 += 2 * y0 * y1; % STRICT_ASSIGN(double, t3, t1 + t2); % t2 = (t1 - t3) + t2; % t1 = t3; % t2 += x1 * x1 + y1 * y1; % STRICT_ASSIGN(double, t3, t1 + t2); % t2 = (t1 - t3) + t2; % t1 = t3; /* Now t1+t2 is hi+lo for x*x+y*y.*/ % #else % t1 = x1 * x1; % t2 = y1 * y1; % xnorm(t1, t2); % xspadd(t1, t2, 2 * y0 * y1); % xspadd(t1, t2, 2 * x0 * x1); % xspadd(t1, t2, y0 * y0); % xspadd(t1, t2, x0 * x0); % xnorm(t1, t2); It was too hard to turn the above into this without using the macros. Now all the multiplications are as exact as possible, and extra precision is used for all the additions (this mattered even for the first 2 terms). Terms should be added from the smallest to the largest. This happens in most cases and some bits are lost when it doesn't. % #endif % t3 = t2 / t1; % /* % * |t3| ~< 2**-22 since we work with 24 extra bits of precision, so % * log1p(t3) can be evaluated with about 13 extra bits of precision % * using 2 terms of its power series. But there are complexities % * to avoid underflow. % */ Complexities to avoid underflow incomplete and not here yet. Comment otherwise anachronistic/anaspac(sp?)istic. 22 and 13 are for the float version. The final xnorm() step (to maximize accuracy) ensures that |t3| < 2**-24 precisely for floats (half an ulp). 2**-53 for doubles. % return (cpack((t3 - t3*0.5*t3 + log(t1)) * 0.5, atan2(y, x))); The second term for log1p(t3) is probably nonsense. We lose by inaccuracies in t3 itself. % } % % float complex % clogf(float complex z) % { This is a routine translation. It duplicates too many comments. Only this has been tested much with the latest accuracy fixes. 
% 	float x, y, h, t1, t2, t3;
% 	float ax, ay, x0, y0, x1, y1;
% 	uint32_t hx, hy;
%
% 	x = crealf(z);
% 	y = cimagf(z);
%
% 	/* Handle NaNs using the general formula to mix them right. */
% 	if (x != x || y != y)
% 		return (cpack(log(hypot(x, y)), atan2(y, x)));

Oops, copied too much -- forgot to add f's.

Bruce

[Attachment: cplex.c (TEXT/PLAIN, base64-encoded); encoded contents omitted]

From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:10:45 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 611AA106566B for ; Sun, 12 Aug 2012 23:10:45 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id E9AE58FC1E for ; Sun, 12 Aug 2012 23:10:44 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CNAiGJ075834 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:10:44 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CNAcaS021649 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:10:38 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CNAcwV021648 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:10:38 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:10:38 +1000 Resent-Message-ID: <20120812231038.GA20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by
server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6N4ChLK011175 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 23 Jul 2012 14:12:43 +1000 (EST) (envelope-from brde@optusnet.com.au) Received: from mail36.syd.optusnet.com.au (mail36.syd.optusnet.com.au [211.29.133.76]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q6N4ChfV010875 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 23 Jul 2012 14:12:43 +1000 (EST) (envelope-from brde@optusnet.com.au) Received: from c122-106-171-246.carlnfd1.nsw.optusnet.com.au (c122-106-171-246.carlnfd1.nsw.optusnet.com.au [122.106.171.246]) by mail36.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id q6N4CQQg013259 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Mon, 23 Jul 2012 14:12:28 +1000 From: Bruce Evans Mail-Followup-To: freebsd-numerics@freebsd.org X-X-Sender: bde@besplex.bde.org To: Stephen Montgomery-Smith In-Reply-To: <500C5EE5.4090602@missouri.edu> Message-ID: <20120723131233.U1189@besplex.bde.org> References: <20120714120432.GA70706@server.rulingia.com> <20120717084457.U3890@besplex.bde.org> <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717200931.U6624@besplex.bde.org> <5006D13D.2080702@missouri.edu> <20120718205625.GA409@troutmask.apl.washington.edu> <500725F2.7060603@missouri.edu> <20120719025345.GA1376@troutmask.apl.washington.edu> <50077987.1080307@missouri.edu> <20120719032706.GA1558@troutmask.apl.washington.edu> <5007826D.7060806@missouri.edu> <5007AD41.9070000@missouri.edu> <20120719205347.T2601@besplex.bde.org> <50084322.7020401@missouri.edu> <20120720035001.W4053@besplex.bde.org> <50085441.4090305@missouri.edu> <20120720162953.N2162@besplex.bde.org> <20120720184114.B2790@besplex.bde.org> <50095CDE.4050507@missouri.edu> <20120723044308.X6145@besplex.bde.org> <500C5EE5.4090602@missouri.edu> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Mailman-Approved-At: Sun, 12 Aug 2012 
23:56:00 +0000 Cc: Diane Bruce , Bruce Evans , John Baldwin , David Chisnall , Bruce Evans , Steve Kargl , David Schultz , Peter Jeremy , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:10:45 -0000 X-Original-Date: Mon, 23 Jul 2012 14:12:26 +1000 (EST) X-List-Received-Date: Sun, 12 Aug 2012 23:10:45 -0000

On Sun, 22 Jul 2012, Stephen Montgomery-Smith wrote:

> But I will say that your latest version of clog doesn't do as well as mine
> with this input:
>
> x = unur_sample_cont(gen);
> y = unur_sample_cont(gen);
> h = hypot(x,y);
> x = x/h;
> y = y/h;
>
> I was able to get ULPs less than 2.  Your program gets ULPs more like up to
> 4000.

I may have broken the double version when working mostly on the float
version recently.  What are the actual x and y?  I'm not set up to use
mpfr.

Since the float version gets errors of 4096 ulps (12 bits wrong), the
double version is sure to get [much more than] 12 + (53-24) = 41 bits
wrong.  That is 2 tera ulps.  Not noticing such enormous errors indicates
that the problematic cases haven't been tested.

I think you are right that it needs more like tripled double precision --
with merely doubled double precision, it can probably get all 53 mantissa
bits and the sign bit wrong too (sign of (|z|^2 - 1)).  That is total loss
of precision (TLOSS), and should be handled by returning NaN.  Sign errors
are especially interesting with complex functions and even for real log()
applied to a real function, since they may change the branch.  I got TLOSS
including sign errors in the loghypotf() result in an intermediate version
due to bugs in the doubling of float precision.  Before the attempted
doubling, TLOSS might have been the usual case for z near 1!
> I have to say that I consider a ULP of 4000 under these very extreme
> circumstances to be acceptable.  Definitely acceptable if the code goes a
> whole lot faster than code that has a ULP of less than 2.

"An ULP of 4000" is unusual terminology.  An ulp is a unit, not a count.

I haven't figured out how to cut down the amount of mail generated by
this thread.  Sorry to add to it :-).

> On 07/22/2012 02:29 PM, Bruce Evans wrote:
>> Replying again to this...

Top posting is one way :-).

Bruce

From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:10:56 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 12152106566B for ; Sun, 12 Aug 2012 23:10:56 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id 728A98FC08 for ; Sun, 12 Aug 2012 23:10:55 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CNAtm4075844 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:10:55 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CNAm9B021665 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:10:49 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CNAmM7021664 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:10:48 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:10:48 +1000 Resent-Message-ID: <20120812231048.GC20453@server.rulingia.com> Resent-To:
freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6N6ONBn012304 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 23 Jul 2012 16:24:23 +1000 (EST) (envelope-from brde@optusnet.com.au) Received: from mail34.syd.optusnet.com.au (mail34.syd.optusnet.com.au [211.29.133.218]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q6N6ONJk011191 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 23 Jul 2012 16:24:23 +1000 (EST) (envelope-from brde@optusnet.com.au) Received: from c122-106-171-246.carlnfd1.nsw.optusnet.com.au (c122-106-171-246.carlnfd1.nsw.optusnet.com.au [122.106.171.246]) by mail34.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id q6N6O3Ce002338 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Mon, 23 Jul 2012 16:24:04 +1000 From: Bruce Evans Mail-Followup-To: freebsd-numerics@freebsd.org X-X-Sender: bde@besplex.bde.org To: Stephen Montgomery-Smith In-Reply-To: <500CD87D.9060804@missouri.edu> Message-ID: <20120723154108.O1647@besplex.bde.org> References: <20120714120432.GA70706@server.rulingia.com> <20120717084457.U3890@besplex.bde.org> <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717200931.U6624@besplex.bde.org> <5006D13D.2080702@missouri.edu> <20120718205625.GA409@troutmask.apl.washington.edu> <500725F2.7060603@missouri.edu> <20120719025345.GA1376@troutmask.apl.washington.edu> <50077987.1080307@missouri.edu> <20120719032706.GA1558@troutmask.apl.washington.edu> <5007826D.7060806@missouri.edu> <5007AD41.9070000@missouri.edu> <20120719205347.T2601@besplex.bde.org> <50084322.7020401@missouri.edu> <20120720035001.W4053@besplex.bde.org> <50085441.4090305@missouri.edu> <20120720162953.N2162@besplex.bde.org> <20120720184114.B2790@besplex.bde.org> <50095CDE.4050507@missouri.edu> <20120723044308.X6145@besplex.bde.org> <500C5EE5.4090602@missouri.edu> 
<20120723131233.U1189@besplex.bde.org> <500CD87D.9060804@missouri.edu> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Mailman-Approved-At: Sun, 12 Aug 2012 23:56:00 +0000 Cc: Diane Bruce , Bruce Evans , John Baldwin , David Chisnall , Bruce Evans , Steve Kargl , David Schultz , Peter Jeremy , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:10:56 -0000 X-Original-Date: Mon, 23 Jul 2012 16:24:03 +1000 (EST) X-List-Received-Date: Sun, 12 Aug 2012 23:10:56 -0000

On Sun, 22 Jul 2012, Stephen Montgomery-Smith wrote:

> On 07/22/2012 11:12 PM, Bruce Evans wrote:
>> On Sun, 22 Jul 2012, Stephen Montgomery-Smith wrote:
>>
>>> But I will say that your latest version of clog doesn't do as well as
>>> mine with this input:
>>>
>>> x = unur_sample_cont(gen);
>>> y = unur_sample_cont(gen);
>>> h = hypot(x,y);
>>> x = x/h;
>>> y = y/h;
>>>
>>> I was able to get ULPs less than 2.  Your program gets ULPs more like
>>> up to 4000.
>> ...
>> What are the actual x and y?  I'm not set up to use mpfr.
>
> The code segment I showed didn't use mpfr.  It used unuran.  Basically
> I am generating random numbers uniformly distributed on the circle |z|=1.
> You could also do it using
> x = cos(t)
> y = sin(t)
> where t is a random real number in the interval [0,2 pi].

I see.  You are generating |z| near 1 as a side effect of the inaccuracies
in sin^2 + cos^2 == 1 :-).

I was thinking of generating such numbers using sqrt().  sqrt() is
perfectly rounded, so this gives more control.  Start with x in
[1/sqrt(2), 1).  (I would first try stepping (uniformly) through all of
float space.)  Let y = sqrt(1 - x^2).  Then x^2 + y^2 should be 1, but due
to inaccuracies it won't be.
Since sqrt(2) is irrational and sqrt() is perfectly rounded, in
round-to-nearest mode y should always be too large or too small by less
than half an ulp.  x^2 is imprecise too, so the error for y may be larger.
Check a few values on both sides to find the correctly rounded value y.
Then this y, or the `nextafter' value on one side of it, makes x^2 + y^2
(in infinite precision) as close as possible to 1 for this x (by
monotonicity).

I expect the differences to have a distribution that is somewhat uniform
in the mantissa bits, so that the bad cases turn up fast with almost any
distribution of x, but worst cases never turn up with blind nonexhaustive
testing (testing twice as many cases tends to double the worst case error
measured in ulps).  There is no obvious reason why the worst cases for the
size of |x^2 + y^2 - 1| should only be near x = 1.  However, as x moves
away from 1 towards 1/sqrt(2), it becomes easier to calculate x^2 + y^2
accurately enough using only doubled precision.

> I have been putting a lot of work into casinh, casin, cacosh and cacos,
> getting the branches correct.  That has exhausted me.

I got exhausted with just clog :-).
Bruce From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:11:08 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 7963D1065675 for ; Sun, 12 Aug 2012 23:11:08 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id F283C8FC15 for ; Sun, 12 Aug 2012 23:11:07 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CNB7AP075866 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:11:08 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CNB1ZZ021700 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:11:01 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CNB1Nl021699 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:11:01 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:11:01 +1000 Resent-Message-ID: <20120812231101.GF20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6KIr4Oq058321 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Sat, 21 Jul 2012 04:53:04 +1000 (EST) (envelope-from brde@optusnet.com.au) Received: from mail26.syd.optusnet.com.au (mail26.syd.optusnet.com.au [211.29.133.167]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q6KIr49H094081 (version=TLSv1/SSLv3 
cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Sat, 21 Jul 2012 04:53:04 +1000 (EST) (envelope-from brde@optusnet.com.au) Received: from c122-106-171-246.carlnfd1.nsw.optusnet.com.au (c122-106-171-246.carlnfd1.nsw.optusnet.com.au [122.106.171.246]) by mail26.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id q6KIqhvu007254 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Sat, 21 Jul 2012 04:52:45 +1000 From: Bruce Evans Mail-Followup-To: freebsd-numerics@freebsd.org X-X-Sender: bde@besplex.bde.org To: Stephen Montgomery-Smith In-Reply-To: <50097128.6030405@missouri.edu> Message-ID: <20120721032448.X5744@besplex.bde.org> References: <20120714120432.GA70706@server.rulingia.com> <20120717084457.U3890@besplex.bde.org> <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717200931.U6624@besplex.bde.org> <5006D13D.2080702@missouri.edu> <20120718205625.GA409@troutmask.apl.washington.edu> <500725F2.7060603@missouri.edu> <20120719025345.GA1376@troutmask.apl.washington.edu> <50077987.1080307@missouri.edu> <20120719032706.GA1558@troutmask.apl.washington.edu> <5007826D.7060806@missouri.edu> <5007AD41.9070000@missouri.edu> <20120719205347.T2601@besplex.bde.org> <50084322.7020401@missouri.edu> <20120720035001.W4053@besplex.bde.org> <50085441.4090305@missouri.edu> <20120720162953.N2162@besplex.bde.org> <20120720184114.B2790@besplex.bde.org> <50097128.6030405@missouri.edu> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Mailman-Approved-At: Sun, 12 Aug 2012 23:56:00 +0000 Cc: Diane Bruce , Bruce Evans , John Baldwin , David Chisnall , Bruce Evans , Steve Kargl , David Schultz , Peter Jeremy , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:11:08 -0000 X-Original-Date: Sat, 21 Jul 2012 04:52:43 +1000 (EST) X-List-Received-Date: Sun, 12 Aug 2012 23:11:08 -0000

On Fri, 20 Jul 2012, Stephen Montgomery-Smith wrote:

> On 07/20/2012 04:19 AM, Bruce Evans wrote:
>> % x0 = ldexp(floor(ldexp(x, 24)), -24);
>> % x1 = x - x0;
>> % y0 = ldexp(floor(ldexp(y, 24)), -24);
>> % y1 = y - y0;
>>
>> This has a chance of working iff the bound away from 1 is something like
>> 2**-24.  Otherwise, multiplying by 2**24 and flooring a positive value
>> will just produce 0.  2**-24 seems much too small a bound.  My test
>> coverage is not wide enough to hit many bad cases.
>
> This is meant to cover a situation where x = cos(t) and y = sin(t) for
> some t not a multiple of PI/2.
>
> Now, hypot(x,y) will be 1, but only to within machine precision, i.e. an
> error of about 1e-17.

Actually more like DBL_MIN = 2e-308 for the real part (5e-324 for the
smallest denormal).  If you are sloppy and have an error of 1e-17, then
the relative error is a factor of 2e304 :-).  In ulps I only saw errors
below 1e20 ulps.

> So log(hypot(x,y)) will be about 1e-17.  The true answer being 0, the ULP
> will be infinite.

The scaling for a correctly rounded result of 0 is unclear.  But when the
correctly rounded result is the smallest denormal, the scaling is clear:
a result of 0 is off by 1 ulp, and a result of 1e-17 is off by 2e304 ulps
:-).  Both with a factor of 2 of fuzziness for the limited precision of
the result.

> BUT (and this goes with Goldberg's paper when he considers the quadratic
> formula when the quadratic equation has nearly equal roots), suppose
>
> x = (double)(cos(t))
>
> that is, x is not exactly cos(t), but it is the number "cos(t) written in
> IEEE double precision".  Similarly for y.  That is, even though the
> formula that produces x and y isn't exact, let's pretend that x and y are
> exact.
>
> Again log(hypot(x,y)) will be about 1e-17.
> But the true answer will also be about 1e-17.  But they won't be the
> same, and the ULP will be about 1e17.
>
> What my formula does is deal with the second case, but reduce the ULP to
> about 1e8!  That is, if x and y are exact numbers, and it so happens that
> hypot(x,y) is very close to 1, my method will get you about 8 extra
> digits of accuracy.

It was still giving errors of 1e17 (maybe 1e304) ulps for me.

> Now you have special formulas that handle the cases when z is close to
> 1, -1, I and -I.

That was my old version.  Now I use your formula in most cases.  It seems
to give 300 or so extra digits after debugging it.  I still need the
special formula for the smallest denormal result.  That may be the only
case (when y*y/2 == 0 instead of the smallest denormal).

> Earlier this morning I sent you a formula, which I think might be
> slightly more accurate than yours, for when z is close to 1.  I think
> similar formulas can be produced for when z is close to -1, I and -I.

I think this only gives a small error relative to |log(z)|, so it gives a
huge error relative to real(log(z)) when that is nearly 0.  Indeed, the
error increased from 1 ulp to 2e16 ulps.  It can't handle the case of the
smallest denormal: z = 1 + I*y where y*y/2 ~= smallest denormal.  Then
|z - 1| is |y| which is about 1e-162, and log(|z|) is about y*y/2 =
smallest denormal.  Denormals only go a few powers of 10 lower here.

This magic 2e16 is essentially 2**(DBL_MANT_DIG + 1).  Errors in ulps are
never of the order of 1e304 or 1e162.

> To get ULP of about 1 when x and y are exact, and it happens that
> hypot(x,y) is close to 1, but z is not close to 1, -1, I or -I, would
> require, I think, hypot(x,y)-1 being computed using double double
> precision (i.e. a mantissa of 108 bits), and then feeding this into
> log1p.
It does need about that, but this is routine for hi-lo decompositions,
and you claimed to already have it :-) :

@ x0 and y0 are good approximations to x and y, but have their bits trimmed
@ so that double precision floating point is capable of calculating
@ x0*x0 + y0*y0 - 1 exactly. */

Bruce

From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:11:18 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 12F791065672 for ; Sun, 12 Aug 2012 23:11:18 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id 8BE3B8FC18 for ; Sun, 12 Aug 2012 23:11:17 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CNBHdr075875 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:11:17 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CNBBkR021721 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:11:11 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CNBBGo021720 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:11:11 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:11:11 +1000 Resent-Message-ID: <20120812231111.GH20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6L3Ylxh063492 (version=TLSv1/SSLv3
cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Sat, 21 Jul 2012 13:34:47 +1000 (EST) (envelope-from brde@optusnet.com.au) Received: from mail34.syd.optusnet.com.au (mail34.syd.optusnet.com.au [211.29.133.218]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q6L3Ykb0095439 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Sat, 21 Jul 2012 13:34:46 +1000 (EST) (envelope-from brde@optusnet.com.au) Received: from c122-106-171-246.carlnfd1.nsw.optusnet.com.au (c122-106-171-246.carlnfd1.nsw.optusnet.com.au [122.106.171.246]) by mail34.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id q6L3YW2C028811 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Sat, 21 Jul 2012 13:34:33 +1000 From: Bruce Evans Mail-Followup-To: freebsd-numerics@freebsd.org X-X-Sender: bde@besplex.bde.org To: Stephen Montgomery-Smith In-Reply-To: <5009BD6C.9050301@missouri.edu> Message-ID: <20120721123522.T877@besplex.bde.org> References: <20120714120432.GA70706@server.rulingia.com> <20120717084457.U3890@besplex.bde.org> <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717200931.U6624@besplex.bde.org> <5006D13D.2080702@missouri.edu> <20120718205625.GA409@troutmask.apl.washington.edu> <500725F2.7060603@missouri.edu> <20120719025345.GA1376@troutmask.apl.washington.edu> <50077987.1080307@missouri.edu> <20120719032706.GA1558@troutmask.apl.washington.edu> <5007826D.7060806@missouri.edu> <5007AD41.9070000@missouri.edu> <20120719205347.T2601@besplex.bde.org> <50084322.7020401@missouri.edu> <20120720035001.W4053@besplex.bde.org> <50085441.4090305@missouri.edu> <20120720162953.N2162@besplex.bde.org> <20120720184114.B2790@besplex.bde.org> <50097128.6030405@missouri.edu> <20120721032448.X5744@besplex.bde.org> <5009BD6C.9050301@missouri.edu> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Mailman-Approved-At: Sun, 12 Aug 2012 23:56:00 +0000 Cc: Diane Bruce , Bruce Evans , John Baldwin , David Chisnall , Bruce Evans , 
Steve Kargl , David Schultz , Peter Jeremy , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:11:18 -0000 X-Original-Date: Sat, 21 Jul 2012 13:34:32 +1000 (EST) X-List-Received-Date: Sun, 12 Aug 2012 23:11:18 -0000

On Fri, 20 Jul 2012, Stephen Montgomery-Smith wrote:

> Bruce, with both of us working at the same time on clog, it is getting hard
> for me to follow. The version I sent this morning is the last change I made.
>
> How about if you become the owner of the code for a while. When you are
> finished, send it back to me, and I will look over everything you have done.
> I won't work on it until then. This works for me in other ways too, because
> my life is very busy at the moment.

I'd prefer you (or Someone Else) to keep working on it. I just plugged it
into my test framework and started zapping errors... (I need to make my
test framework easier to set up so that I don't have any investment in
not seeing the errors.)

> If I do work on code, it will be on casinh/casin. I am looking over the
> paper by Hull et al, and I am learning a lot from it. One thing I did

It will be a larger task. I don't plan to do it (better not look at the
paper :-). I looked at one of Kahan's old papers about this. For not
seeing these papers, it helps that they are behind paywalls. I only saw
the Kahan paper because someone sent me an obtained copy.

I just remembered that log1p(x) has essentially the same problems with
underflow that we are seeing for clog():

- log(1 + x) for small x. This is analytic with a zero at 0. Any such
  function shares the following numeric property: you expand it as
  P1*x + P2*x^2 + ... Even the x^2 term in this underflows for tiny x,
  so you must not evaluate it.
  You must just return P1*x (with inexact if x != 0). There is a
  threshold below which this must be done to avoid underflow. There is
  a not much higher threshold above which this must not be done since
  it is too inaccurate. All functions in fdlibm try to be careful with
  these thresholds, and most succeed. I broke some of them by invalid
  optimizations to avoid branches, but learned better.

- log(1 + x) for medium x. It has no direct problems with underflow or
  overflow. We have to be careful not to introduce underflow problems
  by doing things like evaluating x^100 where x is small or by writing
  x as hi+lo and evaluating lo^10. Naive code can produce the former
  problem by using too many terms in a power series. hi+lo
  decompositions are fairly immune to such problems, since in double
  precision the lo part is at most 2**54 times smaller than the hi part
  (usually about 2**26 times smaller). Underflow can never occur in the
  decomposition (except possibly in unusual rounding modes), and the lo
  part can be raised to a small power provided the original value is
  not too tiny.

- log(1 + x) for large x. Think of it as log(x + 1), where x and 1 are
  the hi and lo parts of a decomposition. You scale this to 1 + 1/x,
  where 1 and 1/x are hi and lo parts. This is not a hi+lo
  decomposition of any representable number, since the whole point of
  log1p() is that 1+x cannot be represented exactly. Now the lo part
  can be up to about 2**1023 times smaller than the hi part, so even
  squaring it can underflow. The situation for 1+1/x is essentially the
  same as for 1+x, except there is a scale factor that gives extra
  sources of inaccuracies:

      log(1 + x) = log(x + 1) = log(x * (1 + 1/x)) = log(x) + log(1 + 1/x).

  For the log(1 + 1/x) part, there is a threshold (for x) above which
  1/x must not be squared since it would underflow, and a not much
  lower threshold below which the 1/x approximation must not be used
  since it is too inaccurate. It is technically easier to approximate
  log(1 + 1/x) by 0 instead of 1/x and use the threshold for that.

For clog(), we need to evaluate log(x*x + y*y). This is harder than
log(1 + x) since the first additive term is not constant and there are
multiplications. We should first reduce so that |x| >= |y|. If |x| is 1
and y is small, we now have essentially the same setup as for log(1 + x)
for small x. My initial quick fix was only for this case. In the general
case, we should evaluate x*x + y*y as hi+lo to great accuracy (we use
about 24 extra bits, and this seems to be enough). Then log(hi + lo) has
essentially the same setup as log(x + 1) for large x, after we pull out
the hi factor from each:

    log(x + 1)   = log(x)  + log(1 + 1/x)
    log(hi + lo) = log(hi) + log(1 + lo/hi).

We may or may not end up with a lo/hi term that causes underflow. There
is the additional problem that the lo/hi term is itself a square
(essentially (y/x)^2), so it may already have underflowed. Earlier
special cases in our algorithm are apparently enough to avoid underflow
for the general case.
Bruce From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:11:30 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 23373106566C for ; Sun, 12 Aug 2012 23:11:30 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id AA4B78FC12 for ; Sun, 12 Aug 2012 23:11:29 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CNBTZu075886 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:11:29 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CNBMuc021745 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:11:23 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CNBMxk021744 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:11:22 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:11:22 +1000 Resent-Message-ID: <20120812231122.GK20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6L8KBJm005198 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Sat, 21 Jul 2012 18:20:12 +1000 (EST) (envelope-from brde@optusnet.com.au) Received: from mail08.syd.optusnet.com.au (mail08.syd.optusnet.com.au [211.29.132.189]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q6L8KBDH096067 (version=TLSv1/SSLv3 
cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Sat, 21 Jul 2012 18:20:11 +1000 (EST) (envelope-from brde@optusnet.com.au) Received: from c122-106-171-246.carlnfd1.nsw.optusnet.com.au (c122-106-171-246.carlnfd1.nsw.optusnet.com.au [122.106.171.246]) by mail08.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id q6L8Jntq020790 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Sat, 21 Jul 2012 18:19:51 +1000 From: Bruce Evans Mail-Followup-To: freebsd-numerics@freebsd.org X-X-Sender: bde@besplex.bde.org To: Stephen Montgomery-Smith In-Reply-To: <500A2565.9090009@missouri.edu> Message-ID: <20120721181204.A1702@besplex.bde.org> References: <20120714120432.GA70706@server.rulingia.com> <20120717084457.U3890@besplex.bde.org> <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717200931.U6624@besplex.bde.org> <5006D13D.2080702@missouri.edu> <20120718205625.GA409@troutmask.apl.washington.edu> <500725F2.7060603@missouri.edu> <20120719025345.GA1376@troutmask.apl.washington.edu> <50077987.1080307@missouri.edu> <20120719032706.GA1558@troutmask.apl.washington.edu> <5007826D.7060806@missouri.edu> <5007AD41.9070000@missouri.edu> <20120719205347.T2601@besplex.bde.org> <50084322.7020401@missouri.edu> <20120720035001.W4053@besplex.bde.org> <50085441.4090305@missouri.edu> <20120720162953.N2162@besplex.bde.org> <20120720184114.B2790@besplex.bde.org> <50097128.6030405@missouri.edu> <20120721032448.X5744@besplex.bde.org> <5009BD6C.9050301@missouri.edu> <20120721123522.T877@besplex.bde.org> <500A2565.9090009@missouri.edu> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Mailman-Approved-At: Sun, 12 Aug 2012 23:56:00 +0000 Cc: Diane Bruce , Bruce Evans , John Baldwin , David Chisnall , Bruce Evans , Steve Kargl , David Schultz , Peter Jeremy , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: 
"Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:11:30 -0000 X-Original-Date: Sat, 21 Jul 2012 18:19:49 +1000 (EST) X-List-Received-Date: Sun, 12 Aug 2012 23:11:30 -0000 On Fri, 20 Jul 2012, Stephen Montgomery-Smith wrote: > On 07/20/2012 10:34 PM, Bruce Evans wrote: >> On Fri, 20 Jul 2012, Stephen Montgomery-Smith wrote: >> >>> Bruce, with both of us working at the same time on clog, it is getting >>> hard for me to follow. The version I sent this morning is the last >>> change I made. >>> >>> How about if you come the owner of the code for a while. When you are >>> finished, send it back to me, and I will look over everything you have >>> done. I won't work on it until then. This works for me in other ways >>> too, because my life is very busy at the moment. >> >> I'd prefer you (or Somone Else) to keep working on it. I just plugged it >> into my test framework and started zapping errors... (I need to make my >> test framework easier to set up so that I don't have any investment in >> the not seeing the errors.) > > Do you have a piece of code after you made the changes? Or did you only > record the changes in the emails you sent to me? I was hoping that you could > send me a file as an attachment, with all your suggested changes. But if I > have to go through all the emails you sent in the last few days, I guess I'll > have to do that. I keep the code in a test program and copied it to the emails. I don't want to flood this thread with more copies of it yet :-), but won't lose it. In the old emails, it has my usual markup of '% ' indentation for code and no indentation for comments so that it can easily be extracted. Diffs would be unreadable since I reformatted everything. I was hoping you would reformat your version to KNF so that diffs can be small. 
Bruce From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:13:18 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 7FA5B1065674 for ; Sun, 12 Aug 2012 23:13:18 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id 057B98FC14 for ; Sun, 12 Aug 2012 23:13:17 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CNDHYC075952 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:13:18 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CNDBFC021892 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:13:11 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CNDBu0021891 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:13:11 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:13:11 +1000 Resent-Message-ID: <20120812231311.GA20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6J7fdjA011761 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Thu, 19 Jul 2012 17:41:39 +1000 (EST) (envelope-from brde@optusnet.com.au) Received: from mail13.syd.optusnet.com.au (mail13.syd.optusnet.com.au [211.29.132.194]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q6J7fd8u079783 (version=TLSv1/SSLv3 
cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Thu, 19 Jul 2012 17:41:39 +1000 (EST) (envelope-from brde@optusnet.com.au) Received: from c122-106-171-246.carlnfd1.nsw.optusnet.com.au (c122-106-171-246.carlnfd1.nsw.optusnet.com.au [122.106.171.246]) by mail13.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id q6J7f8jj030255 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Thu, 19 Jul 2012 17:41:11 +1000 From: Bruce Evans Mail-Followup-To: freebsd-numerics@freebsd.org X-X-Sender: bde@besplex.bde.org To: Stephen Montgomery-Smith In-Reply-To: <5007826D.7060806@missouri.edu> Message-ID: <20120719164458.G1927@besplex.bde.org> References: <20120714120432.GA70706@server.rulingia.com> <20120717084457.U3890@besplex.bde.org> <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717200931.U6624@besplex.bde.org> <5006D13D.2080702@missouri.edu> <20120718205625.GA409@troutmask.apl.washington.edu> <500725F2.7060603@missouri.edu> <20120719025345.GA1376@troutmask.apl.washington.edu> <50077987.1080307@missouri.edu> <20120719032706.GA1558@troutmask.apl.washington.edu> <5007826D.7060806@missouri.edu> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Mailman-Approved-At: Sun, 12 Aug 2012 23:56:00 +0000 Cc: Diane Bruce , Steve Kargl , John Baldwin , David Chisnall , Bruce Evans , Bruce Evans , David Schultz , Peter Jeremy , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:13:18 -0000 X-Original-Date: Thu, 19 Jul 2012 17:41:08 +1000 (EST) X-List-Received-Date: Sun, 12 Aug 2012 23:13:18 -0000

On Wed, 18 Jul 2012, Stephen Montgomery-Smith wrote:

> On 07/18/2012 10:27 PM, Steve Kargl wrote:
>> On Wed, Jul 18, 2012 at 10:05:43PM -0500, Stephen Montgomery-Smith wrote:
>>> On 07/18/2012 09:53 PM, Steve Kargl wrote:
>>>>
>>>> The inexact flag will get raised by the fpu, but you need to
>>>> cause the condition. For your 'sqrt(y*y-1) = y' example,
>>>> you would do something like 'sqrt(y*y-1) = abs(y) - tiny' where
>>>> tiny is much less than abs(y). Search msun/src for inexact
>>>> (ie., grep -i inexact msun/src/*.c)

Don't worry much about the inexact flag. The overflow and underflow
flags are much more important. sqrt(y*y - 1) probably already sets the
inexact flag. An exception might be when y is 1. Then the sqrt() is
exact, but you might still want to set the inexact flag.

Typical code where you want to set the inexact flag but it doesn't
happen automatically is for sin(x) ~= x when x is tiny. Then the
correctly rounded result is x, but fdlibm is careful not to just return
x. It sets the inexact flag using an efficient trick.

A more topical example:

> if (x==0) {
> if (fabs(y)<=1) return I*asin(y);
> else return signum(y)* (
> log(fabs(y)+sqrt(y*y-1))
> + I*PI/2);

Here PI/2 is inexact, but if you just return that, then the inexact flag
might not be set. When y != 1, the sqrt() sets the inexact flag. Even
when y = 1, PI/2 may or may not set the inexact flag, depending on
whether the division is done at compile time or at runtime, and if it is
done at runtime then on whether PI's lowest bit is 1. Optimization
normally results in PI/2 being done at compile time.

fdlibm already handles almost exactly this code in e_asin.c.
fdlibm should be copied for this and many other things:

% 	GET_HIGH_WORD(hx,x);
% 	ix = hx&0x7fffffff;
% 	if(ix>= 0x3ff00000) {		/* |x|>= 1 */
% 	    u_int32_t lx;
% 	    GET_LOW_WORD(lx,x);
% 	    if(((ix-0x3ff00000)|lx)==0)
% 		    /* asin(1)=+-pi/2 with inexact */
% 		return x*pio2_hi+x*pio2_lo;

Hmm, it has several subtleties for efficiency:
- the comment is incomplete and should start with asin(+-1)
- x is +-1. We multiply by this to get the sign right
- we add up our hi and low terms for pi/2 to set the inexact flag
- we multiply each of the hi and lo terms before adding them up to avoid
  optimizations that would result in not setting the inexact flag. The
  following wouldn't work:
  (1) x * pio2. x is precisely +-1, so this never sets the inexact flag
  (2) x * (pio2_hi + pio2_lo). The addition might be and usually is done
      at compile time, so the result is usually the same as (1). We could
      avoid this by making one of the pio2 terms volatile. This should
      give better object code here but worse elsewhere. We can avoid it
      being worse elsewhere by using a separate term for use here. But
      the fdlibm implementors stopped optimizing before this point. This
      is only 1 very special case and not worth optimizing more than it
      already is.

fdlibm uses at least 3 other methods for setting the inexact flag. It
tries to choose the optimal method using variables with known values
like x = +-1 above. Often x is known to be tiny or 0 and it can use
(1 + x) to set the inexact flag iff x is != 0. Another, stranger, method
is to test if ((int)x == 0). This sets the inexact flag if x is tiny and
|x| < 1. This looks worse than the previous method, but it often works
better because with (1 + x) it can be hard to actually use the result so
that it doesn't get optimized away (the integer conversion method
depends on the compiler not being smart enough to know that the result
of the test is always `true', so that it cannot be optimized away).
Hardware improvements (better branch prediction) resulted in the integer
conversion method working even better than when fdlibm preferred it in
1992.

% 	    return (x-x)/(x-x);		/* asin(|x|>1) is NaN */
% 	} else if (ix<0x3fe00000) {	/* |x|<0.5 */

>>> Couldn't you do this instead?
>>>
>>> #include <fenv.h>
>>>
>>> feraiseexcept(FE_INEXACT)
>>
>> I haven't checked, but I suspect you're looking at a speed
>> issue. It's faster to let the hardware raise the flag.
>> It seems that libm only uses the above in the fused multiply-add
>> code:

Expect feraiseexcept() to be 100-200 times as slow as fdlibm methods.

>> laptop:kargl[206] grep feraise src/*c
>> src/s_fma.c: feraiseexcept(FE_INEXACT);
>> src/s_fma.c: feraiseexcept(FE_UNDERFLOW);
>> src/s_fmal.c: feraiseexcept(FE_INEXACT);
>> src/s_fmal.c: feraiseexcept(FE_UNDERFLOW);
>> src/s_lround.c: feraiseexcept(FE_INVALID);

FreeBSD (now non-fdlibm parts) generally uses fenv stuff iff the
standard explicitly requires particular rounding or exceptions. This
normally makes the functions that use it unusable. The fma() function is
unusable even when it is in hardware. I get silly results like the
following trying to optimize things using fma() on ia64:
- fma instruction in hardware takes about 1 cycle. The compiler
  generates it if you write x*y + z
- portable code that wants a correctly rounded result can't use x*y + z.
  My multi-precision code for one case takes 10-20 cycles (it depends on
  x, y and z being in a restricted range or constant)
- fma(x, y, z) is portable, but takes about 50 cycles (49 for function
  call overhead and 1 to do the work). This is when fma is in hardware,
  and should be fixable by inlining fma.
- if fma() is emulated, then it probably takes hundreds or thousands of
  cycles. Hundreds for each feraiseexcept(), though it probably doesn't
  always call that. My use doesn't care about inexact, and the others
  can't happen.
> Still, I think I will use the feraiseexcept function in clog, because speed > isn't an issue when nans are involved. And it does make the code less > obscure. It's OK for development. Not so OK for testing. My tests cover NaNs too, and try not to have special knowledge of exceptional cases, so if the NaN case takes many of times longer than the usual case it will slow down the tests significantly. Bruce From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:13:46 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id ECC5A106566B for ; Sun, 12 Aug 2012 23:13:45 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id 720888FC17 for ; Sun, 12 Aug 2012 23:13:45 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CNDjlQ075977 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:13:45 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CNDcnQ021937 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:13:38 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CNDcJV021936 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:13:38 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:13:38 +1000 Resent-Message-ID: <20120812231338.GD20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) 
by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6J6cJUx011155 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Thu, 19 Jul 2012 16:38:19 +1000 (EST) (envelope-from brde@optusnet.com.au) Received: from mail06.syd.optusnet.com.au (mail06.syd.optusnet.com.au [211.29.132.187]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q6J6cJmp079623 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Thu, 19 Jul 2012 16:38:19 +1000 (EST) (envelope-from brde@optusnet.com.au) Received: from c122-106-171-246.carlnfd1.nsw.optusnet.com.au (c122-106-171-246.carlnfd1.nsw.optusnet.com.au [122.106.171.246]) by mail06.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id q6J6bqRG014261 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Thu, 19 Jul 2012 16:37:53 +1000 From: Bruce Evans Mail-Followup-To: freebsd-numerics@freebsd.org X-X-Sender: bde@besplex.bde.org To: Stephen Montgomery-Smith In-Reply-To: <5006D13D.2080702@missouri.edu> Message-ID: <20120719144432.N1596@besplex.bde.org> References: <20120529045612.GB4445@server.rulingia.com> <20120711223247.GA9964@troutmask.apl.washington.edu> <20120713114100.GB83006@server.rulingia.com> <201207130818.38535.jhb@freebsd.org> <9EB2DA4F-19D7-4BA5-8811-D9451CB1D907@theravensnest.org> <20120713155805.GC81965@zim.MIT.EDU> <20120714120432.GA70706@server.rulingia.com> <20120717084457.U3890@besplex.bde.org> <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717200931.U6624@besplex.bde.org> <5006D13D.2080702@missouri.edu> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Mailman-Approved-At: Sun, 12 Aug 2012 23:56:00 +0000 Cc: Diane Bruce , Steve Kargl , John Baldwin , David Chisnall , Bruce Evans , Bruce Evans , David Schultz , Peter Jeremy , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high 
quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:13:46 -0000 X-Original-Date: Thu, 19 Jul 2012 16:37:52 +1000 (EST) X-List-Received-Date: Sun, 12 Aug 2012 23:13:46 -0000

On Wed, 18 Jul 2012, Stephen Montgomery-Smith wrote:

> I went on a long road trip yesterday, so I didn't get any code written, but I
> did have a lot of thoughts about clog and casinh.
>
> First, the naive formula (here z=x+I*y)
> clog(z) = cpack(log(hypot(x,y)),atan2(y,x))
> is going to work in a lot more edge cases than one might expect. This is
> because hypot and atan2, especially atan2, already do a rather good job
> getting the edge cases right. I am thinking in particular of when x or y are
> 0 or -0, or one of them is infinity or -infinity.

Right, clog is deceptively simple. This is because it decomposes
perfectly into 2 real functions of 2 real variables and both of these
functions are standard and already implemented almost as well as
possible. ISTR das saying that it had a complicated case, but I don't
see even one. atan2() is supposed to handle all combinations of +-0.

Now I remember a potential problem. Complex functions should have only
poles and zeros, with projective infinity and "projective zero"
(= inverse of projective infinity). Real functions can and do have
affine infinities and zeros (+-Inf and +-0), with more detailed special
cases. It's just impossible to have useful, detailed special cases for
all the ways of approaching complex (projective) infinity and 0. I think
Kahan wanted projective infinity in IEEE7xx in ~1980.
Intel 8087 had both projective infinity and affine infinities, but
projective infinity didn't make it into the first IEEE7xx, and hardly
anyone understood it and it was eventually dropped from Intel FPUs (I
think it was in 80287; then in i486 it was reduced to a bit in the
control word that can never be cleared (the bit is to set affine
infinities); then in SSE the bit went away too).

However, C99 tries too hard to make complex functions reduce to real
functions when everything is purely real or purely imaginary. So most of
the special cases for +-0 and +-Inf affect complex functions (for other
directions of approaching 0 and infinity, not much is specified but you
should try to be as continuous as possible, where continuity has
delicate unclear meanings since it is related to discontinuous sign
functions).

Hopefully, the specification of imag(clog()) is that it has the same
sign behaviour as atan2(), so you can just use atan2(). The sign
conventions for both are arbitrary, but they shouldn't be gratuitously
different. You still have to check that they aren't non-gratuitously
different, because different conventions became established.

cexp() was not quite as simple as clog(). In polar coordinates for the
result it is even simpler: cexp(x, y) = (exp(x), y) (the real functions
are independent), but in affine coordinates you have to multiply real
functions and avoid underflow and overflow in the multiplications.
clog() is simpler than that since it is an addition instead of
multiplications, and the addition doesn't even mix the real functions,
but keeps them separate as real and imaginary parts. The real functions
are more complicated since they take 2 args, but they are already
implemented.

> Next, concerning casinh:
>
> On 07/17/2012 06:13 AM, Bruce Evans wrote:
>
>> I translated this to pari. There was a sign error for the log() term
>> (assuming that pari asinh() has the same branch cuts as C99):
>
> I couldn't spot which sign error Bruce had changed.
However I expect it has
> something to do with what happens when x=0 and fabs(y)>1. This is the
> reasonable choice of the branch cut. What I think the value should be is
>
> casinh(z) = cpack(
> signum(x)*log(fabs(y)+sqrt(y^2-1)),
> signum(y)*PI/2)
>
> where the value of signum(x) depends on whether x is 0 or -0.
>
> (I might add that I checked against the Mathematica ArcSinh function, and
> this does NOT follow the above rule. But the document Steve pointed me to
> says that
> casinh(conj(z)) = conj(casinh(z))
> which means that we cannot follow the Mathematica conventions.)

I had forgotten that pari doesn't support -0 at all (AFAIK). I certainly
had to change a sign to match the pari result for 0+y*I, but it was the
sign of y. Your original code seems to have y where it should have x:

> if (x==0) {
> if (fabs(y)<=1) return I*asin(y);
> else return signum(y)* (
> log(fabs(y)+sqrt(y*y-1))
> + I*PI/2);
> }

Here x is +-0 and there is no sign test for it. Pari probably doesn't
distinguish +-0, and produces a result for +0. You choose a sign that
only depends on y and on conventions for asin() when fabs(y) <= 1. Then
the choice of signs given by signum(y) is necessary for making the
imaginary part agree with the choice when fabs(y) <= 1. The sign error
that I got was for the real part:

    pari  asinh(2*I) = -1.316... + Pi/2*I
    above asinh(2*I) =  1.316... + Pi/2*I

C99 doesn't specify casinh(0+y*I) directly. However, it specifies that
casinh(x+Inf*I) = +Inf + Pi/2*I for finite positive x, so it needs your
choice of sign to be continuous for y in [1, Inf]. C99 specifies lots of
weird behaviours for various affine infinities, but all of them have
nonnegative real parts, so the pari choice of sign is not wanted.

>> The most obvious immediate difficulty in translating the above into C is
>> that y*y and z*z may overflow when the result shouldn't.
>
> This will be a lot easier than I originally expected.
When we are in
> conditions when overflow might occur, we can simply make the approximations
> sqrt(y*y-1) = y
> csqrt(z*z+1) = signum(x)*z
> because in floating point arithmetic, these will not be approximations, but
> exactly true. And I am thinking that the test I will use for when to use
> these approximations will be (y==y+1) and (z==z+1) respectively. (I would
> use (z*z==z*z+1) but that test has the overflow problem.)

The problem is that y*y overflows when casinh() doesn't. A simpler case
is that |y| never overflows for finite y, but sqrt(y*y) overflows once
|y| exceeds about sqrt(DBL_MAX). So sqrt(y*y) is quite different from
|y| in floating point, though it is the same in real arithmetic. Adding
+-1 to y*y moves the points of spurious and real overflow slightly.

> Finally, I want to tell you guys that the reason I used the code:
>
> if (x>0)
> return clog(z+csqrt(z*z+1));
> else
> return -clog(-z+csqrt(z*z+1));
>
> is this. Both formulas are mathematically exactly the same. This is true
> even if one takes into account the branch cuts for csqrt and clog. The
> difference between the two formulas is numerical errors. For example, if x<0
> and z has very large magnitude, then csqrt(z*z+1) will be very close to -z.
> In fact in floating point arithmetic, if the magnitude of z is sufficiently
> large, they will be the same.

Probably even more sophistication is needed. I don't have much idea of
the actual errors involved. At some point, someone should do an error
analysis, and/or test the worst cases to verify that the error is not
unreasonably large (say < 100 ulps).

BTW, Intel and AMD docs have always claimed errors of < 1 ulp for trig
functions, but the actual errors are multi-giga-ulps near multiples of
Pi/2 (my tests routinely find ones of 17 Gulps in float precision and
4503 Gulps in double precision).
Such errors are normal near zeros of analytic functions unless the zeros
are known to a precision of hundreds or thousands of bits and
extra-precision code with this many bits is used near them. Intel and
AMD only use ~68 bits for Pi/2 in i387 functions. FreeBSD (fdlibm) uses
the necessary thousands of bits for Pi/2. FreeBSD (fdlibm) doesn't do
this for zeros of Bessel or gamma functions, so it has large errors for
them too (I see the same errors for Bessel functions as for trig
functions, but for lgammaf I only see errors of 10 mega-ulps routinely;
after switching from the i387 trig functions, I only see errors of 2
mega-ulps for Bessel functions).

exp() has no zeros, so it doesn't have the problems of trig functions.
Or rather, it has essentially the same problems, but you agree not to
notice them by looking at it in polar coordinates:

    exp(y*I) = cos(y) + sin(y)*I

Near a zero of cos(y), sin(y) is +-1 and the relative error of exp(y*I)
is small relative to |exp(y*I)| = 1, though it may be 4503 Gulps for the
relative error of the real part relative to the real part. For cexp(),
we easily implement accurate real and imaginary parts relative to
themselves, limited mainly by the accuracy of the real functions, since
the real functions are mixed nicely. Similarly for clog(). This is too
hard for a general analytic function.

I don't know what happens with zeros of complex inverse trig functions.
I think they don't have many (like log()), but their real and imaginary
parts do, and they are too general for accurate behaviour of the real
and imaginary parts relative to themselves to fall out.

The arbitrary mediocre error bound of 100 ulps is modestly less than
4503 Gulps.
Bruce From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:13:50 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id DC036106566B for ; Sun, 12 Aug 2012 23:13:49 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id 73E998FC18 for ; Sun, 12 Aug 2012 23:13:49 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CNDnBq075978 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:13:49 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CNDgvf021948 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:13:43 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CNDgaj021947 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:13:42 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:13:42 +1000 Resent-Message-ID: <20120812231342.GE20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6J8bwC1038774 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Thu, 19 Jul 2012 18:38:00 +1000 (EST) (envelope-from brde@optusnet.com.au) Received: from mail05.syd.optusnet.com.au (mail05.syd.optusnet.com.au [211.29.132.186]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q6J8bwXU079933 (version=TLSv1/SSLv3 
cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Thu, 19 Jul 2012 18:37:58 +1000 (EST) (envelope-from brde@optusnet.com.au) Received: from c122-106-171-246.carlnfd1.nsw.optusnet.com.au (c122-106-171-246.carlnfd1.nsw.optusnet.com.au [122.106.171.246]) by mail05.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id q6J8baEr025385 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Thu, 19 Jul 2012 18:37:37 +1000 From: Bruce Evans Mail-Followup-To: freebsd-numerics@freebsd.org X-X-Sender: bde@besplex.bde.org To: Bruce Evans In-Reply-To: <20120719144432.N1596@besplex.bde.org> Message-ID: <20120719182849.T2190@besplex.bde.org> References: <20120529045612.GB4445@server.rulingia.com> <20120711223247.GA9964@troutmask.apl.washington.edu> <20120713114100.GB83006@server.rulingia.com> <201207130818.38535.jhb@freebsd.org> <9EB2DA4F-19D7-4BA5-8811-D9451CB1D907@theravensnest.org> <20120713155805.GC81965@zim.MIT.EDU> <20120714120432.GA70706@server.rulingia.com> <20120717084457.U3890@besplex.bde.org> <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717200931.U6624@besplex.bde.org> <5006D13D.2080702@missouri.edu> <20120719144432.N1596@besplex.bde.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Mailman-Approved-At: Sun, 12 Aug 2012 23:56:00 +0000 Cc: Diane Bruce , John Baldwin , David Chisnall , Stephen Montgomery-Smith , Bruce Evans , Steve Kargl , David Schultz , Peter Jeremy , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:13:50 -0000 X-Original-Date: Thu, 19 Jul 2012 18:37:36 +1000 (EST) X-List-Received-Date: Sun, 12 Aug 2012 23:13:50 -0000

On Thu, 19 Jul 2012, Bruce Evans wrote:

> On Wed, 18 Jul 2012, Stephen Montgomery-Smith wrote:
>
>> I went on a long road trip yesterday, so I didn't get any code written, but
>> I did have a lot of thoughts about clog and casinh.
>>
>> First, the naive formula (here z=x+I*y)
>> clog(z) = cpack(log(hypot(x,y)),atan2(y,x))
>> is going to work in a lot more edge cases than one might expect. This is
>> because hypot and atan2, especially atan2, already do a rather good job
>> getting the edge cases right. I am thinking in particular of when x or y
>> are 0 or -0, or one of them is infinity or -infinity.
>
> Right, clog is deceptively simple. This is because it decomposes perfectly
> into 2 real functions of 2 real variables and both of these functions are
> standard and already implemented almost as well as possible. ISTR das
> saying that it had a complicated case, but I don't see even one. atan2()

Duh, I forgot that log() must be applied to hypot(). You found the
surprisingly large inaccuracies from (too-?) simple avoidance of
overflow in hypot() soon after I first replied. Is there a problem even
without overflow? I think log contracts any errors in hypot() so there
isn't, but then why doesn't it contract any error in the overflow
avoidance?
Bruce From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:13:57 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 3C3131065673 for ; Sun, 12 Aug 2012 23:13:57 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id 9C8E58FC14 for ; Sun, 12 Aug 2012 23:13:56 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CNDuRC075987 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:13:56 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CNDo4c021964 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:13:50 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CNDoKE021963 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:13:50 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:13:49 +1000 Resent-Message-ID: <20120812231349.GG20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6K2KMM7023903 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Fri, 20 Jul 2012 12:20:23 +1000 (EST) (envelope-from brde@optusnet.com.au) Received: from mail07.syd.optusnet.com.au (mail07.syd.optusnet.com.au [211.29.132.188]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q6K2KNfl085049 (version=TLSv1/SSLv3 
cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Fri, 20 Jul 2012 12:20:23 +1000 (EST) (envelope-from brde@optusnet.com.au) Received: from c122-106-171-246.carlnfd1.nsw.optusnet.com.au (c122-106-171-246.carlnfd1.nsw.optusnet.com.au [122.106.171.246]) by mail07.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id q6K2JwWw023391 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Fri, 20 Jul 2012 12:19:59 +1000 From: Bruce Evans Mail-Followup-To: freebsd-numerics@freebsd.org X-X-Sender: bde@besplex.bde.org To: Stephen Montgomery-Smith In-Reply-To: <50083E83.9090404@missouri.edu> Message-ID: <20120720120802.F1061@besplex.bde.org> References: <20120529045612.GB4445@server.rulingia.com> <20120711223247.GA9964@troutmask.apl.washington.edu> <20120713114100.GB83006@server.rulingia.com> <201207130818.38535.jhb@freebsd.org> <9EB2DA4F-19D7-4BA5-8811-D9451CB1D907@theravensnest.org> <20120713155805.GC81965@zim.MIT.EDU> <20120714120432.GA70706@server.rulingia.com> <20120717084457.U3890@besplex.bde.org> <5004A5C7.1040405@missouri.edu> <5004DEA9.1050001@missouri.edu> <20120717200931.U6624@besplex.bde.org> <5006D13D.2080702@missouri.edu> <20120719144432.N1596@besplex.bde.org> <50083E83.9090404@missouri.edu> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Mailman-Approved-At: Sun, 12 Aug 2012 23:56:00 +0000 Cc: Diane Bruce , Steve Kargl , John Baldwin , David Chisnall , Bruce Evans , Bruce Evans , David Schultz , Peter Jeremy , Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:13:57 -0000 X-Original-Date: Fri, 20 Jul 2012 12:19:58 +1000 (EST) X-List-Received-Date: Sun, 12 Aug 2012 23:13:57 -0000

On Thu, 19 Jul 2012, Stephen Montgomery-Smith wrote:

> On 07/19/2012 01:37 AM, Bruce Evans wrote:
>> ...
>> problem. Complex functions should have only poles and zeros, with
>> projective infinity and "projective zero" (= inverse of projective
>> infinity). Real functions can and do have affine infinities and zeros
>> (+-Inf and +-0), with more detailed special cases. It's just impossible
>> to have useful, detailed special cases for all the ways of approaching
>> complex (projective) infinity and 0.
>> ...
>> sign functions). Hopefully, the specification of imag(clog()) is
>> that it has the same sign behaviour as atan2(), so you can just use
>> atan2(). The sign conventions for both are arbitrary, but they
>> shouldn't be gratuitously different. You still have to check that
>> they aren't non-gratuitously different, because different conventions
>> became established.
>
> I checked. Actually the sign conventions are not that arbitrary. But as a
> mathematician I would say they are a bit useless, e.g.
> atan2(infinity,infinity) = pi/4 = 45 degrees
> How do you know that the two infinities are the same? One could be double
> the other.

It should be NaN with projective infinity.

> If it had been up to me, there would have been finite numbers, and nan. And
> none of this -0.

I think Kahan is a mathematician, and is primarily responsible for +-0.
+-0 give poor man's branch cuts for real functions.

>> I don't know what happens with zeros of complex inverse trig functions.
>> I think they don't have many (like log()), but their real and imaginary
>> parts do, and they are too general for accurate behaviour of the real
>> and imaginary parts relative to themselves to fall out.
>
> casinh(z) is zero only when z=0, and near that point I could use Taylor's
> series (but a lot of terms would be needed because the Taylor series
> converges quite slowly).

To get accuracy near zeros, you only need to use a series method in a
very small radius. Even a linear approximation may be enough, and the
main difficulty is the linear term:

    f(z) ~= f'(z0) * (z-z0)

where z0 typically needs to be known to hundreds or thousands of bits of
precision and the subtraction must be done in this precision. f'(z0)
and the multiplication only need a couple of extra bits. This is only
easy when z0 is a dyadic rational.

> I can now see that the separate cases of the real part and imaginary parts of
> casinh being zero is going to be hard.

I won't ask for that and will measure errors relative to the absolute
value of the result.

Bruce

From owner-freebsd-numerics@FreeBSD.ORG Sun Aug 12 23:03:33 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id A2E18106564A for ; Sun, 12 Aug 2012 23:03:33 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id 3674B8FC15 for ; Sun, 12 Aug 2012 23:03:33 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN3XWe075634 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 13 Aug 2012 09:03:33 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7CN3QIO021187 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Mon, 13 Aug 2012 09:03:26 +1000 (EST) (envelope-from peter@server.rulingia.com)
Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7CN3Qs1021186 for freebsd-numerics@freebsd.org; Mon, 13 Aug 2012 09:03:26 +1000 (EST) (envelope-from peter) Resent-From: Peter Jeremy Resent-Date: Mon, 13 Aug 2012 09:03:26 +1000 Resent-Message-ID: <20120812230326.GH20453@server.rulingia.com> Resent-To: freebsd-numerics@freebsd.org Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q6R5KLXE029950 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Fri, 27 Jul 2012 15:20:21 +1000 (EST) (envelope-from db@db.net) Received: from diana.db.net (diana.db.net [66.113.102.10]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q6R5KLKo042241 for ; Fri, 27 Jul 2012 15:20:21 +1000 (EST) (envelope-from db@db.net) Received: from night.db.net (localhost [127.0.0.1]) by diana.db.net (Postfix) with ESMTP id 9EA082AA3E0; Thu, 26 Jul 2012 23:20:18 -0600 (MDT) Received: by night.db.net (Postfix, from userid 1000) id B8FD71DEA1; Fri, 27 Jul 2012 00:20:06 -0500 (EST) From: Diane Bruce Mail-Followup-To: freebsd-numerics@freebsd.org To: Stephen Montgomery-Smith Message-ID: <20120727052006.GB92860@night.db.net> References: <20120717225328.GA86902@server.rulingia.com> <20120717232740.GA95026@troutmask.apl.washington.edu> <20120718001337.GA87817@server.rulingia.com> <20120718123627.D1575@besplex.bde.org> <20120722121219.GC73662@server.rulingia.com> <500DAD41.5030104@missouri.edu> <20120724113214.G934@besplex.bde.org> <501204AD.30605@missouri.edu> <20120727032611.GB25690@server.rulingia.com> <50121124.4000002@missouri.edu> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <50121124.4000002@missouri.edu> User-Agent: Mutt/1.4.2.3i X-Mailman-Approved-At: Sun, 12 Aug 2012 23:56:11 +0000 Cc: Diane Bruce , Bruce Evans , John Baldwin , David Chisnall , Bruce Evans , Steve Kargl , David Schultz , Peter Jeremy , 
Warner Losh Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Date: Sun, 12 Aug 2012 23:03:33 -0000 X-Original-Date: Fri, 27 Jul 2012 00:20:06 -0500 X-List-Received-Date: Sun, 12 Aug 2012 23:03:33 -0000 On Thu, Jul 26, 2012 at 10:55:16PM -0500, Stephen Montgomery-Smith wrote: > On 07/26/2012 10:26 PM, Peter Jeremy wrote: > > >I've been writing a test harness to vet the special case handling of > >all the complex functions (excluding cpow so far). Basically, it's ... > >follow. > > On the subject of Linux, I tested the relative errors of the Linux > versions of clog, casinh, etc. They performed rather badly. They > really flunked the clog(z) for |z| close to 1 test. I was very curious about that. Any chance of running that test harness against the NetBSD routines? At least if we are very late we are very much better. ;-) > > As for your test program, maybe you could run some script to change the > indents to the 8-char tabs when you are done. It does sound like a > useful program, and it would be nice if it were generally available in > the FreeBSD source code. Thanks folks for all that work. Diane -- - db@FreeBSD.org db@db.net http://www.db.net/~db Nowadays tar can compress using yesterdays latest technologies! 
From owner-freebsd-numerics@FreeBSD.ORG Mon Aug 13 00:32:26 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 49A0A106567E for ; Mon, 13 Aug 2012 00:32:26 +0000 (UTC) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by mx1.freebsd.org (Postfix) with ESMTP id 0A15E8FC14 for ; Mon, 13 Aug 2012 00:32:25 +0000 (UTC) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q7D0WO6I085447 for ; Sun, 12 Aug 2012 19:32:24 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <50284B19.3000307@missouri.edu> Date: Sun, 12 Aug 2012 19:32:25 -0500 From: Stephen Montgomery-Smith User-Agent: Mozilla/5.0 (X11; Linux i686; rv:14.0) Gecko/20120714 Thunderbird/14.0 MIME-Version: 1.0 To: freebsd-numerics@freebsd.org References: <20120717225328.GA86902@server.rulingia.com> <20120717232740.GA95026@troutmask.apl.washington.edu> <20120718001337.GA87817@server.rulingia.com> <20120718123627.D1575@besplex.bde.org> <20120722121219.GC73662@server.rulingia.com> <500DAD41.5030104@missouri.edu> <20120724113214.G934@besplex.bde.org> <501204AD.30605@missouri.edu> <20120727032611.GB25690@server.rulingia.com> <50121124.4000002@missouri.edu> <20120727052006.GB92860@night.db.net> In-Reply-To: <20120727052006.GB92860@night.db.net> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: Use of C99 extra long double math functions after r236148 X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 13 Aug 2012 00:32:26 -0000 On 07/27/2012 12:20 AM, Diane Bruce wrote: > I was very curious about that. Any chance of running that test harness > against the NetBSD routines? At least if we are very late we are very > much better. ;-) I just looked at the code used in NetBSD. The code is rather simple, and no effort is made to avoid large numerical errors, nor to manage the edge cases: https://www-asim.lip6.fr/trac/netbsdtsar/browser/vendor/netbsd/5/src/lib/libm/complex/cacosh.c?rev=2 https://www-asim.lip6.fr/trac/netbsdtsar/browser/vendor/netbsd/5/src/lib/libm/complex/cacos.c?rev=2 https://www-asim.lip6.fr/trac/netbsdtsar/browser/vendor/netbsd/5/src/lib/libm/complex/casin.c?rev=2 casin is like my proposed pseudo-code, which I realize is numerically unstable if |Re(z)| <= 1 and Im(z) is close to zero. cacos(z) = Pi/2 - casin(z), which is subtracting two nearly equal quantities if z is close to 1. cacosh computes the wrong branch: it should be sign(Im(z))*I*cacos(z). clog is essentially log(cabs(z)) + I*arg(z), which can have a relative error close to infinity if |z|=1. So I think we can do better. 
From owner-freebsd-numerics@FreeBSD.ORG Mon Aug 13 16:57:50 2012 Return-Path: Delivered-To: freebsd-numerics@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 53C0C1065674 for ; Mon, 13 Aug 2012 16:57:50 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from fallbackmx10.syd.optusnet.com.au (fallbackmx10.syd.optusnet.com.au [211.29.132.251]) by mx1.freebsd.org (Postfix) with ESMTP id 131D68FC19 for ; Mon, 13 Aug 2012 16:57:48 +0000 (UTC) Received: from mail01.syd.optusnet.com.au (mail01.syd.optusnet.com.au [211.29.132.182]) by fallbackmx10.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id q7DGve2X017635 for ; Tue, 14 Aug 2012 02:57:40 +1000 Received: from c122-106-171-246.carlnfd1.nsw.optusnet.com.au (c122-106-171-246.carlnfd1.nsw.optusnet.com.au [122.106.171.246]) by mail01.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id q7DGvRjD001896 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Tue, 14 Aug 2012 02:57:32 +1000 Date: Tue, 14 Aug 2012 02:57:27 +1000 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Stephen Montgomery-Smith In-Reply-To: <5027F07E.9060409@missouri.edu> Message-ID: <20120814003614.H3692@besplex.bde.org> References: <5017111E.6060003@missouri.edu> <501C361D.4010807@missouri.edu> <20120804165555.X1231@besplex.bde.org> <501D51D7.1020101@missouri.edu> <20120805030609.R3101@besplex.bde.org> <501D9C36.2040207@missouri.edu> <20120805175106.X3574@besplex.bde.org> <501EC015.3000808@missouri.edu> <20120805191954.GA50379@troutmask.apl.washington.edu> <20120807205725.GA10572@server.rulingia.com> <20120809025220.N4114@besplex.bde.org> <5027F07E.9060409@missouri.edu> MIME-Version: 1.0 Content-Type: MULTIPART/MIXED; BOUNDARY="0-1116022701-1344877047=:3692" Cc: freebsd-numerics@FreeBSD.org Subject: Re: Complex arg-trig functions X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality 
implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 13 Aug 2012 16:57:50 -0000 This message is in MIME format. The first part should be readable text, while the remaining parts are likely unreadable without MIME-aware tools. --0-1116022701-1344877047=:3692 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed On Sun, 12 Aug 2012, Stephen Montgomery-Smith wrote: > Having brooded over the code for too many weeks, I now think I have finished > my complex arg-trig functions. I have also written versions for float and > long. So I am ready to have the code reviewed. > > http://people.freebsd.org/~stephen/ I finally tested a version of this. I only did simple comparisons (float vs double and double vs long double). The results look promising after fixing a few bugs: % amd64 float prec, on 2**12 * 2**12 args: % rcacos:max_er = 0x58460841 2.7585, avg_er = 0.317, #>=1:0.5 = 29084:255712 % rcacosh:max_er = 0x5e1e45e6 2.9412, avg_er = 0.262, #>=1:0.5 = 85868:3413684 % rcasin:max_er = 0x631b8183 3.0971, avg_er = 0.209, #>=1:0.5 = 38388:382508 % rcasinh:max_er = 0x5e1e45e6 2.9412, avg_er = 0.262, #>=1:0.5 = 85868:3413684 % rcatan:max_er = 0x51d7c47a 2.5576, avg_er = 0.290, #>=1:0.5 = 52984:318084 % rcatanh:max_er = 0x3693fccd4f 436.6246, avg_er = 0.223, #>=1:0.5 = 213124:1438280 % rclog: max_er = 0x26dfae4d 1.2148, avg_er = 0.247, #>=1:0.5 = 184:92244 % icacos:max_er = 0x5e1e45e6 2.9412, avg_er = 0.262, #>=1:0.5 = 85868:3413684 % icacosh:max_er = 0x3690000000 436.5000, avg_er = 0.317, #>=1:0.5 = 29104:255732 % icasin:max_er = 0x5e1e45e6 2.9412, avg_er = 0.262, #>=1:0.5 = 85868:3413684 % icasinh:max_er = 0x631b8183 3.0971, avg_er = 0.209, #>=1:0.5 = 38388:382508 % icatan:max_er = 0x3693fccd4f 436.6246, avg_er = 0.223, #>=1:0.5 = 213124:1438280 % icatanh:max_er = 0x51d7c47a 2.5576, avg_er = 0.290, #>=1:0.5 = 52984:318084 % iclog: max_er = 0x1fc2b4f5 0.9925, avg_er = 0.302, #>=1:0.5 = 0:349830 
%
% amd64 double prec, on 2**12 x 2**12 args:
% rcacos:max_er = 0x1b5a 3.4189, avg_er = 0.228, #>=1:0.5 = 2394:125988
% rcacosh:max_er = 0xf7d 1.9360, avg_er = 0.257, #>=1:0.5 = 612:2741860
% rcasin:max_er = 0x15c5 2.7212, avg_er = 0.113, #>=1:0.5 = 33296:99152
% rcasinh:max_er = 0xf7d 1.9360, avg_er = 0.257, #>=1:0.5 = 612:2741796
% rcatan:max_er = 0x8000000000000000 4503599627370496.0000, avg_er = 268435456.212, #>=1:0.5 = 4681:81365
% rcatanh:max_er = 0x8000000000000000 4503599627370496.0000, avg_er = 268435456.047, #>=1:0.5 = 428997:691341
% rclog: max_er = 0x704 0.8770, avg_er = 0.250, #>=1:0.5 = 0:20152
% icacos:max_er = 0xf7d 1.9360, avg_er = 0.257, #>=1:0.5 = 612:2741860
% icacosh:max_er = 0x1b5a 3.4189, avg_er = 0.228, #>=1:0.5 = 2394:125988
% icasin:max_er = 0xf7d 1.9360, avg_er = 0.257, #>=1:0.5 = 612:2741796
% icasinh:max_er = 0x15c5 2.7212, avg_er = 0.113, #>=1:0.5 = 33296:99152
% icatan:max_er = 0x8000000000000000 4503599627370496.0000, avg_er = 268435456.047, #>=1:0.5 = 428997:691341
% icatanh:max_er = 0x8000000000000000 4503599627370496.0000, avg_er = 268435456.212, #>=1:0.5 = 4681:81365
% iclog: max_er = 0x6f4 0.8691, avg_er = 0.213, #>=1:0.5 = 0:181032

rfoo is the real part of foo, etc.

2**12 x 2**12 args is not enough. Some bugs showed up only with
slightly more args, but after fixing some bugs the behaviour doesn't
seem to depend so much on the number.

The error of 436.6246 ulps has been turning up a lot. It was for a bug
in my test program for a result of 0 when the correctly rounded result
is the smallest denormal, but I thought I fixed it.

The error of 0x8000000000000000 for double precision *catan* is a sign
mismatch.
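
A note on reading the max_er columns. The hex and decimal values appear to be related by a fixed power-of-2 scaling: the hex value looks like a raw error count in units of the last place of the wider reference format, and the decimal value is the same error in ulps of the format under test. This is inferred from the numbers above and is not stated in the message; the test program itself is not shown. The helper raw_to_ulps() below is hypothetical and only reproduces that arithmetic.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>

/*
 * Convert a raw error count, in units of the last place of the wider
 * reference format, to ulps of the narrower format under test.
 * extra_bits is the number of extra mantissa bits in the reference
 * format (assumed here: 29 for float checked against double, 11 for
 * double checked against a 64-bit-mantissa long double).
 */
static double
raw_to_ulps(uint64_t raw, int extra_bits)
{
	assert(extra_bits >= 0 && extra_bits < 64);
	return ldexp((double)raw, -extra_bits);
}
```

Under that reading, 0x58460841 scaled by 2**-29 gives about 2.7585, 0x1b5a scaled by 2**-11 gives about 3.4189, and 0x8000000000000000 (a difference in the top bit only, as a sign mismatch would produce) scaled by 2**-11 gives 2**52 = 4503599627370496.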
% i386 float prec, on 2**12 x 2**12 args: % rcacos:max_er = 0x42cd19c6 2.0875, avg_er = 0.314, #>=1:0.5 = 3854:215116 % rcacosh:max_er = 0x3170e232 1.5450, avg_er = 0.254, #>=1:0.5 = 23008:3245028 % rcasin:max_er = 0x55adc0df 2.6775, avg_er = 0.208, #>=1:0.5 = 34304:353980 % rcasinh:max_er = 0x3170e232 1.5450, avg_er = 0.254, #>=1:0.5 = 23008:3245028 % rcatan:max_er = 0x3c4078ec 1.8829, avg_er = 0.284, #>=1:0.5 = 13260:190836 % rcatanh:max_er = 0x3693fccd4f 436.6246, avg_er = 0.186, #>=1:0.5 = 4796:421616 % rclog: max_er = 0x25830853 1.1722, avg_er = 0.246, #>=1:0.5 = 120:24892 % icacos:max_er = 0x3170e232 1.5450, avg_er = 0.254, #>=1:0.5 = 23008:3245028 % icacosh:max_er = 0x3690000000 436.5000, avg_er = 0.315, #>=1:0.5 = 3874:215136 % icasin:max_er = 0x3170e232 1.5450, avg_er = 0.254, #>=1:0.5 = 23008:3245028 % icasinh:max_er = 0x55adc0df 2.6775, avg_er = 0.208, #>=1:0.5 = 34304:353980 % icatan:max_er = 0x3693fccd4f 436.6246, avg_er = 0.186, #>=1:0.5 = 4796:421616 % icatanh:max_er = 0x3c4078ec 1.8829, avg_er = 0.284, #>=1:0.5 = 13260:190836 % iclog: max_er = 0x1fc2b4f5 0.9925, avg_er = 0.302, #>=1:0.5 = 0:338712 % % i386 double prec, on 2**12 x 2**12 args: % rcacos:max_er = 0x11e8 2.2383, avg_er = 0.165, #>=1:0.5 = 248:111850 % rcacosh:max_er = 0xb02 1.3760, avg_er = 0.256, #>=1:0.5 = 104:2715312 % rcasin:max_er = 0x13ce 2.4756, avg_er = 0.112, #>=1:0.5 = 5616:95060 % rcasinh:max_er = 0xb02 1.3760, avg_er = 0.256, #>=1:0.5 = 104:2715312 % rcatan:max_er = 0x9ed 1.2407, avg_er = 0.015, #>=1:0.5 = 4084:48920 % rcatanh:max_er = 0xb17 1.3862, avg_er = 0.014, #>=1:0.5 = 56:77456 % rclog: max_er = 0x704 0.8770, avg_er = 0.250, #>=1:0.5 = 0:20112 % icacos:max_er = 0xb02 1.3760, avg_er = 0.256, #>=1:0.5 = 104:2715312 % icacosh:max_er = 0x11e8 2.2383, avg_er = 0.165, #>=1:0.5 = 248:111850 % icasin:max_er = 0xb02 1.3760, avg_er = 0.256, #>=1:0.5 = 104:2715312 % icasinh:max_er = 0x13ce 2.4756, avg_er = 0.112, #>=1:0.5 = 5616:95060 % icatan:max_er = 0xb17 1.3862, avg_er = 
0.014, #>=1:0.5 = 56:77456 % icatanh:max_er = 0x9ed 1.2407, avg_er = 0.015, #>=1:0.5 = 4084:48920 % iclog: max_er = 0x6f4 0.8691, avg_er = 0.213, #>=1:0.5 = 0:181032 Note that i386 doesn't have the sign errors. However, i386 with -O0 has the sign errors for at least float precision in the same places that amd64 with -O has them for double precision. > The long versions require a logl and a log1pl, which I faked using mpfr. > > The float versions are more complicated, because FLT_EPSILON is too close to > the 4th root of FLT_MIN. It is simpler to make the float versions wrappers > for the double versions. But I wrote the float versions anyway, just in case > some purist insists that the wrapper approach is morally wrong. There are negative reasons to have the float versions unless they are not wrappers. The reasons to have non-wrappers are to test the algorithm and run faster. Fixes needed for the above test results: @ diff -c2 catrig.c~ catrig.c @ *** catrig.c~ Sun Aug 12 17:29:18 2012 @ --- catrig.c Mon Aug 13 12:07:09 2012 @ *************** @ *** 265,269 **** @ return; @ } @ ! if (y < MIN_4TH_ROOT) { @ /* @ * Avoid a possible underflow caused by y/A. For casinh this @ --- 265,269 ---- @ return; @ } @ ! if (!ISNAN(x) && y < MIN_4TH_ROOT) { @ /* @ * Avoid a possible underflow caused by y/A. For casinh this This stops the result for NaN x depending on the size of y relative to MIN_4TH_ROOT. MIN_4TH_ROOT varies with the precision, but the result shouldn't vary with the precision. The result should probably be the original NaN x quieted (unless y is also NaN). This happens naturally if you do a computation with x (except -x). However, symmetry often requires forcing or flipping signs, and we force or flip the sign even for NaNs. This change gives consistent signs for NaN results, by taking the same path in all precisions. @ *************** @ *** 408,416 **** @ @ if (ISFINITE(bx) && ISFINITE(by) && (x > RECIP_SQRT_EPSILON_100 || y > RECIP_SQRT_EPSILON_100)) { @ ! 
if (huge+x+y>one) { /* raise inexact flag */ @ ! w = clog_for_large_values(z) + M_LN2; Addition clobbers the sign of -0 in the imaginary part in some cases. I think it should clobber it in all cases, since it does for non-complex doubles (-0.0 + +0.0 is +0.0). Clobbering was observed in the following cases: - on i386 (-march=athlon64) with -O0, presumably because the addition to the imaginary part was not optimized away then - on amd64 with -O0 and -O, presumably because: - same as i386 for -O0 - with -O (no -march), float complex and double complex are represented as a vector in an SSE2 register, and it is natural to add both the components of the vector, and this may be optimal for at least float complex with the default -march. I didn't check what happens for double complex. @ if (sy == 0) @ ! return (cpack(cimag(w), -creal(w))); @ ! return (cpack(-cimag(w), creal(w))); The sign of creal(cacos()) is always 1, but this makes it +- the sign of atan2(x, y). @ } @ } @ --- 408,420 ---- @ @ if (ISFINITE(bx) && ISFINITE(by) && (x > RECIP_SQRT_EPSILON_100 || y > RECIP_SQRT_EPSILON_100)) { @ ! /* XXX following can also raise overflow */ Just note this error. @ ! if (huge+x+y>one) { /* raise inexact */ Start removing ' flag'. @ ! w = clog_for_large_values(z); @ ! /* Can't add M_LN2 to w since it should clobber -0*I. */ @ ! rx = fabs(cimag(w)); @ ! ry = creal(w) + M_LN2; @ if (sy == 0) @ ! ry = -ry; @ ! return (cpack(rx, ry)); @ } @ } Fix the above bugs. @ *************** @ *** 482,486 **** @ * but this case should happen extremely rarely. @ */ @ ! if (ay > 0.5*DBL_MAX) @ return (cpack(log(hypot(x / M_E, y / M_E)) + 1, atan2(y, x))); @ @ --- 486,490 ---- @ * but this case should happen extremely rarely. @ */ @ ! if (ax > 0.5*DBL_MAX) @ return (cpack(log(hypot(x / M_E, y / M_E)) + 1, atan2(y, x))); @ The smallest of ax and ay must be compared. I think I broke this in an early version of clog(). 
Replacing the whole function by clog() made little difference once this
was fixed.

@ diff -c2 catrigf.c~ catrigf.c
@ *** catrigf.c~  Sun Aug 12 17:00:52 2012
@ --- catrigf.c  Mon Aug 13 14:14:42 2012
@ ***************
@ *** 138,142 ****
@           return;
@       }
@ !     if (y < MIN_4TH_ROOT) {
@           *B_is_usable = 0;
@           if ((int)y==0) /* raise inexact flag */
@ --- 138,142 ----
@           return;
@       }
@ !     if (!isnan(x) && y < MIN_4TH_ROOT) {
@           *B_is_usable = 0;
@           if ((int)y==0) /* raise inexact flag */
@ ***************
@ *** 233,240 ****
@       if (isfinite(x) && isfinite(y) && (x > RECIP_SQRT_EPSILON_100 || y > RECIP_SQRT_EPSILON_100)) {
@           if (huge+x+y>one) { /* raise inexact flag */
@ !             w = clog_for_large_values(z) + M_LN2;
@               if (sy == 0)
@ !                 return (cpackf(cimagf(w), -crealf(w)));
@ !             return (cpackf(-cimagf(w), crealf(w)));
@           }
@       }
@ --- 233,242 ----
@       if (isfinite(x) && isfinite(y) && (x > RECIP_SQRT_EPSILON_100 || y > RECIP_SQRT_EPSILON_100)) {
@           if (huge+x+y>one) { /* raise inexact flag */
@ !             w = clog_for_large_values(z);
@ !             rx = fabsf(cimagf(w));
@ !             ry = crealf(w) + M_LN2;
@               if (sy == 0)
@ !                 ry = -ry;
@ !             return (cpackf(rx, ry));
@           }
@       }
@ ***************
@ *** 290,294 ****
@       }
@
@ !     if (ay > 0.5*FLT_MAX)
@           return (cpackf(logf(hypotf(x / M_E, y / M_E)) + 1, atan2f(y, x)));
@
@ --- 292,296 ----
@       }
@
@ !     if (ax > 0.5*FLT_MAX)
@           return (cpackf(logf(hypotf(x / M_E, y / M_E)) + 1, atan2f(y, x)));
@
@ diff -c2 catrigl.c~ catrigl.c
@ *** catrigl.c~  Sun Aug 12 06:54:46 2012
@ --- catrigl.c  Mon Aug 13 12:08:21 2012
@ ***************
@ *** 119,123 ****
@           return;
@       }
@ !     if (y < MIN_4TH_ROOT) {
@           *B_is_usable = 0;
@           if ((int)y==0) /* raise inexact flag */
@ --- 119,123 ----
@           return;
@       }
@ !     if (!isnan(x) && y < MIN_4TH_ROOT) {
@           *B_is_usable = 0;
@           if ((int)y==0) /* raise inexact flag */
@ ***************
@ *** 207,214 ****
@       if (isfinite(x) && isfinite(y) && (x > RECIP_SQRT_EPSILON_100 || y > RECIP_SQRT_EPSILON_100)) {
@           if (huge+x+y>one) { /* raise inexact flag */
@ !             w = clog_for_large_values(z) + L_LN2;
@               if (sy == 0)
@ !                 return (cpackl(cimagl(w), -creall(w)));
@ !             return (cpackl(-cimagl(w), creall(w)));
@           }
@       }
@ --- 207,216 ----
@       if (isfinite(x) && isfinite(y) && (x > RECIP_SQRT_EPSILON_100 || y > RECIP_SQRT_EPSILON_100)) {
@           if (huge+x+y>one) { /* raise inexact flag */
@ !             w = clog_for_large_values(z);
@ !             rx = fabsl(cimagl(w));
@ !             ry = creall(w) + M_LN2;
@               if (sy == 0)
@ !                 ry = -ry;
@ !             return (cpackl(rx, ry));
@           }
@       }
@ ***************
@ *** 264,268 ****
@       }
@
@ !     if (ay > 0.5*LDBL_MAX)
@           return (cpackl(logl(hypotl(x / L_E, y / L_E)) + 1, atan2l(y, x)));
@
@ --- 266,270 ----
@       }
@
@ !     if (ax > 0.5*LDBL_MAX)
@           return (cpackl(logl(hypotl(x / L_E, y / L_E)) + 1, atan2l(y, x)));
@

Some other bugs that I noticed:

catrigf.c:
% ...
% #define MAX_4TH_ROOT            1e9     /* approx pow(FLT_MAX,0.25) */
% #define MIN_4TH_ROOT            1e-9    /* approx pow(FLT_MIN,0.25) */
% #define SQRT_EPSILON_100        1e-6    /* approx (sqrt(FLT_EPSILON)/100) */
%                                         /* 100 is to "play it safe." */
% #define RECIP_SQRT_EPSILON_100  1e6     /* 1/SQRT_EPSILON_100 */
% #define EPSILON_100             1e-9    /* approx FLT_EPSILON/100 */
% #define RECIP_EPSILON_100       1e9     /* 1/EPSILON_100 */

These aren't float constants.  Mixing them with floats may give unwanted
extra precision and slowness.

Most of these should be hex constants, so that they are easier to
understand.  In clog() I originally used magic hex constants, but almost
everything was a power of 2 related to FOO_MAX or FOO_MANT_DIG, as above,
and it is difficult to translate from those to hex constants
(FLT_EPSILON / 128 could be done using token pasting as
((0x1p # FLT_MIN_EXP) * 2 / 128), but square and fourth roots are harder).
After changing the classification to use bits, all the comparisons with
thresholds became comparisons of exponents, and it is now trivial to take
roots by dividing the exponent.
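
The exponent-comparison idea can be sketched as follows.  This is only an
illustration, not the code from the attached clog(): exponent_of() and
too_big_to_square() are hypothetical names, and the IEEE 754 binary64
layout is assumed.  A threshold near sqrt(DBL_MAX) becomes
(DBL_MAX_EXP - 1) / 2, so taking a root is an integer division of the
exponent.

```c
#include <float.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: unbiased exponent of a non-negative double, read
 * from its bits.  Assumes IEEE 754 binary64.  Zero and denormals give
 * -1023; Inf and NaN give 1024 (== DBL_MAX_EXP).
 */
static int
exponent_of(double ax)
{
	uint64_t bits;

	memcpy(&bits, &ax, sizeof(bits));
	return ((int)((bits >> 52) & 0x7ff) - 1023);
}

/*
 * Conservative test for "ax*ax may overflow": compare the exponent with
 * half of the largest exponent instead of comparing ax with a decimal
 * approximation to sqrt(DBL_MAX).
 */
static int
too_big_to_square(double ax)
{
	return (exponent_of(ax) >= (DBL_MAX_EXP - 1) / 2);
}
```

A fourth-root threshold would divide by 4 instead of 2 in the same way.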
ld128 clogl() now works with identical code to ld80 clogl(), based on
macros in , and the differences for other precisions are minor (I could
hide them all using more macros).

% ...
% static const float
% one =  1.00000000000000000000e+00,
% huge=  1.00000000000000000000e+30;

Using variables instead of macros avoids having to put F suffixes on all
float constants.

% ...
% inline static float
% f(float a, float b, float hypot_a_b)
% {
%     if (b < 0) return (0.5 * (hypot_a_b - b));
%     if (b == 0) return (0.5*a);
%     return (0.5 * a*a / (hypot_a_b + b));
% }

There are squillions of 0.5's that should be 0.5F's and deserve being
named `half' more than 1 deserves being named `one'.

catrigl.c:
% #define MAX_4TH_ROOT            1e1200L   /* approx pow(LDBL_MAX,0.25) */
% #define MIN_4TH_ROOT            1e-1200L  /* approx pow(LDBL_MIN,0.25) */
% #define SQRT_EPSILON_100        1e-13L    /* approx (sqrtl(LDBL_EPSILON)/100) */
% #define RECIP_SQRT_EPSILON_100  1e13L     /* 1/SQRT_EPSILON_100 */
% #define EPSILON_100             1e-24L    /* approx LDBL_EPSILON/100 */
% #define RECIP_EPSILON_100       1e24L     /* 1/EPSILON_100 */

Now there is no need for a suffix on the smaller constants.  You don't use
one for the 0.5's later.

The ones related to epsilon look wrong for ld128.  LDBL_EPSILON =
2**-(LDBL_MANT_DIG - 1) is way smaller when LDBL_MANT_DIG is 113 than when
it is 64.

I'm fairly happy with the attached clog() now.  All refinements that I
tried lately give either less accuracy, lower speed or larger code.  Ones
that don't quite work are:
- write log(ax*ax + ay*ay) * 0.5 = log(ax) + log1p((ay/ax)**2) * 0.5.
  This avoids complications when ax is large or ay is small (after
  ensuring that ay/ax is not small), but tends to be slower and a bit less
  accurate.
- write ax = 2**k * bx; ay = 2**k * by;
  log(ax*ax + ay*ay) * 0.5 = k * log(2) + log(bx*bx + by*by) * 0.5.
  This gives marginally more accuracy, but tends to be slower and a bit
  more complicated.
  One complication is that bx*bx + by*by may need another scaling by a
  factor of 2 or 1/2 to make it between sqrt(2)/2 and sqrt(2).

Many changes in this version:

%     /* Avoid overflow. */
%     if (kx >= MAX_EXP - 1)
%         return (cpack(log(hypot(x * 0x1p-1022, y * 0x1p-1022)) +
%             (MAX_EXP - 2) * ln2_lo + (MAX_EXP - 2) * ln2_hi, v));
%     if (kx >= (MAX_EXP - 1) / 2)
%         return (cpack(log(hypot(x, y)), v));

1. Use integer exponents kx and ky for all classifications.
2. Get all the thresholds right.
3. Scale down to near 1 here, instead of just by a factor of 2.  This
   gives some extra accuracy for no runtime cost.

%
%     /* Reduce inaccuracies and avoid underflow when ax is denormal. */
%     if (kx <= MIN_EXP - 2)
%         return (cpack(log(hypot(x * 0x1p1023, y * 0x1p1023)) +
%             (MIN_EXP - 2) * ln2_lo + (MIN_EXP - 2) * ln2_hi, v));

4. Scale up to near 1 here, instead of just by a factor of 2.  This gives
   considerable extra accuracy for no runtime cost.

%
%     /* Avoid remaining underflows (when ax is small but not denormal). */
%     if (ky < (MIN_EXP - 1) / 2 + MANT_DIG)
%         return (cpack(log(hypot(x, y)), v));

5. Don't scale here.  The correct scale factor is ~ 2**-kx for both cases,
   but that has more overheads.  So we only scale up when necessary to
   avoid losing considerable accuracy (for denormal ax), and let log() do
   the scaling for other cases.

%     /* Calculate ax*ax and ay*ay exactly using Dekker's algorithm. */
%     t = (double)(ax * (0x1p27 + 1));
%     axh = (double)(ax - t) + t;
%     axl = ax - axh;
%     ax2h = ax * ax;
%     ax2l = axh * axh - ax2h + 2 * axh * axl + axl * axl;
%     t = (double)(ay * (0x1p27 + 1));
%     ayh = (double)(ay - t) + t;
%     ayl = ay - ayh;
%     ay2h = ay * ay;
%     ay2l = ayh * ayh - ay2h + 2 * ayh * ayl + ayl * ayl;

6. Additive splitting like I had for clogl() before doesn't work unless ax
   is near 1.  Use the correct Veldkamp algorithm after googling Dekker's
   algorithms.  ax has type double_t, so casting to double avoids compiler
   bugfeatures at some cost.
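
The splitting in item 6 can be sketched on its own as below.  This is a
minimal illustration, not the attached code: split() and square_exact()
are hypothetical names, and it assumes round-to-nearest IEEE 754 double
arithmetic evaluated in double precision, no overflow or underflow, and
no fused multiply-add contraction of x * x.

```c
/*
 * Hypothetical helper: Veldkamp splitting of x into hi + lo.  hi has at
 * most 26 significant bits, so hi * hi is exact in double precision.
 */
static void
split(double x, double *hi, double *lo)
{
	double t;

	t = (double)(x * (0x1p27 + 1));
	*hi = (double)(x - t) + t;
	*lo = x - *hi;
}

/*
 * Hypothetical helper: Dekker's exact square.  On return,
 * x * x == *sqh + *sql exactly, where *sqh is the rounded product and
 * *sql is the rounding error that a plain x * x discards.
 */
static void
square_exact(double x, double *sqh, double *sql)
{
	double hi, lo;

	split(x, &hi, &lo);
	*sqh = x * x;
	*sql = hi * hi - *sqh + 2 * hi * lo + lo * lo;
}
```

For x = 1 + 2**-30 this gives sqh = 1 + 2**-29 and sql = 2**-60, which is
the term that the rounded product loses.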
%     sh -= 1;
%     norm(sh, sl);
%     norm(ax2l, ay2l);
%     /* Briggs-Kahan algorithm (except we discard the final low term): */
%     norm(sh, ax2l);
%     norm(sl, ay2l);
%     t = ax2l + sl;
%     normF(sh, t);
%     return (cpack(log1p(ay2l + t + sh) * 0.5, v));

7. Use a full modernised Dekker's algorithm for the critical case of |z|
   very near 1 (and for many cases not so near 1).  There are 2 Veldkamp
   splittings, 4 norm() (= 2sum) calls and 2 normF() (= 2sumF) calls on
   the way here.  norm() takes 6 additions and normF() takes 3, so there
   are 30 additions just for the norm*()'s.  But this is only moderately
   slow (about the same speed as extra branches to avoid doing it all so
   much).

Bruce

[Attachment: cplex.c]
[Attachment: catrig.diff]

From owner-freebsd-numerics@FreeBSD.ORG Mon Aug 13 19:42:07 2012 Return-Path: Delivered-To: freebsd-numerics@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id EB7E9106564A for ; Mon, 13 Aug 2012 19:42:07 +0000 (UTC) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by mx1.freebsd.org (Postfix) with ESMTP id AB0788FC0C for ; Mon, 13 Aug 2012 19:42:07 +0000 (UTC) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q7DJfxpR020858; Mon, 13 Aug 2012 14:42:00 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <50295887.2010608@missouri.edu> Date: Mon, 13 Aug 2012 14:41:59 -0500 From: Stephen Montgomery-Smith User-Agent: Mozilla/5.0 (X11; Linux i686; rv:14.0) Gecko/20120714 Thunderbird/14.0 MIME-Version: 1.0 To: Bruce Evans References: <5017111E.6060003@missouri.edu> <501C361D.4010807@missouri.edu> <20120804165555.X1231@besplex.bde.org> <501D51D7.1020101@missouri.edu> <20120805030609.R3101@besplex.bde.org> <501D9C36.2040207@missouri.edu> <20120805175106.X3574@besplex.bde.org> <501EC015.3000808@missouri.edu> <20120805191954.GA50379@troutmask.apl.washington.edu> <20120807205725.GA10572@server.rulingia.com> <20120809025220.N4114@besplex.bde.org> <5027F07E.9060409@missouri.edu>
<20120814003614.H3692@besplex.bde.org> In-Reply-To: <20120814003614.H3692@besplex.bde.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-numerics@FreeBSD.org Subject: Re: Complex arg-trig functions X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 13 Aug 2012 19:42:08 -0000 On 08/13/2012 11:57 AM, Bruce Evans wrote: > I finally tested a version of this. I only did simple comparisons (float > vs double and double vs long double). The results look promising after > fixing a few bugs: Thank you very much for doing the testing, and for fixing the bugs. > > % amd64 float prec, on 2**12 * 2**12 args: > % icacosh:max_er = 0x3690000000 436.5000, avg_er = 0.317, #>=1:0.5 = > 29104:255732 > There are negative reasons to have the float versions unless they are not > wrappers. The reasons to have non-wrappers are to test the algorithm and > run faster. That large max-err for the imaginary part of icacosh for float bothers me. It means that I haven't thought it through properly. Could you send me the input values that created this error(s)? The float versions really are much harder than the double and long-double versions. And it just doesn't seem worth the effort, because who uses them when the double versions are available? In my case, not for speed. Because on my machine the float versions are slightly slower than the double version. Also, you made the comment that in the float version, all the 0.5 should become 0.5F. Two questions: 1. Doesn't the compiler do this conversion for me? 2. What is wrong with using x/2 instead of 0.5*x? You told me in a far earlier email to use 0.5*x. (Similarly in one place I have a 0.25.) 
From owner-freebsd-numerics@FreeBSD.ORG Mon Aug 13 20:11:11 2012 Return-Path: Delivered-To: freebsd-numerics@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 96145106567E for ; Mon, 13 Aug 2012 20:11:11 +0000 (UTC) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by mx1.freebsd.org (Postfix) with ESMTP id 568D38FC1B for ; Mon, 13 Aug 2012 20:11:10 +0000 (UTC) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q7DKB8fL022786; Mon, 13 Aug 2012 15:11:09 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <50295F5C.6010800@missouri.edu> Date: Mon, 13 Aug 2012 15:11:08 -0500 From: Stephen Montgomery-Smith User-Agent: Mozilla/5.0 (X11; Linux i686; rv:14.0) Gecko/20120714 Thunderbird/14.0 MIME-Version: 1.0 To: Bruce Evans References: <5017111E.6060003@missouri.edu> <501C361D.4010807@missouri.edu> <20120804165555.X1231@besplex.bde.org> <501D51D7.1020101@missouri.edu> <20120805030609.R3101@besplex.bde.org> <501D9C36.2040207@missouri.edu> <20120805175106.X3574@besplex.bde.org> <501EC015.3000808@missouri.edu> <20120805191954.GA50379@troutmask.apl.washington.edu> <20120807205725.GA10572@server.rulingia.com> <20120809025220.N4114@besplex.bde.org> <5027F07E.9060409@missouri.edu> <20120814003614.H3692@besplex.bde.org> In-Reply-To: <20120814003614.H3692@besplex.bde.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-numerics@FreeBSD.org Subject: Re: Complex arg-trig functions X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 13 Aug 2012 20:11:11 -0000 On 08/13/2012 11:57 AM, Bruce Evans wrote: > @ if (sy == 0) > @ ! return (cpack(cimag(w), -creal(w))); > @ ! return (cpack(-cimag(w), creal(w))); > > The sign of creal(cacos()) is always 1, but this makes it +- the sign > of atan2(x, y). Yes, but the sign of atan2(y,x) will always be the same as the sign of y. So the two negatives will cancel. But your code works just as well (and your code doesn't clobber the -0's in the imaginary part). > > @ } > @ } > @ --- 408,420 ---- > @ @ if (ISFINITE(bx) && ISFINITE(by) && (x > > RECIP_SQRT_EPSILON_100 || y > RECIP_SQRT_EPSILON_100)) { > @ ! /* XXX following can also raise overflow */ I don't see how the code could raise an overflow. The output of clog should always be very much less than DBL_MAX. (Originally I had clog(2*z), and that could raise an unwarranted overflow.) From owner-freebsd-numerics@FreeBSD.ORG Mon Aug 13 20:16:32 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 92C37106566B for ; Mon, 13 Aug 2012 20:16:32 +0000 (UTC) (envelope-from sgk@troutmask.apl.washington.edu) Received: from troutmask.apl.washington.edu (troutmask.apl.washington.edu [128.95.76.21]) by mx1.freebsd.org (Postfix) with ESMTP id 6ACE48FC0C for ; Mon, 13 Aug 2012 20:16:32 +0000 (UTC) Received: from troutmask.apl.washington.edu (localhost.apl.washington.edu [127.0.0.1]) by troutmask.apl.washington.edu (8.14.5/8.14.5) with ESMTP id q7DKGQ9P054218; Mon, 13 Aug 2012 13:16:26 -0700 (PDT) (envelope-from sgk@troutmask.apl.washington.edu) Received: (from sgk@localhost) by troutmask.apl.washington.edu (8.14.5/8.14.5/Submit) id q7DKGQaA054217; Mon, 13 Aug 2012 13:16:26 -0700 (PDT) (envelope-from sgk) Date: Mon, 13 Aug 2012 13:16:26 -0700 From: Steve Kargl To: Stephen Montgomery-Smith Message-ID: 
<20120813201626.GA54144@troutmask.apl.washington.edu> References: <20120805030609.R3101@besplex.bde.org> <501D9C36.2040207@missouri.edu> <20120805175106.X3574@besplex.bde.org> <501EC015.3000808@missouri.edu> <20120805191954.GA50379@troutmask.apl.washington.edu> <20120807205725.GA10572@server.rulingia.com> <20120809025220.N4114@besplex.bde.org> <5027F07E.9060409@missouri.edu> <20120814003614.H3692@besplex.bde.org> <50295887.2010608@missouri.edu> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <50295887.2010608@missouri.edu> User-Agent: Mutt/1.4.2.3i Cc: freebsd-numerics@freebsd.org, Bruce Evans Subject: Re: Complex arg-trig functions X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 13 Aug 2012 20:16:32 -0000 On Mon, Aug 13, 2012 at 02:41:59PM -0500, Stephen Montgomery-Smith wrote: > > Also, you made the comment that in the float version, all the 0.5 should > become 0.5F. Two questions: > 1. Doesn't the compiler do this conversion for me? float x, y; y = 0.5 * x; The conversion is effectively 'y = 0.5 * (double)x' where now the rhs is evaluated in double (53-bit precision). If you have 'y = 0.5f * x', then the rhs side is evaluate in float (24-bit precision). For a more complicated, expression whether one computes in 53 rather than 24 bits can have an effect on the outcome. 
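A small sketch of the promotion described above, in case it helps. The function names are made up for illustration and are not from catrigf.c. Since sizeof reports the type of an expression without evaluating it, it shows that the unsuffixed constant drags the whole product up to double:

```c
#include <stddef.h>

/* Size of the type of (0.5 * x): the double constant promotes x to double. */
size_t size_with_double_constant(float x)
{
	return sizeof(0.5 * x);
}

/* Size of the type of (0.5f * x): both operands are float, so is the product. */
size_t size_with_float_constant(float x)
{
	return sizeof(0.5f * x);
}
```

This only shows the type the language assigns to each expression. The precision actually used for intermediate results can still be wider on some targets (see FLT_EVAL_METHOD in <float.h>).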
-- Steve From owner-freebsd-numerics@FreeBSD.ORG Mon Aug 13 20:36:42 2012 Return-Path: Delivered-To: freebsd-numerics@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 641C9106564A for ; Mon, 13 Aug 2012 20:36:42 +0000 (UTC) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by mx1.freebsd.org (Postfix) with ESMTP id 0BACD8FC12 for ; Mon, 13 Aug 2012 20:36:41 +0000 (UTC) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q7DKaeob024709; Mon, 13 Aug 2012 15:36:40 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <50296558.8060909@missouri.edu> Date: Mon, 13 Aug 2012 15:36:40 -0500 From: Stephen Montgomery-Smith User-Agent: Mozilla/5.0 (X11; Linux i686; rv:14.0) Gecko/20120714 Thunderbird/14.0 MIME-Version: 1.0 To: Bruce Evans References: <5017111E.6060003@missouri.edu> <501C361D.4010807@missouri.edu> <20120804165555.X1231@besplex.bde.org> <501D51D7.1020101@missouri.edu> <20120805030609.R3101@besplex.bde.org> <501D9C36.2040207@missouri.edu> <20120805175106.X3574@besplex.bde.org> <501EC015.3000808@missouri.edu> <20120805191954.GA50379@troutmask.apl.washington.edu> <20120807205725.GA10572@server.rulingia.com> <20120809025220.N4114@besplex.bde.org> <5027F07E.9060409@missouri.edu> <20120814003614.H3692@besplex.bde.org> In-Reply-To: <20120814003614.H3692@besplex.bde.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-numerics@FreeBSD.org Subject: Re: Complex arg-trig functions X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 13 Aug 2012 20:36:42 -0000 Bruce, Can you post the two files fpmath.h and local.h that are needed to build your cplex.c? From owner-freebsd-numerics@FreeBSD.ORG Mon Aug 13 20:39:25 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id A29B4106564A for ; Mon, 13 Aug 2012 20:39:25 +0000 (UTC) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by mx1.freebsd.org (Postfix) with ESMTP id 635118FC15 for ; Mon, 13 Aug 2012 20:39:25 +0000 (UTC) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q7DKdOE7024931; Mon, 13 Aug 2012 15:39:24 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <502965FC.1040203@missouri.edu> Date: Mon, 13 Aug 2012 15:39:24 -0500 From: Stephen Montgomery-Smith User-Agent: Mozilla/5.0 (X11; Linux i686; rv:14.0) Gecko/20120714 Thunderbird/14.0 MIME-Version: 1.0 To: Steve Kargl References: <20120805030609.R3101@besplex.bde.org> <501D9C36.2040207@missouri.edu> <20120805175106.X3574@besplex.bde.org> <501EC015.3000808@missouri.edu> <20120805191954.GA50379@troutmask.apl.washington.edu> <20120807205725.GA10572@server.rulingia.com> <20120809025220.N4114@besplex.bde.org> <5027F07E.9060409@missouri.edu> <20120814003614.H3692@besplex.bde.org> <50295887.2010608@missouri.edu> <20120813201626.GA54144@troutmask.apl.washington.edu> In-Reply-To: <20120813201626.GA54144@troutmask.apl.washington.edu> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-numerics@freebsd.org, Bruce Evans Subject: Re: Complex arg-trig functions X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: 
"Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 13 Aug 2012 20:39:25 -0000 On 08/13/2012 03:16 PM, Steve Kargl wrote: > On Mon, Aug 13, 2012 at 02:41:59PM -0500, Stephen Montgomery-Smith wrote: >> >> Also, you made the comment that in the float version, all the 0.5 should >> become 0.5F. Two questions: >> 1. Doesn't the compiler do this conversion for me? > > float x, y; > y = 0.5 * x; > > The conversion is effectively 'y = 0.5 * (double)x' where > now the rhs is evaluated in double (53-bit precision). If > you have 'y = 0.5f * x', then the rhs side is evaluate > in float (24-bit precision). For a more complicated, > expression whether one computes in 53 rather than 24 bits > can have an effect on the outcome. > Thanks for the clarification. I made the changes in catrigf.c adding "F"s as needed. From owner-freebsd-numerics@FreeBSD.ORG Mon Aug 13 20:59:45 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id A99A7106566B for ; Mon, 13 Aug 2012 20:59:45 +0000 (UTC) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by mx1.freebsd.org (Postfix) with ESMTP id 679DF8FC0C for ; Mon, 13 Aug 2012 20:59:45 +0000 (UTC) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q7DKxiiT026457 for ; Mon, 13 Aug 2012 15:59:44 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <50296AC0.2040509@missouri.edu> Date: Mon, 13 Aug 2012 15:59:44 -0500 From: Stephen Montgomery-Smith User-Agent: Mozilla/5.0 (X11; Linux i686; rv:14.0) Gecko/20120714 Thunderbird/14.0 MIME-Version: 1.0 To: freebsd-numerics@freebsd.org References: <20120805030609.R3101@besplex.bde.org> 
<501D9C36.2040207@missouri.edu> <20120805175106.X3574@besplex.bde.org> <501EC015.3000808@missouri.edu> <20120805191954.GA50379@troutmask.apl.washington.edu> <20120807205725.GA10572@server.rulingia.com> <20120809025220.N4114@besplex.bde.org> <5027F07E.9060409@missouri.edu> <20120814003614.H3692@besplex.bde.org> <50295887.2010608@missouri.edu> <20120813201626.GA54144@troutmask.apl.washington.edu> <502965FC.1040203@missouri.edu> In-Reply-To: <502965FC.1040203@missouri.edu> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: Complex arg-trig functions X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 13 Aug 2012 20:59:45 -0000 On 08/13/2012 03:39 PM, Stephen Montgomery-Smith wrote: > On 08/13/2012 03:16 PM, Steve Kargl wrote: >> On Mon, Aug 13, 2012 at 02:41:59PM -0500, Stephen Montgomery-Smith wrote: >>> >>> Also, you made the comment that in the float version, all the 0.5 should >>> become 0.5F. Two questions: >>> 1. Doesn't the compiler do this conversion for me? >> >> float x, y; >> y = 0.5 * x; >> >> The conversion is effectively 'y = 0.5 * (double)x' where >> now the rhs is evaluated in double (53-bit precision). If >> you have 'y = 0.5f * x', then the rhs side is evaluate >> in float (24-bit precision). For a more complicated, >> expression whether one computes in 53 rather than 24 bits >> can have an effect on the outcome. >> > > > Thanks for the clarification. > > I made the changes in catrigf.c adding "F"s as needed. It shaved 5% of the computation time on my system (which means that the float versions are still slower than the double versions. To do hundred million casinh's double: about 18s. float (new): about 19s. float (old): about 20s. 
From owner-freebsd-numerics@FreeBSD.ORG Mon Aug 13 21:14:33 2012 Return-Path: Delivered-To: freebsd-numerics@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 3751A106566C for ; Mon, 13 Aug 2012 21:14:33 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail01.syd.optusnet.com.au (mail01.syd.optusnet.com.au [211.29.132.182]) by mx1.freebsd.org (Postfix) with ESMTP id 5AA958FC14 for ; Mon, 13 Aug 2012 21:14:31 +0000 (UTC) Received: from c122-106-171-246.carlnfd1.nsw.optusnet.com.au (c122-106-171-246.carlnfd1.nsw.optusnet.com.au [122.106.171.246]) by mail01.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id q7DLESV1010057 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Tue, 14 Aug 2012 07:14:29 +1000 Date: Tue, 14 Aug 2012 07:14:28 +1000 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Stephen Montgomery-Smith In-Reply-To: <50295887.2010608@missouri.edu> Message-ID: <20120814055931.Q4897@besplex.bde.org> References: <5017111E.6060003@missouri.edu> <501C361D.4010807@missouri.edu> <20120804165555.X1231@besplex.bde.org> <501D51D7.1020101@missouri.edu> <20120805030609.R3101@besplex.bde.org> <501D9C36.2040207@missouri.edu> <20120805175106.X3574@besplex.bde.org> <501EC015.3000808@missouri.edu> <20120805191954.GA50379@troutmask.apl.washington.edu> <20120807205725.GA10572@server.rulingia.com> <20120809025220.N4114@besplex.bde.org> <5027F07E.9060409@missouri.edu> <20120814003614.H3692@besplex.bde.org> <50295887.2010608@missouri.edu> MIME-Version: 1.0 Content-Type: MULTIPART/MIXED; BOUNDARY="0-286592833-1344892468=:4897" Cc: freebsd-numerics@FreeBSD.org, Bruce Evans Subject: Re: Complex arg-trig functions X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 13 Aug 2012 21:14:33 -0000 This message is in MIME format. The first part should be readable text, while the remaining parts are likely unreadable without MIME-aware tools. --0-286592833-1344892468=:4897 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed On Mon, 13 Aug 2012, Stephen Montgomery-Smith wrote: > On 08/13/2012 11:57 AM, Bruce Evans wrote: > >> I finally tested a version of this. I only did simple comparisons (float >> vs double and double vs long double). The results look promising after >> fixing a few bugs: > > Thank you very much for doing the testing, and for fixing the bugs. > >> >> % amd64 float prec, on 2**12 * 2**12 args: > >> % icacosh:max_er = 0x3690000000 436.5000, avg_er = 0.317, #>=1:0.5 = >> 29104:255732 > >> There are negative reasons to have the float versions unless they are not >> wrappers. The reasons to have non-wrappers are to test the algorithm and >> run faster. > > That large max-err for the imaginary part of icacosh for float bothers me. > It means that I haven't thought it through properly. Could you send me the > input values that created this error(s)? It was just a bug in my test program, for scaling denormals. Some cases have an infinite-precision result that is very close to the smallest denormal. When this is rounded imperfectly to 0, implementation details give a special case. I had just fixed this for +0, but forget to mask out the sign bit for testing -0. Somehow this didn't turn up for other functions. > The float versions really are much harder than the double and long-double > versions. And it just doesn't seem worth the effort, because who uses them > when the double versions are available? I doubt that they are really harder. > In my case, not for speed. Because on my machine the float versions are > slightly slower than the double version. The double constants might do that on amd64. 
'r *= 0.5F;' would compile to a single multiplication, where r and 0.5F
are already in registers.  'r *= 0.5;' would compile to: convert r to
double; multiply; convert back to float.  Conversions are slower than
multiplications.  On Phenom, double <-> float conversion has a latency of
7 (?) cycles, while multiplication has a latency of 4 cycles; both have a
throughput of 1 instruction/cycle.  Sometimes latencies can be hidden, but
here there are 3 dependent operations.

> Also, you made the comment that in the float version, all the 0.5 should
> become 0.5F.  Two questions:
> 1. Doesn't the compiler do this conversion for me?
> 2. What is wrong with using x/2 instead of 0.5*x?  You told me in a far
> earlier email to use 0.5*x.  (Similarly in one place I have a 0.25.)

1. It can't, since x * 0.5F is quite different from x * 0.5 when x is
float and float expressions are evaluated in float precision.  The former
is a float expression, and for example overflows when x = FLT_MAX.  The
latter is a double expression, so it doesn't overflow when x = FLT_MAX.
Maybe the compiler could optimize it to a float expression if its result
is immediately assigned to a float variable or cast to float, but this is
not a very easy optimization -- the compiler would have to prove that it
has the right side effects for all possible values of x.  The side effect
of overflow happens on assignment.

2. Division tends to be slow, and is slow on x86.  On Phenom, division has
a latency of 20 cycles and a throughput of 1 per 15 cycles in scalar
double precision (17 in vector[2] double precision).  Maybe the compiler
can optimize division by a power of 2 to a multiplication though.

In fact, gcc on x86 does both of these optimizations.  With float x, and
double y, in a function returning float, 'return x/2.0;' gets turned into
float multiplication by 0.5F, but 'y = x/2.0; return y;' only gets the
division optimization (it converts x to double, multiplies doubles, stores
to y, and converts to float).
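A rough sketch of the two points above; the function names are invented for this illustration. Halving cannot itself overflow, so the overflow check uses a factor of 2, where a float product of FLT_MAX overflows on assignment while the double product stays finite. The last function checks that x / 2 and 0.5F * x give the same float, which should hold for finite x under the default rounding mode because both compute the same exact value with one rounding. The volatile stores are there on the assumption that some targets evaluate with extra precision until a value is stored:

```c
#include <float.h>
#include <math.h>

/* Nonzero if doubling x overflows to infinity once stored as a float. */
int float_product_overflows(float x)
{
	/* volatile forces the store, discarding any extra evaluation precision */
	volatile float r = x * 2.0F;
	return isinf(r) != 0;
}

/* Nonzero if the same product is still finite when the constant is double. */
int double_product_is_finite(float x)
{
	volatile double r = x * 2.0;
	return isfinite(r) != 0;
}

/* Nonzero if x / 2 and 0.5F * x store the same float value. */
int halving_forms_agree(float x)
{
	volatile float q = x / 2;
	volatile float p = 0.5F * x;
	return q == p;
}
```

Whether the compiler really emits a multiplication for the division is a separate question that only the generated assembly can answer; this only checks that the two spellings agree in value.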
So all of those '* 0.5[F]'s can be turned back into '/ 2's (no need to
depend on the optimization by writing 2.0).

Optimization of [long] double constants to float (or double) constants is
more routine.  When you write 'static const long double one = 1;', gcc
normally puts a long double at the address of `one' (this is visible to
debuggers), but never actually uses it; it uses either fld1 on i387 or an
unnamed float constant with the same value.  At least on i387, loading a
float constant automatically extends it and is much faster than loading an
(already-extended) long double constant; however, on amd64 a float
constant would have to be converted to double for use in double
expressions.

I added old complex functions to the simple test run.  See the attachment
for the full list.

% amd64 float prec, on 2**12 x 2**12 args:
%
% ...
% rctan: max_er = 0x9708bb4b 4.7198, avg_er = 0.104, #>=1:0.5 = 177976:980356
% rctanh:max_er = 0x9c7f786c 4.8906, avg_er = 0.159, #>=1:0.5 = 635992:2189748
% ...
% ictan: max_er = 0x9c7f786c 4.8906, avg_er = 0.159, #>=1:0.5 = 635992:2189748
% ictanh:max_er = 0x9708bb4b 4.7198, avg_er = 0.104, #>=1:0.5 = 177976:980356

All (amd64 float prec) below 4 ulps except these.  Have to check my
denormal scaling again.

% amd64 double prec, on 2**12 x 2**12 args:
% ...
% rcatan:max_er = 0x8000000000000000 4503599627370496.0000, avg_er = 268435456.212, #>=1:0.5 = 4681:81365
% rcatanh:max_er = 0x8000000000000000 4503599627370496.0000, avg_er = 268435456.047, #>=1:0.5 = 428997:691341
% ...
% icatan:max_er = 0x8000000000000000 4503599627370496.0000, avg_er = 268435456.047, #>=1:0.5 = 428997:691341
% icatanh:max_er = 0x8000000000000000 4503599627370496.0000, avg_er = 268435456.212, #>=1:0.5 = 4681:81365
% ...
% icsqrt:max_er = 0x8000000000000000 4503599627370496.0000, avg_er = 268435456.476, #>=1:0.5 = 14279:7662849

This was tracked to a known annoying difference between SSE and i387 on
NaNs.  x+y clears the sign bit on i387 but not on SSE.  SSE is correct.
% i386 float prec, on 2**12 x 2**12 args:
% ...
% rcatan:max_er = 0x3c4078ec 1.8829, avg_er = 0.284, #>=1:0.5 = 13260:190836
% rcatanh:max_er = 0x2cef3171 1.4042, avg_er = 0.167, #>=1:0.5 = 4096:420916

i386 is generally more accurate.  *atan* now below 4 ulps.

% rccos: max_er = 0xb7f10439f3111686 24688206287.5958, avg_er = 1415.000, #>=1:0.5 = 3881364:4046644
% rccosh:max_er = 0xb7f10439f3111686 24688206287.5958, avg_er = 1415.000, #>=1:0.5 = 3881364:4046644
% rcexp: max_er = 0xb7e10439f3111686 24679817679.5958, avg_er = 312.908, #>=1:0.5 = 3884684:4078258
% rcsin: max_er = 0xb80bf11b402ee934 24702322906.0057, avg_er = 807.518, #>=1:0.5 = 3841652:4115524
% rcsinh:max_er = 0xb7f10439f3111686 24688206287.5958, avg_er = 1693.508, #>=1:0.5 = 3880876:4368000
% rctan: max_er = 0xb68d1b1c5151fcc0 24501606626.5413, avg_er = 1627.536, #>=1:0.5 = 3890932:4214696
% rctanh:max_er = 0x125ead9c2ea875d 154097358.0911, avg_er = 1416.635, #>=1:0.5 = 2897540:3523200
% ...
% iccos: max_er = 0xb80bf11b402ee934 24702322906.0057, avg_er = 1477.124, #>=1:0.5 = 3824832:4420052
% iccosh:max_er = 0xb80bf11b402ee934 24702322906.0057, avg_er = 1477.124, #>=1:0.5 = 3824832:4420052
% icexp: max_er = 0xb7fbf11b402ee934 24693934298.0057, avg_er = 1736.298, #>=1:0.5 = 3839256:4219864
% icsin: max_er = 0xb7f10439f3111686 24688206287.5958, avg_er = 1693.508, #>=1:0.5 = 3880876:4368000
% icsinh:max_er = 0xb80bf11b402ee934 24702322906.0057, avg_er = 807.518, #>=1:0.5 = 3841652:4115524
% ictan: max_er = 0x125ead9c2ea875d 154097358.0911, avg_er = 1416.635, #>=1:0.5 = 2897540:3523200
% ictanh:max_er = 0xb68d1b1c5151fcc0 24501606626.5413, avg_er = 1632.536, #>=1:0.5 = 3890988:4214752

i387 hardware trig functions are very inaccurate.
% sparc64 double prec, on 2**12 x 2**12 args: % rcacos:max_er = 0x36b5 3.4192, avg_er = 0.228, #>=1:0.5 = 2394:125984 % rcacosh:max_er = 0x1f15 1.9426, avg_er = 0.258, #>=1:0.5 = 8464:2737832 % rcasin:max_er = 0x2b8c 2.7217, avg_er = 0.113, #>=1:0.5 = 33296:99148 % rcasinh:max_er = 0x1f15 1.9426, avg_er = 0.258, #>=1:0.5 = 8464:2737800 % rcatan:max_er = 0x2c46 2.7671, avg_er = 0.212, #>=1:0.5 = 4680:81364 % rcatanh:max_er = 0x2970 2.5898, avg_er = 0.047, #>=1:0.5 = 129452:691356 % rclog: max_er = 0xe09 0.8772, avg_er = 0.250, #>=1:0.5 = 0:18348 Everything seemed to be passing on sparc64, but it is 300 to 1000 times slower on long doubles, so not many tests completed. Bruce --0-286592833-1344892468=:4897 Content-Type: TEXT/PLAIN; charset=US-ASCII; name=complex-errtab Content-Transfer-Encoding: BASE64 Content-ID: <20120814071428.M4897@besplex.bde.org> Content-Description: Content-Disposition: attachment; filename=complex-errtab YW1kNjQgZmxvYXQgcHJlYywgb24gMioqMTIgeCAyKioxMiBhcmdzOg0KcmNh Y29zOm1heF9lciA9IDB4NTg0NjA4NDEgMi43NTg1LCBhdmdfZXIgPSAwLjMx NywgIz49MTowLjUgPSAyOTA4NDoyNTU3MTINCnJjYWNvc2g6bWF4X2VyID0g MHg1ZTFlNDVlNiAyLjk0MTIsIGF2Z19lciA9IDAuMjYyLCAjPj0xOjAuNSA9 IDg1ODY4OjM0MTM2ODQNCnJjYXNpbjptYXhfZXIgPSAweDYzMWI4MTgzIDMu MDk3MSwgYXZnX2VyID0gMC4yMDksICM+PTE6MC41ID0gMzgzODg6MzgyNTA4 DQpyY2FzaW5oOm1heF9lciA9IDB4NWUxZTQ1ZTYgMi45NDEyLCBhdmdfZXIg PSAwLjI2MiwgIz49MTowLjUgPSA4NTg2ODozNDEzNjg0DQpyY2F0YW46bWF4 X2VyID0gMHg1MWQ3YzQ3YSAyLjU1NzYsIGF2Z19lciA9IDAuMjkwLCAjPj0x OjAuNSA9IDUyOTg0OjMxODA4NA0KcmNhdGFuaDptYXhfZXIgPSAweDczNDI0 ZDUyIDMuNjAxOCwgYXZnX2VyID0gMC4yMDUsICM+PTE6MC41ID0gMjEyNDI0 OjE0Mzc1ODANCnJjbG9nOiBtYXhfZXIgPSAweDI2ZGZhZTRkIDEuMjE0OCwg YXZnX2VyID0gMC4yNDcsICM+PTE6MC41ID0gMTg0OjkyMjQ0DQpyY3NxcnQ6 bWF4X2VyID0gIDB4ZmZmZmFmMCAwLjUwMDAsIGF2Z19lciA9IDAuMjkyLCAj Pj0xOjAuNSA9IDA6MA0KcmNjb3M6IG1heF9lciA9IDB4NTMzMzUxYmEgMi42 MDAwLCBhdmdfZXIgPSAwLjA4OCwgIz49MTowLjUgPSA1ODkzMjozMTIwODgN CnJjY29zaDptYXhfZXIgPSAweDUzMzM1MWJhIDIuNjAwMCwgYXZnX2VyID0g 
MC4wODgsICM+PTE6MC41ID0gNTg5MzI6MzEyMDg4DQpyY2V4cDogbWF4X2Vy ID0gMHg0MmJmMzdlNSAyLjA4NTgsIGF2Z19lciA9IDAuMDk2LCAjPj0xOjAu NSA9IDU5MDc2OjQ0MzkwNA0KcmNzaW46IG1heF9lciA9IDB4NTIyMGUwNGIg Mi41NjY1LCBhdmdfZXIgPSAwLjA5NSwgIz49MTowLjUgPSA4NDU2NDo0Njg0 ODANCnJjc2luaDptYXhfZXIgPSAweDQ1YmFkY2U4IDIuMTc5MSwgYXZnX2Vy ID0gMC4xMDcsICM+PTE6MC41ID0gNjQyMjA6MTIwODQ2NA0KcmN0YW46IG1h eF9lciA9IDB4OTcwOGJiNGIgNC43MTk4LCBhdmdfZXIgPSAwLjEwNCwgIz49 MTowLjUgPSAxNzc5NzY6OTgwMzU2DQpyY3Rhbmg6bWF4X2VyID0gMHg5Yzdm Nzg2YyA0Ljg5MDYsIGF2Z19lciA9IDAuMTU5LCAjPj0xOjAuNSA9IDYzNTk5 MjoyMTg5NzQ4DQppY2Fjb3M6bWF4X2VyID0gMHg1ZTFlNDVlNiAyLjk0MTIs IGF2Z19lciA9IDAuMjYyLCAjPj0xOjAuNSA9IDg1ODY4OjM0MTM2ODQNCmlj YWNvc2g6bWF4X2VyID0gMHg1ODQ2MDg0MSAyLjc1ODUsIGF2Z19lciA9IDAu MzE3LCAjPj0xOjAuNSA9IDI5MDg0OjI1NTcxMg0KaWNhc2luOm1heF9lciA9 IDB4NWUxZTQ1ZTYgMi45NDEyLCBhdmdfZXIgPSAwLjI2MiwgIz49MTowLjUg PSA4NTg2ODozNDEzNjg0DQppY2FzaW5oOm1heF9lciA9IDB4NjMxYjgxODMg My4wOTcxLCBhdmdfZXIgPSAwLjIwOSwgIz49MTowLjUgPSAzODM4ODozODI1 MDgNCmljYXRhbjptYXhfZXIgPSAweDczNDI0ZDUyIDMuNjAxOCwgYXZnX2Vy ID0gMC4yMDUsICM+PTE6MC41ID0gMjEyNDI0OjE0Mzc1ODANCmljYXRhbmg6 bWF4X2VyID0gMHg1MWQ3YzQ3YSAyLjU1NzYsIGF2Z19lciA9IDAuMjkwLCAj Pj0xOjAuNSA9IDUyOTg0OjMxODA4NA0KaWNsb2c6IG1heF9lciA9IDB4MWZj MmI0ZjUgMC45OTI1LCBhdmdfZXIgPSAwLjMwMiwgIz49MTowLjUgPSAwOjM0 OTgzMA0KaWNzcXJ0Om1heF9lciA9ICAweGZmZmZhZjAgMC41MDAwLCBhdmdf ZXIgPSAwLjI5MiwgIz49MTowLjUgPSAwOjANCmljY29zOiBtYXhfZXIgPSAw eDQ2ZDU2ZTZiIDIuMjEzNiwgYXZnX2VyID0gMC4xMzksICM+PTE6MC41ID0g ODM2MDA6MTM1NzIwNA0KaWNjb3NoOm1heF9lciA9IDB4NDZkNTZlNmIgMi4y MTM2LCBhdmdfZXIgPSAwLjEzOSwgIz49MTowLjUgPSA4MzYwMDoxMzU3MjA0 DQppY2V4cDogbWF4X2VyID0gMHgzZmQxZThkZSAxLjk5NDQsIGF2Z19lciA9 IDAuMTA0LCAjPj0xOjAuNSA9IDgwNzM4OjY1MDY2MA0KaWNzaW46IG1heF9l ciA9IDB4NDViYWRjZTggMi4xNzkxLCBhdmdfZXIgPSAwLjEwNywgIz49MTow LjUgPSA2NDIyMDoxMjA4NDY0DQppY3Npbmg6bWF4X2VyID0gMHg1MjIwZTA0 YiAyLjU2NjUsIGF2Z19lciA9IDAuMDk1LCAjPj0xOjAuNSA9IDg0NTY0OjQ2 ODQ4MA0KaWN0YW46IG1heF9lciA9IDB4OWM3Zjc4NmMgNC44OTA2LCBhdmdf 
ZXIgPSAwLjE1OSwgIz49MTowLjUgPSA2MzU5OTI6MjE4OTc0OA0KaWN0YW5o Om1heF9lciA9IDB4OTcwOGJiNGIgNC43MTk4LCBhdmdfZXIgPSAwLjEwNCwg Iz49MTowLjUgPSAxNzc5NzY6OTgwMzU2DQoNCmFtZDY0IGRvdWJsZSBwcmVj LCBvbiAyKioxMiB4IDIqKjEyIGFyZ3M6DQpyY2Fjb3M6bWF4X2VyID0gICAg IDB4MWI1YSAzLjQxODksIGF2Z19lciA9IDAuMjI4LCAjPj0xOjAuNSA9IDIz OTQ6MTI1OTg4DQpyY2Fjb3NoOm1heF9lciA9ICAgICAgMHhmN2QgMS45MzYw LCBhdmdfZXIgPSAwLjI1NywgIz49MTowLjUgPSA2MTI6Mjc0MTg2MA0KcmNh c2luOm1heF9lciA9ICAgICAweDE1YzUgMi43MjEyLCBhdmdfZXIgPSAwLjEx MywgIz49MTowLjUgPSAzMzI5Njo5OTE1Mg0KcmNhc2luaDptYXhfZXIgPSAg ICAgIDB4ZjdkIDEuOTM2MCwgYXZnX2VyID0gMC4yNTcsICM+PTE6MC41ID0g NjEyOjI3NDE3OTYNCnJjYXRhbjptYXhfZXIgPSAweDgwMDAwMDAwMDAwMDAw MDAgNDUwMzU5OTYyNzM3MDQ5Ni4wMDAwLCBhdmdfZXIgPSAyNjg0MzU0NTYu MjEyLCAjPj0xOjAuNSA9IDQ2ODE6ODEzNjUNCnJjYXRhbmg6bWF4X2VyID0g MHg4MDAwMDAwMDAwMDAwMDAwIDQ1MDM1OTk2MjczNzA0OTYuMDAwMCwgYXZn X2VyID0gMjY4NDM1NDU2LjA0NywgIz49MTowLjUgPSA0Mjg5OTc6NjkxMzQx DQpyY2xvZzogbWF4X2VyID0gICAgICAweDcwNCAwLjg3NzAsIGF2Z19lciA9 IDAuMjUwLCAjPj0xOjAuNSA9IDA6MjAxNTINCnJjc3FydDptYXhfZXIgPSAg ICAweDExNTk0IDM0LjY5NzMsIGF2Z19lciA9IDAuNDc2LCAjPj0xOjAuNSA9 IDEyMjMyOjc2NjA4MDINCmljYWNvczptYXhfZXIgPSAgICAgIDB4ZjdkIDEu OTM2MCwgYXZnX2VyID0gMC4yNTcsICM+PTE6MC41ID0gNjEyOjI3NDE4NjAN CmljYWNvc2g6bWF4X2VyID0gICAgIDB4MWI1YSAzLjQxODksIGF2Z19lciA9 IDAuMjI4LCAjPj0xOjAuNSA9IDIzOTQ6MTI1OTg4DQppY2FzaW46bWF4X2Vy ID0gICAgICAweGY3ZCAxLjkzNjAsIGF2Z19lciA9IDAuMjU3LCAjPj0xOjAu NSA9IDYxMjoyNzQxNzk2DQppY2FzaW5oOm1heF9lciA9ICAgICAweDE1YzUg Mi43MjEyLCBhdmdfZXIgPSAwLjExMywgIz49MTowLjUgPSAzMzI5Njo5OTE1 Mg0KaWNhdGFuOm1heF9lciA9IDB4ODAwMDAwMDAwMDAwMDAwMCA0NTAzNTk5 NjI3MzcwNDk2LjAwMDAsIGF2Z19lciA9IDI2ODQzNTQ1Ni4wNDcsICM+PTE6 MC41ID0gNDI4OTk3OjY5MTM0MQ0KaWNhdGFuaDptYXhfZXIgPSAweDgwMDAw MDAwMDAwMDAwMDAgNDUwMzU5OTYyNzM3MDQ5Ni4wMDAwLCBhdmdfZXIgPSAy Njg0MzU0NTYuMjEyLCAjPj0xOjAuNSA9IDQ2ODE6ODEzNjUNCmljbG9nOiBt YXhfZXIgPSAgICAgIDB4NmY0IDAuODY5MSwgYXZnX2VyID0gMC4yMTMsICM+ PTE6MC41ID0gMDoxODEwMzINCmljc3FydDptYXhfZXIgPSAweDgwMDAwMDAw 
MDAwMDAwMDAgNDUwMzU5OTYyNzM3MDQ5Ni4wMDAwLCBhdmdfZXIgPSAyNjg0 MzU0NTYuNDc2LCAjPj0xOjAuNSA9IDE0Mjc5Ojc2NjI4NDkNCg0KaTM4NiBm bG9hdCBwcmVjLCBvbiAyKioxMiB4IDIqKjEyIGFyZ3M6DQpyY2Fjb3M6bWF4 X2VyID0gMHg0MmNkMTljNiAyLjA4NzUsIGF2Z19lciA9IDAuMzE0LCAjPj0x OjAuNSA9IDM4NTQ6MjE1MTE2DQpyY2Fjb3NoOm1heF9lciA9IDB4MzE3MGUy MzIgMS41NDUwLCBhdmdfZXIgPSAwLjI1NCwgIz49MTowLjUgPSAyMzAwODoz MjQ1MDI4DQpyY2FzaW46bWF4X2VyID0gMHg1NWFkYzBkZiAyLjY3NzUsIGF2 Z19lciA9IDAuMjA4LCAjPj0xOjAuNSA9IDM0MzA0OjM1Mzk4MA0KcmNhc2lu aDptYXhfZXIgPSAweDMxNzBlMjMyIDEuNTQ1MCwgYXZnX2VyID0gMC4yNTQs ICM+PTE6MC41ID0gMjMwMDg6MzI0NTAyOA0KcmNhdGFuOm1heF9lciA9IDB4 M2M0MDc4ZWMgMS44ODI5LCBhdmdfZXIgPSAwLjI4NCwgIz49MTowLjUgPSAx MzI2MDoxOTA4MzYNCnJjYXRhbmg6bWF4X2VyID0gMHgyY2VmMzE3MSAxLjQw NDIsIGF2Z19lciA9IDAuMTY3LCAjPj0xOjAuNSA9IDQwOTY6NDIwOTE2DQpy Y2xvZzogbWF4X2VyID0gMHgyNTgzMDg1MyAxLjE3MjIsIGF2Z19lciA9IDAu MjQ2LCAjPj0xOjAuNSA9IDEyMDoyNDg5Mg0KcmNzcXJ0Om1heF9lciA9ICAw eGZmZmZhZjAgMC41MDAwLCBhdmdfZXIgPSAwLjI5MiwgIz49MTowLjUgPSAw OjANCnJjY29zOiBtYXhfZXIgPSAweGI3ZjEwNDM5ZjMxMTE2ODYgMjQ2ODgy MDYyODcuNTk1OCwgYXZnX2VyID0gMTQxNS4wMDAsICM+PTE6MC41ID0gMzg4 MTM2NDo0MDQ2NjQ0DQpyY2Nvc2g6bWF4X2VyID0gMHhiN2YxMDQzOWYzMTEx Njg2IDI0Njg4MjA2Mjg3LjU5NTgsIGF2Z19lciA9IDE0MTUuMDAwLCAjPj0x OjAuNSA9IDM4ODEzNjQ6NDA0NjY0NA0KcmNleHA6IG1heF9lciA9IDB4Yjdl MTA0MzlmMzExMTY4NiAyNDY3OTgxNzY3OS41OTU4LCBhdmdfZXIgPSAzMTIu OTA4LCAjPj0xOjAuNSA9IDM4ODQ2ODQ6NDA3ODI1OA0KcmNzaW46IG1heF9l ciA9IDB4YjgwYmYxMWI0MDJlZTkzNCAyNDcwMjMyMjkwNi4wMDU3LCBhdmdf ZXIgPSA4MDcuNTE4LCAjPj0xOjAuNSA9IDM4NDE2NTI6NDExNTUyNA0KcmNz aW5oOm1heF9lciA9IDB4YjdmMTA0MzlmMzExMTY4NiAyNDY4ODIwNjI4Ny41 OTU4LCBhdmdfZXIgPSAxNjkzLjUwOCwgIz49MTowLjUgPSAzODgwODc2OjQz NjgwMDANCnJjdGFuOiBtYXhfZXIgPSAweGI2OGQxYjFjNTE1MWZjYzAgMjQ1 MDE2MDY2MjYuNTQxMywgYXZnX2VyID0gMTYyNy41MzYsICM+PTE6MC41ID0g Mzg5MDkzMjo0MjE0Njk2DQpyY3Rhbmg6bWF4X2VyID0gMHgxMjVlYWQ5YzJl YTg3NWQgMTU0MDk3MzU4LjA5MTEsIGF2Z19lciA9IDE0MTYuNjM1LCAjPj0x OjAuNSA9IDI4OTc1NDA6MzUyMzIwMA0KaWNhY29zOm1heF9lciA9IDB4MzE3 
MGUyMzIgMS41NDUwLCBhdmdfZXIgPSAwLjI1NCwgIz49MTowLjUgPSAyMzAw ODozMjQ1MDI4DQppY2Fjb3NoOm1heF9lciA9IDB4NDJjZDE5YzYgMi4wODc1 LCBhdmdfZXIgPSAwLjMxNCwgIz49MTowLjUgPSAzODU0OjIxNTExNg0KaWNh c2luOm1heF9lciA9IDB4MzE3MGUyMzIgMS41NDUwLCBhdmdfZXIgPSAwLjI1 NCwgIz49MTowLjUgPSAyMzAwODozMjQ1MDI4DQppY2FzaW5oOm1heF9lciA9 IDB4NTVhZGMwZGYgMi42Nzc1LCBhdmdfZXIgPSAwLjIwOCwgIz49MTowLjUg PSAzNDMwNDozNTM5ODANCmljYXRhbjptYXhfZXIgPSAweDJjZWYzMTcxIDEu NDA0MiwgYXZnX2VyID0gMC4xNjcsICM+PTE6MC41ID0gNDA5Njo0MjA5MTYN CmljYXRhbmg6bWF4X2VyID0gMHgzYzQwNzhlYyAxLjg4MjksIGF2Z19lciA9 IDAuMjg0LCAjPj0xOjAuNSA9IDEzMjYwOjE5MDgzNg0KaWNsb2c6IG1heF9l ciA9IDB4MWZjMmI0ZjUgMC45OTI1LCBhdmdfZXIgPSAwLjMwMiwgIz49MTow LjUgPSAwOjMzODcxMg0KaWNzcXJ0Om1heF9lciA9ICAweGZmZmZhZjAgMC41 MDAwLCBhdmdfZXIgPSAwLjI5MiwgIz49MTowLjUgPSAwOjANCmljY29zOiBt YXhfZXIgPSAweGI4MGJmMTFiNDAyZWU5MzQgMjQ3MDIzMjI5MDYuMDA1Nywg YXZnX2VyID0gMTQ3Ny4xMjQsICM+PTE6MC41ID0gMzgyNDgzMjo0NDIwMDUy DQppY2Nvc2g6bWF4X2VyID0gMHhiODBiZjExYjQwMmVlOTM0IDI0NzAyMzIy OTA2LjAwNTcsIGF2Z19lciA9IDE0NzcuMTI0LCAjPj0xOjAuNSA9IDM4MjQ4 MzI6NDQyMDA1Mg0KaWNleHA6IG1heF9lciA9IDB4YjdmYmYxMWI0MDJlZTkz NCAyNDY5MzkzNDI5OC4wMDU3LCBhdmdfZXIgPSAxNzM2LjI5OCwgIz49MTow LjUgPSAzODM5MjU2OjQyMTk4NjQNCmljc2luOiBtYXhfZXIgPSAweGI3ZjEw NDM5ZjMxMTE2ODYgMjQ2ODgyMDYyODcuNTk1OCwgYXZnX2VyID0gMTY5My41 MDgsICM+PTE6MC41ID0gMzg4MDg3Njo0MzY4MDAwDQppY3Npbmg6bWF4X2Vy ID0gMHhiODBiZjExYjQwMmVlOTM0IDI0NzAyMzIyOTA2LjAwNTcsIGF2Z19l ciA9IDgwNy41MTgsICM+PTE6MC41ID0gMzg0MTY1Mjo0MTE1NTI0DQppY3Rh bjogbWF4X2VyID0gMHgxMjVlYWQ5YzJlYTg3NWQgMTU0MDk3MzU4LjA5MTEs IGF2Z19lciA9IDE0MTYuNjM1LCAjPj0xOjAuNSA9IDI4OTc1NDA6MzUyMzIw MA0KaWN0YW5oOm1heF9lciA9IDB4YjY4ZDFiMWM1MTUxZmNjMCAyNDUwMTYw NjYyNi41NDEzLCBhdmdfZXIgPSAxNjMyLjUzNiwgIz49MTowLjUgPSAzODkw OTg4OjQyMTQ3NTINCg0KaTM4NiBkb3VibGUgcHJlYywgb24gMioqMTIgeCAy KioxMiBhcmdzOg0KcmNhY29zOm1heF9lciA9ICAgICAweDExZTggMi4yMzgz LCBhdmdfZXIgPSAwLjE2NSwgIz49MTowLjUgPSAyNDg6MTExODUwDQpyY2Fj b3NoOm1heF9lciA9ICAgICAgMHhiMDIgMS4zNzYwLCBhdmdfZXIgPSAwLjI1 
NiwgIz49MTowLjUgPSAxMDQ6MjcxNTMxMg0KcmNhc2luOm1heF9lciA9ICAg ICAweDEzY2UgMi40NzU2LCBhdmdfZXIgPSAwLjExMiwgIz49MTowLjUgPSA1 NjE2Ojk1MDYwDQpyY2FzaW5oOm1heF9lciA9ICAgICAgMHhiMDIgMS4zNzYw LCBhdmdfZXIgPSAwLjI1NiwgIz49MTowLjUgPSAxMDQ6MjcxNTMxMg0KcmNh dGFuOm1heF9lciA9ICAgICAgMHg5ZWQgMS4yNDA3LCBhdmdfZXIgPSAwLjAx NSwgIz49MTowLjUgPSA0MDg0OjQ4OTIwDQpyY2F0YW5oOm1heF9lciA9ICAg ICAgMHhiMTcgMS4zODYyLCBhdmdfZXIgPSAwLjAxNCwgIz49MTowLjUgPSA1 Njo3NzQ1Ng0KcmNsb2c6IG1heF9lciA9ICAgICAgMHg3MDQgMC44NzcwLCBh dmdfZXIgPSAwLjI1MCwgIz49MTowLjUgPSAwOjIwMTEyDQpyY3NxcnQ6bWF4 X2VyID0gICAgICAweDdhYSAwLjk1ODAsIGF2Z19lciA9IDAuMzk5LCAjPj0x OjAuNSA9IDA6NDEwNA0KaWNhY29zOm1heF9lciA9ICAgICAgMHhiMDIgMS4z NzYwLCBhdmdfZXIgPSAwLjI1NiwgIz49MTowLjUgPSAxMDQ6MjcxNTMxMg0K aWNhY29zaDptYXhfZXIgPSAgICAgMHgxMWU4IDIuMjM4MywgYXZnX2VyID0g MC4xNjUsICM+PTE6MC41ID0gMjQ4OjExMTg1MA0KaWNhc2luOm1heF9lciA9 ICAgICAgMHhiMDIgMS4zNzYwLCBhdmdfZXIgPSAwLjI1NiwgIz49MTowLjUg PSAxMDQ6MjcxNTMxMg0KaWNhc2luaDptYXhfZXIgPSAgICAgMHgxM2NlIDIu NDc1NiwgYXZnX2VyID0gMC4xMTIsICM+PTE6MC41ID0gNTYxNjo5NTA2MA0K aWNhdGFuOm1heF9lciA9ICAgICAgMHhiMTcgMS4zODYyLCBhdmdfZXIgPSAw LjAxNCwgIz49MTowLjUgPSA1Njo3NzQ1Ng0KaWNhdGFuaDptYXhfZXIgPSAg ICAgIDB4OWVkIDEuMjQwNywgYXZnX2VyID0gMC4wMTUsICM+PTE6MC41ID0g NDA4NDo0ODkyMA0KaWNsb2c6IG1heF9lciA9ICAgICAgMHg2ZjQgMC44Njkx LCBhdmdfZXIgPSAwLjIxMywgIz49MTowLjUgPSAwOjE4MTAzMg0KaWNzcXJ0 Om1heF9lciA9ICAgICAgMHg0MDAgMC41MDAwLCBhdmdfZXIgPSAwLjM5OSwg Iz49MTowLjUgPSAwOjQwMjYNCg0Kc3BhcmM2NCBkb3VibGUgcHJlYywgb24g MioqMTIgeCAyKioxMiBhcmdzOg0KcmNhY29zOm1heF9lciA9ICAgICAweDM2 YjUgMy40MTkyLCBhdmdfZXIgPSAwLjIyOCwgIz49MTowLjUgPSAyMzk0OjEy NTk4NA0KcmNhY29zaDptYXhfZXIgPSAgICAgMHgxZjE1IDEuOTQyNiwgYXZn X2VyID0gMC4yNTgsICM+PTE6MC41ID0gODQ2NDoyNzM3ODMyDQpyY2FzaW46 bWF4X2VyID0gICAgIDB4MmI4YyAyLjcyMTcsIGF2Z19lciA9IDAuMTEzLCAj Pj0xOjAuNSA9IDMzMjk2Ojk5MTQ4DQpyY2FzaW5oOm1heF9lciA9ICAgICAw eDFmMTUgMS45NDI2LCBhdmdfZXIgPSAwLjI1OCwgIz49MTowLjUgPSA4NDY0 OjI3Mzc4MDANCnJjYXRhbjptYXhfZXIgPSAgICAgMHgyYzQ2IDIuNzY3MSwg 
YXZnX2VyID0gMC4yMTIsICM+PTE6MC41ID0gNDY4MDo4MTM2NA0KcmNhdGFu aDptYXhfZXIgPSAgICAgMHgyOTcwIDIuNTg5OCwgYXZnX2VyID0gMC4wNDcs ICM+PTE6MC41ID0gMTI5NDUyOjY5MTM1Ng0KcmNsb2c6IG1heF9lciA9ICAg ICAgMHhlMDkgMC44NzcyLCBhdmdfZXIgPSAwLjI1MCwgIz49MTowLjUgPSAw OjE4MzQ4DQo= --0-286592833-1344892468=:4897-- From owner-freebsd-numerics@FreeBSD.ORG Mon Aug 13 21:29:24 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 594701065674 for ; Mon, 13 Aug 2012 21:29:24 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail08.syd.optusnet.com.au (mail08.syd.optusnet.com.au [211.29.132.189]) by mx1.freebsd.org (Postfix) with ESMTP id C8A5E8FC15 for ; Mon, 13 Aug 2012 21:29:23 +0000 (UTC) Received: from c122-106-171-246.carlnfd1.nsw.optusnet.com.au (c122-106-171-246.carlnfd1.nsw.optusnet.com.au [122.106.171.246]) by mail08.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id q7DLTC3P029816 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Tue, 14 Aug 2012 07:29:15 +1000 Date: Tue, 14 Aug 2012 07:29:12 +1000 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Stephen Montgomery-Smith In-Reply-To: <50296558.8060909@missouri.edu> Message-ID: <20120814072345.E5260@besplex.bde.org> References: <5017111E.6060003@missouri.edu> <501C361D.4010807@missouri.edu> <20120804165555.X1231@besplex.bde.org> <501D51D7.1020101@missouri.edu> <20120805030609.R3101@besplex.bde.org> <501D9C36.2040207@missouri.edu> <20120805175106.X3574@besplex.bde.org> <501EC015.3000808@missouri.edu> <20120805191954.GA50379@troutmask.apl.washington.edu> <20120807205725.GA10572@server.rulingia.com> <20120809025220.N4114@besplex.bde.org> <5027F07E.9060409@missouri.edu> <20120814003614.H3692@besplex.bde.org> <50296558.8060909@missouri.edu> MIME-Version: 1.0 Content-Type: MULTIPART/MIXED; BOUNDARY="0-1536198523-1344893352=:5260" Cc: freebsd-numerics@freebsd.org, Bruce Evans Subject: Re: 
Complex arg-trig functions X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 13 Aug 2012 21:29:24 -0000 This message is in MIME format. The first part should be readable text, while the remaining parts are likely unreadable without MIME-aware tools. --0-1536198523-1344893352=:5260 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed On Mon, 13 Aug 2012, Stephen Montgomery-Smith wrote: > Can you post the two files fpmath.h and local.h that are needed to build your > cplex.c? fpmath.h is in libc and needs a couple of -I paths to reach. local.h just declares everything that might not be in math.h. Attached. You will have to fake log*l() or not compile clogl(). I don't like the combinatorial explosion in the number of interfaces, but wanted to keep local.h readable. cplex.c uses macros to avoid repetition, but this is painful for debugging compared with a previous version. 
Bruce --0-1536198523-1344893352=:5260 Content-Type: TEXT/PLAIN; charset=US-ASCII; name="local.h" Content-Transfer-Encoding: BASE64 Content-ID: <20120814072912.R5260@besplex.bde.org> Content-Description: Content-Disposition: attachment; filename="local.h" bG9uZyBkb3VibGUgZXhwbChsb25nIGRvdWJsZSk7DQpsb25nIGRvdWJsZSBl eHBtMWwobG9uZyBkb3VibGUpOw0KbG9uZyBkb3VibGUgbG9nbChsb25nIGRv dWJsZSk7DQpsb25nIGRvdWJsZSBsb2cxMGwobG9uZyBkb3VibGUpOw0KbG9u ZyBkb3VibGUgbG9nMXBsKGxvbmcgZG91YmxlKTsNCmxvbmcgZG91YmxlIGxv ZzJsKGxvbmcgZG91YmxlKTsNCg0KI2lmZGVmIF9DT01QTEVYX0gNCmRvdWJs ZSBjb21wbGV4CWNjb3MoZG91YmxlIGNvbXBsZXgpOw0KZmxvYXQgY29tcGxl eAljY29zZihmbG9hdCBjb21wbGV4KTsNCmRvdWJsZSBjb21wbGV4CWNjb3No KGRvdWJsZSBjb21wbGV4KTsNCmZsb2F0IGNvbXBsZXgJY2Nvc2hmKGZsb2F0 IGNvbXBsZXgpOw0KZG91YmxlIGNvbXBsZXgJY2V4cChkb3VibGUgY29tcGxl eCk7DQpmbG9hdCBjb21wbGV4CWNleHBmKGZsb2F0IGNvbXBsZXgpOw0KZG91 YmxlIGNvbXBsZXgJY3Npbihkb3VibGUgY29tcGxleCk7DQpmbG9hdCBjb21w bGV4CWNzaW5mKGZsb2F0IGNvbXBsZXgpOw0KZG91YmxlIGNvbXBsZXgJY3Np bmgoZG91YmxlIGNvbXBsZXgpOw0KZmxvYXQgY29tcGxleAljc2luaGYoZmxv YXQgY29tcGxleCk7DQpkb3VibGUgY29tcGxleAljdGFuKGRvdWJsZSBjb21w bGV4KTsNCmZsb2F0IGNvbXBsZXgJY3RhbmYoZmxvYXQgY29tcGxleCk7DQpk b3VibGUgY29tcGxleAljdGFuaChkb3VibGUgY29tcGxleCk7DQpmbG9hdCBj b21wbGV4CWN0YW5oZihmbG9hdCBjb21wbGV4KTsNCiNlbmRpZg0KDQojaWZk ZWYgX0NPTVBMRVhfSA0KZG91YmxlIGNvbXBsZXgJY2FzaW4oZG91YmxlIGNv bXBsZXgpOw0KZG91YmxlIGNvbXBsZXgJY2FzaW5oKGRvdWJsZSBjb21wbGV4 KTsNCmRvdWJsZSBjb21wbGV4CWNhY29zaChkb3VibGUgY29tcGxleCk7DQpk b3VibGUgY29tcGxleAljYWNvcyhkb3VibGUgY29tcGxleCk7DQpkb3VibGUg Y29tcGxleAljYXRhbihkb3VibGUgY29tcGxleCk7DQpkb3VibGUgY29tcGxl eAljYXRhbmgoZG91YmxlIGNvbXBsZXgpOw0KZG91YmxlIGNvbXBsZXgJY2xv Zyhkb3VibGUgY29tcGxleCk7DQoNCmZsb2F0IGNvbXBsZXgJY2FzaW5mKGZs b2F0IGNvbXBsZXgpOw0KZmxvYXQgY29tcGxleAljYXNpbmhmKGZsb2F0IGNv bXBsZXgpOw0KZmxvYXQgY29tcGxleAljYWNvc2hmKGZsb2F0IGNvbXBsZXgp Ow0KZmxvYXQgY29tcGxleAljYWNvc2YoZmxvYXQgY29tcGxleCk7DQpmbG9h dCBjb21wbGV4CWNhdGFuZihmbG9hdCBjb21wbGV4KTsNCmZsb2F0IGNvbXBs 
ZXgJY2F0YW5oZihmbG9hdCBjb21wbGV4KTsNCmZsb2F0IGNvbXBsZXgJY2xv Z2YoZmxvYXQgY29tcGxleCk7DQoNCmxvbmcgZG91YmxlIGNvbXBsZXgJY2Fz aW5sKGxvbmcgZG91YmxlIGNvbXBsZXgpOw0KbG9uZyBkb3VibGUgY29tcGxl eAljYXNpbmhsKGxvbmcgZG91YmxlIGNvbXBsZXgpOw0KbG9uZyBkb3VibGUg Y29tcGxleAljYWNvc2hsKGxvbmcgZG91YmxlIGNvbXBsZXgpOw0KbG9uZyBk b3VibGUgY29tcGxleAljYWNvc2wobG9uZyBkb3VibGUgY29tcGxleCk7DQps b25nIGRvdWJsZSBjb21wbGV4CWNhdGFubChsb25nIGRvdWJsZSBjb21wbGV4 KTsNCmxvbmcgZG91YmxlIGNvbXBsZXgJY2F0YW5obChsb25nIGRvdWJsZSBj b21wbGV4KTsNCmxvbmcgZG91YmxlIGNvbXBsZXgJY2xvZ2wobG9uZyBkb3Vi bGUgY29tcGxleCk7DQojZW5kaWYNCg0KZG91YmxlCXJjYWNvcyhkb3VibGUs IGRvdWJsZSk7DQpkb3VibGUJcmNhY29zaChkb3VibGUsIGRvdWJsZSk7DQpk b3VibGUJcmNhc2luKGRvdWJsZSwgZG91YmxlKTsNCmRvdWJsZQlyY2FzaW5o KGRvdWJsZSwgZG91YmxlKTsNCmRvdWJsZQlyY2F0YW4oZG91YmxlLCBkb3Vi bGUpOw0KZG91YmxlCXJjYXRhbmgoZG91YmxlLCBkb3VibGUpOw0KZG91Ymxl CXJjbG9nKGRvdWJsZSwgZG91YmxlKTsNCg0KZmxvYXQJcmNhY29zZihmbG9h dCwgZmxvYXQpOw0KZmxvYXQJcmNhY29zaGYoZmxvYXQsIGZsb2F0KTsNCmZs b2F0CXJjYXNpbmYoZmxvYXQsIGZsb2F0KTsNCmZsb2F0CXJjYXNpbmhmKGZs b2F0LCBmbG9hdCk7DQpmbG9hdAlyY2F0YW5mKGZsb2F0LCBmbG9hdCk7DQpm bG9hdAlyY2F0YW5oZihmbG9hdCwgZmxvYXQpOw0KZmxvYXQJcmNsb2dmKGZs b2F0LCBmbG9hdCk7DQoNCmxvbmcgZG91YmxlCXJjYWNvc2wobG9uZyBkb3Vi bGUsIGxvbmcgZG91YmxlKTsNCmxvbmcgZG91YmxlCXJjYWNvc2hsKGxvbmcg ZG91YmxlLCBsb25nIGRvdWJsZSk7DQpsb25nIGRvdWJsZQlyY2FzaW5sKGxv bmcgZG91YmxlLCBsb25nIGRvdWJsZSk7DQpsb25nIGRvdWJsZQlyY2FzaW5o bChsb25nIGRvdWJsZSwgbG9uZyBkb3VibGUpOw0KbG9uZyBkb3VibGUJcmNh dGFubChsb25nIGRvdWJsZSwgbG9uZyBkb3VibGUpOw0KbG9uZyBkb3VibGUJ cmNhdGFuaGwobG9uZyBkb3VibGUsIGxvbmcgZG91YmxlKTsNCmxvbmcgZG91 YmxlCXJjbG9nbChsb25nIGRvdWJsZSwgbG9uZyBkb3VibGUpOw0KDQpkb3Vi bGUJaWNhY29zKGRvdWJsZSwgZG91YmxlKTsNCmRvdWJsZQlpY2Fjb3NoKGRv dWJsZSwgZG91YmxlKTsNCmRvdWJsZQlpY2FzaW4oZG91YmxlLCBkb3VibGUp Ow0KZG91YmxlCWljYXNpbmgoZG91YmxlLCBkb3VibGUpOw0KZG91YmxlCWlj YXRhbihkb3VibGUsIGRvdWJsZSk7DQpkb3VibGUJaWNhdGFuaChkb3VibGUs IGRvdWJsZSk7DQpkb3VibGUJaWNsb2coZG91YmxlLCBkb3VibGUpOw0KDQpm 
bG9hdAlpY2Fjb3NmKGZsb2F0LCBmbG9hdCk7DQpmbG9hdAlpY2Fjb3NoZihm bG9hdCwgZmxvYXQpOw0KZmxvYXQJaWNhc2luZihmbG9hdCwgZmxvYXQpOw0K ZmxvYXQJaWNhc2luaGYoZmxvYXQsIGZsb2F0KTsNCmZsb2F0CWljYXRhbmYo ZmxvYXQsIGZsb2F0KTsNCmZsb2F0CWljYXRhbmhmKGZsb2F0LCBmbG9hdCk7 DQpmbG9hdAlpY2xvZ2YoZmxvYXQsIGZsb2F0KTsNCg0KbG9uZyBkb3VibGUJ aWNhY29zbChsb25nIGRvdWJsZSwgbG9uZyBkb3VibGUpOw0KbG9uZyBkb3Vi bGUJaWNhY29zaGwobG9uZyBkb3VibGUsIGxvbmcgZG91YmxlKTsNCmxvbmcg ZG91YmxlCWljYXNpbmwobG9uZyBkb3VibGUsIGxvbmcgZG91YmxlKTsNCmxv bmcgZG91YmxlCWljYXNpbmhsKGxvbmcgZG91YmxlLCBsb25nIGRvdWJsZSk7 DQpsb25nIGRvdWJsZQlpY2F0YW5sKGxvbmcgZG91YmxlLCBsb25nIGRvdWJs ZSk7DQpsb25nIGRvdWJsZQlpY2F0YW5obChsb25nIGRvdWJsZSwgbG9uZyBk b3VibGUpOw0KbG9uZyBkb3VibGUJaWNsb2dsKGxvbmcgZG91YmxlLCBsb25n IGRvdWJsZSk7DQoNCmRvdWJsZQlyY3NxcnQoZG91YmxlLCBkb3VibGUpOw0K ZmxvYXQJcmNzcXJ0ZihmbG9hdCwgZmxvYXQpOw0KbG9uZyBkb3VibGUJcmNz cXJ0bChsb25nIGRvdWJsZSwgbG9uZyBkb3VibGUpOw0KDQpkb3VibGUJcmNj b3MoZG91YmxlLCBkb3VibGUpOw0KZmxvYXQJcmNjb3NmKGZsb2F0LCBmbG9h dCk7DQpkb3VibGUJcmNjb3NoKGRvdWJsZSwgZG91YmxlKTsNCmZsb2F0CXJj Y29zaGYoZmxvYXQsIGZsb2F0KTsNCmRvdWJsZQlyY2V4cChkb3VibGUsIGRv dWJsZSk7DQpmbG9hdAlyY2V4cGYoZmxvYXQsIGZsb2F0KTsNCmRvdWJsZQly Y3Npbihkb3VibGUsIGRvdWJsZSk7DQpmbG9hdAlyY3NpbmYoZmxvYXQsIGZs b2F0KTsNCmRvdWJsZQlyY3NpbmgoZG91YmxlLCBkb3VibGUpOw0KZmxvYXQJ cmNzaW5oZihmbG9hdCwgZmxvYXQpOw0KZG91YmxlCXJjdGFuKGRvdWJsZSwg ZG91YmxlKTsNCmZsb2F0CXJjdGFuZihmbG9hdCwgZmxvYXQpOw0KZG91Ymxl CXJjdGFuaChkb3VibGUsIGRvdWJsZSk7DQpmbG9hdAlyY3RhbmhmKGZsb2F0 LCBmbG9hdCk7DQoNCmRvdWJsZQlpY3NxcnQoZG91YmxlLCBkb3VibGUpOw0K ZmxvYXQJaWNzcXJ0ZihmbG9hdCwgZmxvYXQpOw0KbG9uZyBkb3VibGUJaWNz cXJ0bChsb25nIGRvdWJsZSwgbG9uZyBkb3VibGUpOw0KDQpkb3VibGUJaWNj b3MoZG91YmxlLCBkb3VibGUpOw0KZmxvYXQJaWNjb3NmKGZsb2F0LCBmbG9h dCk7DQpkb3VibGUJaWNjb3NoKGRvdWJsZSwgZG91YmxlKTsNCmZsb2F0CWlj Y29zaGYoZmxvYXQsIGZsb2F0KTsNCmRvdWJsZQlpY2V4cChkb3VibGUsIGRv dWJsZSk7DQpmbG9hdAlpY2V4cGYoZmxvYXQsIGZsb2F0KTsNCmRvdWJsZQlp Y3Npbihkb3VibGUsIGRvdWJsZSk7DQpmbG9hdAlpY3NpbmYoZmxvYXQsIGZs 
b2F0KTsNCmRvdWJsZQlpY3NpbmgoZG91YmxlLCBkb3VibGUpOw0KZmxvYXQJ aWNzaW5oZihmbG9hdCwgZmxvYXQpOw0KZG91YmxlCWljdGFuKGRvdWJsZSwg ZG91YmxlKTsNCmZsb2F0CWljdGFuZihmbG9hdCwgZmxvYXQpOw0KZG91Ymxl CWljdGFuaChkb3VibGUsIGRvdWJsZSk7DQpmbG9hdAlpY3RhbmhmKGZsb2F0 LCBmbG9hdCk7DQo= --0-1536198523-1344893352=:5260-- From owner-freebsd-numerics@FreeBSD.ORG Mon Aug 13 21:40:58 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id AFE1C106566B for ; Mon, 13 Aug 2012 21:40:58 +0000 (UTC) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by mx1.freebsd.org (Postfix) with ESMTP id 7D9C98FC1B for ; Mon, 13 Aug 2012 21:40:58 +0000 (UTC) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q7DLeuQm029114; Mon, 13 Aug 2012 16:40:57 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <50297468.20902@missouri.edu> Date: Mon, 13 Aug 2012 16:40:56 -0500 From: Stephen Montgomery-Smith User-Agent: Mozilla/5.0 (X11; Linux i686; rv:14.0) Gecko/20120714 Thunderbird/14.0 MIME-Version: 1.0 To: Bruce Evans References: <5017111E.6060003@missouri.edu> <501C361D.4010807@missouri.edu> <20120804165555.X1231@besplex.bde.org> <501D51D7.1020101@missouri.edu> <20120805030609.R3101@besplex.bde.org> <501D9C36.2040207@missouri.edu> <20120805175106.X3574@besplex.bde.org> <501EC015.3000808@missouri.edu> <20120805191954.GA50379@troutmask.apl.washington.edu> <20120807205725.GA10572@server.rulingia.com> <20120809025220.N4114@besplex.bde.org> <5027F07E.9060409@missouri.edu> <20120814003614.H3692@besplex.bde.org> <50295887.2010608@missouri.edu> <20120814055931.Q4897@besplex.bde.org> In-Reply-To: <20120814055931.Q4897@besplex.bde.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: 
freebsd-numerics@freebsd.org Subject: Re: Complex arg-trig functions X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 13 Aug 2012 21:40:58 -0000 On 08/13/2012 04:14 PM, Bruce Evans wrote: > On Mon, 13 Aug 2012, Stephen Montgomery-Smith wrote: >> The float versions really are much harder than the double and >> long-double versions. And it just doesn't seem worth the effort, >> because who uses them when the double versions are available? > > I doubt that they are really harder. Yes, they are. In fact, the paper by Hull et al. (whose algorithms I copied) talked about "bad floating point." IEEE single precision doesn't actually fit their criteria for being bad, but then again I didn't exactly follow their algorithm. > 2. Division tends to be slow, and is slow on x86. On Phenom, division > has a latency of 20 cycles and a throughput of 1 per 15 cycles in > scalar double precision (17 in vector[2] double precision). Maybe > the compiler can optimize division by a power of 2 to a multiplication > though. Does the time taken to perform certain floating point operations depend upon what the numbers are? I know that when I do long multiplication by hand that (1) it is much faster than long division, and (2) certain numbers (like 1000) are very quick to multiply and divide. Similarly with the computer, is it possible that division by 4 is much faster than division by (say) 4.4424242389, and that division by 4 is just as fast as multiplication by 0.25? (And multiplication by 0.25 is faster than multiplication by 0.235341212412?) > So all of those '* 0.5[F]'s can be turned back into '/ 2's (no need to > depend on the optimization by writing 2.0). Well, I have already done it. > This was tracked to a known annoying difference between SSE and i387 > on NaNs. 
x+y clears the sign bit on i387 but not on SSE. SSE is > correct. Was it your code that was wrong, or mine? And if it is mine, what is the fix? > i386 is generally more accurate. *atan* now below 4 ulps. I have tested it for very large numbers of random inputs (several millions). It does creep up to about 3.5 ulp. > > % rccos: max_er = 0xb7f10439f3111686 24688206287.5958, avg_er = > 1415.000, #>=1:0.5 = 3881364:4046644 > i387 hardware trig functions are very inaccurate. But also, what about the problem that arises when the input is close to one of the non-trivial roots of sin, cos, etc.? As a mathematician, I wouldn't be shocked if sin(M_PI) was 1e-15 or suchlike. > Everything seemed to be passing on sparc64, but it is 300 to 1000 times > slower on long doubles, so not many tests completed. Also, as you said earlier, the values of LDBL_MIN and LDBL_EPSILON will be different. > > Bruce From owner-freebsd-numerics@FreeBSD.ORG Mon Aug 13 21:46:08 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id E60D7106564A for ; Mon, 13 Aug 2012 21:46:07 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail06.syd.optusnet.com.au (mail06.syd.optusnet.com.au [211.29.132.187]) by mx1.freebsd.org (Postfix) with ESMTP id 77C9A8FC12 for ; Mon, 13 Aug 2012 21:46:06 +0000 (UTC) Received: from c122-106-171-246.carlnfd1.nsw.optusnet.com.au (c122-106-171-246.carlnfd1.nsw.optusnet.com.au [122.106.171.246]) by mail06.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id q7DLjx9G025175 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Tue, 14 Aug 2012 07:45:59 +1000 Date: Tue, 14 Aug 2012 07:45:59 +1000 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Stephen Montgomery-Smith In-Reply-To: <50295F5C.6010800@missouri.edu> Message-ID: <20120814072946.S5260@besplex.bde.org> References: <5017111E.6060003@missouri.edu> 
<501C361D.4010807@missouri.edu> <20120804165555.X1231@besplex.bde.org> <501D51D7.1020101@missouri.edu> <20120805030609.R3101@besplex.bde.org> <501D9C36.2040207@missouri.edu> <20120805175106.X3574@besplex.bde.org> <501EC015.3000808@missouri.edu> <20120805191954.GA50379@troutmask.apl.washington.edu> <20120807205725.GA10572@server.rulingia.com> <20120809025220.N4114@besplex.bde.org> <5027F07E.9060409@missouri.edu> <20120814003614.H3692@besplex.bde.org> <50295F5C.6010800@missouri.edu> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: freebsd-numerics@freebsd.org, Bruce Evans Subject: Re: Complex arg-trig functions X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 13 Aug 2012 21:46:08 -0000 On Mon, 13 Aug 2012, Stephen Montgomery-Smith wrote: > On 08/13/2012 11:57 AM, Bruce Evans wrote: > >> @ if (sy == 0) >> @ ! return (cpack(cimag(w), -creal(w))); >> @ ! return (cpack(-cimag(w), creal(w))); >> >> The sign of creal(cacos()) is always 1, but this makes it +- the sign >> of atan2(x, y). > > Yes, but the sign of atan2(y,x) will always be the same as the sign of y. So > the two negatives will cancel. y can have any sign I think. But the problem only seemed to happen with denormals and/or NaNs. There might be a problem with NaNs not giving one of the canceling negatives. > But your code works just as well (and your code doesn't clobber the -0's in > the imaginary part). fabs() forces the sign even for NaNs. >> @ --- 408,420 ---- >> @ @ if (ISFINITE(bx) && ISFINITE(by) && (x > >> RECIP_SQRT_EPSILON_100 || y > RECIP_SQRT_EPSILON_100)) { >> @ ! /* XXX following can also raise overflow */ > > I don't see how the code could raise an overflow. The output of clog should > always be very much less than DBL_MAX. 
(Originally I had clog(2*z), and that > could raise an unwarranted overflow.) @ if (ISFINITE(bx) && ISFINITE(by) && (x > RECIP_SQRT_EPSILON_100 || y > RECIP_SQRT_EPSILON_100)) { @ ! /* XXX following can also raise overflow */ @ ! if (huge+x+y>one) { /* raise inexact */ @ ! w = clog_for_large_values(z); @ ! /* Can't add M_LN2 to w since it should clobber -0*I. */ @ ! rx = fabs(cimag(w)); @ ! ry = creal(w) + M_LN2; @ if (sy == 0) @ ! ry = -ry; @ ! return (cpack(rx, ry)); @ } @ } clog() won't overflow spuriously, but huge+x+y might. ((int)x == 0)' is a safer method of raising inexact for certain x. Expressions using something like huge+x-huge caused me problems in Steve's s_expm1l.c recently. They not only didn't set inexact right, but they also didn't work if the expression is evaluated in extra precision. huge*huge also doesn't work in extra exponent range. i386 by default doesn't have extra precision for doubles, but it does have extra exponent range. So when huge = 1e300, 'return huge*huge;' returns the long double 1e600 in a register. Assignment of this to a double would overflow, possibly much later where it is harder to debug. But sometimes the value is just used from the register and never causes overflow. This is a smaller problem than for huge+x-huge, since the latter is an example where the result of the extra exponent range is used immediately to give another result that doesn't need the extra exponent range to represent. 
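The extra-exponent-range point can be illustrated with a minimal sketch. It assumes huge is 1e300 as in the text; the helper names square_as_double() and square_is_finite_in_long_double() are hypothetical and are not taken from cplex.c or from the code under review. The first helper forces the product into a double object, so any extra range is discarded at the store. The second keeps the product in long double, which imitates the value sitting in an i387 register.

```c
#include <float.h>
#include <math.h>

/*
 * Square a value and force the result into a double object.
 * The volatile store discards any extra exponent range that the
 * evaluation format (see FLT_EVAL_METHOD in <float.h>) may provide,
 * so 1e300 * 1e300 becomes +Inf here on every platform.
 */
static double
square_as_double(double big)
{
	volatile double r = big * big;

	return (r);
}

/*
 * Square the same value in long double.  Where long double has a
 * larger exponent range than double (for example x87 extended
 * precision), 1e600 is representable and no overflow is seen.
 */
static int
square_is_finite_in_long_double(double big)
{
	volatile long double wide = (long double)big * (long double)big;

	return (isfinite(wide) != 0);
}
```

With huge = 1e300, square_as_double(huge) is infinite, while square_is_finite_in_long_double(huge) returns 1 on machines where LDBL_MAX_EXP is larger than DBL_MAX_EXP. The difference between the two is the window in which an overflow can be deferred to a later store.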
Bruce From owner-freebsd-numerics@FreeBSD.ORG Mon Aug 13 22:08:17 2012 Return-Path: Delivered-To: freebsd-numerics@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 84D79106566B for ; Mon, 13 Aug 2012 22:08:17 +0000 (UTC) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by mx1.freebsd.org (Postfix) with ESMTP id 2B3D78FC0A for ; Mon, 13 Aug 2012 22:08:17 +0000 (UTC) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q7DM8Fpk031321; Mon, 13 Aug 2012 17:08:16 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <50297AD0.9040206@missouri.edu> Date: Mon, 13 Aug 2012 17:08:16 -0500 From: Stephen Montgomery-Smith User-Agent: Mozilla/5.0 (X11; Linux i686; rv:14.0) Gecko/20120714 Thunderbird/14.0 MIME-Version: 1.0 To: Bruce Evans References: <5017111E.6060003@missouri.edu> <501C361D.4010807@missouri.edu> <20120804165555.X1231@besplex.bde.org> <501D51D7.1020101@missouri.edu> <20120805030609.R3101@besplex.bde.org> <501D9C36.2040207@missouri.edu> <20120805175106.X3574@besplex.bde.org> <501EC015.3000808@missouri.edu> <20120805191954.GA50379@troutmask.apl.washington.edu> <20120807205725.GA10572@server.rulingia.com> <20120809025220.N4114@besplex.bde.org> <5027F07E.9060409@missouri.edu> <20120814003614.H3692@besplex.bde.org> In-Reply-To: <20120814003614.H3692@besplex.bde.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-numerics@FreeBSD.org Subject: Re: Complex arg-trig functions X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 13 Aug 2012 22:08:17 -0000 The errors for your clog functions are very impressive! From owner-freebsd-numerics@FreeBSD.ORG Mon Aug 13 22:16:06 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 626FD106564A for ; Mon, 13 Aug 2012 22:16:06 +0000 (UTC) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by mx1.freebsd.org (Postfix) with ESMTP id 225BF8FC08 for ; Mon, 13 Aug 2012 22:16:05 +0000 (UTC) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q7DMG58s031847 for ; Mon, 13 Aug 2012 17:16:05 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <50297CA5.5010900@missouri.edu> Date: Mon, 13 Aug 2012 17:16:05 -0500 From: Stephen Montgomery-Smith User-Agent: Mozilla/5.0 (X11; Linux i686; rv:14.0) Gecko/20120714 Thunderbird/14.0 MIME-Version: 1.0 To: freebsd-numerics@freebsd.org References: <5017111E.6060003@missouri.edu> <501C361D.4010807@missouri.edu> <20120804165555.X1231@besplex.bde.org> <501D51D7.1020101@missouri.edu> <20120805030609.R3101@besplex.bde.org> <501D9C36.2040207@missouri.edu> <20120805175106.X3574@besplex.bde.org> <501EC015.3000808@missouri.edu> <20120805191954.GA50379@troutmask.apl.washington.edu> <20120807205725.GA10572@server.rulingia.com> <20120809025220.N4114@besplex.bde.org> <5027F07E.9060409@missouri.edu> <20120814003614.H3692@besplex.bde.org> <50295F5C.6010800@missouri.edu> <20120814072946.S5260@besplex.bde.org> In-Reply-To: <20120814072946.S5260@besplex.bde.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: Complex arg-trig functions X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 
2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 13 Aug 2012 22:16:06 -0000 On 08/13/2012 04:45 PM, Bruce Evans wrote: > y can have any sign I think. But the problem only seemed to happen with > denormals and/or NaNs. There might be a problem with NaNs not giving one > of the canceling negatives. OK. >>> @ --- 408,420 ---- >>> @ @ if (ISFINITE(bx) && ISFINITE(by) && (x > >>> RECIP_SQRT_EPSILON_100 || y > RECIP_SQRT_EPSILON_100)) { >>> @ ! /* XXX following can also raise overflow */ >> >> I don't see how the code could raise an overflow. The output of clog >> should always be very much less than DBL_MAX. (Originally I had >> clog(2*z), and that could raise an unwarranted overflow.) > > @ if (ISFINITE(bx) && ISFINITE(by) && (x > RECIP_SQRT_EPSILON_100 > || y > RECIP_SQRT_EPSILON_100)) { > @ ! /* XXX following can also raise overflow */ > @ ! if (huge+x+y>one) { /* raise inexact */ > @ ! w = clog_for_large_values(z); > @ ! /* Can't add M_LN2 to w since it should clobber -0*I. */ > @ ! rx = fabs(cimag(w)); > @ ! ry = creal(w) + M_LN2; > @ if (sy == 0) > @ ! ry = -ry; > @ ! return (cpack(rx, ry)); > @ } > @ } > > clog() won't overflow spuriously, but huge+x+y might. Yes, I didn't think of that! > ((int)x == 0)' is a safer method of raising inexact for certain x. But this only works if x is less than 1. OK, how about this: sqrt_huge = 1e150; if (sqrt_huge+x>one || sqrt_huge+y>one) ... 
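The spurious overflow discussed above can be checked with a small sketch. It assumes huge is 1e300 (the thread does not state its value) and uses a hypothetical helper, sum_overflows(), that is not part of the code under review. It reports whether a three-term sum of finite values rounds to infinity, which in the default rounding mode is when the overflow exception is raised.

```c
#include <float.h>
#include <math.h>

/*
 * Return nonzero if big + x + y overflows to infinity.
 * The volatile objects keep the compiler from folding the sum at
 * compile time and force the result to be stored as a double.
 */
static int
sum_overflows(double big, double x, double y)
{
	volatile double vb = big, vx = x, vy = y;
	volatile double sum;

	sum = vb + vx + vy;
	return (isinf(sum) != 0);
}
```

Under these assumptions, sum_overflows(1e300, DBL_MAX, 0.0) is nonzero, because 1e300 is far larger than half an ulp of DBL_MAX, and sum_overflows(1e300, 0.75 * DBL_MAX, 0.75 * DBL_MAX) is nonzero although each argument is finite. With 1e150 in place of 1e300 and one operand at a time, as in sum_overflows(1e150, DBL_MAX, 0.0), the sum rounds back to DBL_MAX, so only inexact is raised.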
From owner-freebsd-numerics@FreeBSD.ORG Mon Aug 13 22:23:00 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 3637D106566B for ; Mon, 13 Aug 2012 22:23:00 +0000 (UTC) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by mx1.freebsd.org (Postfix) with ESMTP id EBC988FC0A for ; Mon, 13 Aug 2012 22:22:59 +0000 (UTC) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q7DMMwP5032310 for ; Mon, 13 Aug 2012 17:22:59 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <50297E43.7090309@missouri.edu> Date: Mon, 13 Aug 2012 17:22:59 -0500 From: Stephen Montgomery-Smith User-Agent: Mozilla/5.0 (X11; Linux i686; rv:14.0) Gecko/20120714 Thunderbird/14.0 MIME-Version: 1.0 To: freebsd-numerics@freebsd.org References: <5017111E.6060003@missouri.edu> <501C361D.4010807@missouri.edu> <20120804165555.X1231@besplex.bde.org> <501D51D7.1020101@missouri.edu> <20120805030609.R3101@besplex.bde.org> <501D9C36.2040207@missouri.edu> <20120805175106.X3574@besplex.bde.org> <501EC015.3000808@missouri.edu> <20120805191954.GA50379@troutmask.apl.washington.edu> <20120807205725.GA10572@server.rulingia.com> <20120809025220.N4114@besplex.bde.org> <5027F07E.9060409@missouri.edu> <20120814003614.H3692@besplex.bde.org> <50295F5C.6010800@missouri.edu> <20120814072946.S5260@besplex.bde.org> <50297CA5.5010900@missouri.edu> In-Reply-To: <50297CA5.5010900@missouri.edu> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: Complex arg-trig functions X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 13 Aug 2012 22:23:00 -0000 On 08/13/2012 05:16 PM, Stephen Montgomery-Smith wrote: > On 08/13/2012 04:45 PM, Bruce Evans wrote: > >> y can have any sign I think. But the problem only seemed to happen with >> denormals and/or NaNs. There might be a problem with NaNs not giving one >> of the canceling negatives. > > OK. > >>>> @ --- 408,420 ---- >>>> @ @ if (ISFINITE(bx) && ISFINITE(by) && (x > >>>> RECIP_SQRT_EPSILON_100 || y > RECIP_SQRT_EPSILON_100)) { >>>> @ ! /* XXX following can also raise overflow */ >>> >>> I don't see how the code could raise an overflow. The output of clog >>> should always be very much less than DBL_MAX. (Originally I had >>> clog(2*z), and that could raise an unwarranted overflow.) >> >> @ if (ISFINITE(bx) && ISFINITE(by) && (x > RECIP_SQRT_EPSILON_100 >> || y > RECIP_SQRT_EPSILON_100)) { >> @ ! /* XXX following can also raise overflow */ >> @ ! if (huge+x+y>one) { /* raise inexact */ >> @ ! w = clog_for_large_values(z); >> @ ! /* Can't add M_LN2 to w since it should clobber -0*I. */ >> @ ! rx = fabs(cimag(w)); >> @ ! ry = creal(w) + M_LN2; >> @ if (sy == 0) >> @ ! ry = -ry; >> @ ! return (cpack(rx, ry)); >> @ } >> @ } >> >> clog() won't overflow spuriously, but huge+x+y might. > > Yes, I didn't think of that! > >> ((int)x == 0)' is a safer method of raising inexact for certain x. > > But this only works if x is less than 1. > > OK, how about this: > > sqrt_huge = 1e150; > if (sqrt_huge+x>one || sqrt_huge+y>one) ... 
Oops if (sqrt_huge+x>one && sqrt_huge+y>one) From owner-freebsd-numerics@FreeBSD.ORG Tue Aug 14 10:09:52 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 05EBF106566B for ; Tue, 14 Aug 2012 10:09:52 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail13.syd.optusnet.com.au (mail13.syd.optusnet.com.au [211.29.132.194]) by mx1.freebsd.org (Postfix) with ESMTP id E59A78FC16 for ; Tue, 14 Aug 2012 10:09:49 +0000 (UTC) Received: from c122-106-171-246.carlnfd1.nsw.optusnet.com.au (c122-106-171-246.carlnfd1.nsw.optusnet.com.au [122.106.171.246]) by mail13.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id q7EA9dNB024420 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Tue, 14 Aug 2012 20:09:40 +1000 Date: Tue, 14 Aug 2012 20:09:39 +1000 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Stephen Montgomery-Smith In-Reply-To: <50297468.20902@missouri.edu> Message-ID: <20120814173931.V934@besplex.bde.org> References: <5017111E.6060003@missouri.edu> <501C361D.4010807@missouri.edu> <20120804165555.X1231@besplex.bde.org> <501D51D7.1020101@missouri.edu> <20120805030609.R3101@besplex.bde.org> <501D9C36.2040207@missouri.edu> <20120805175106.X3574@besplex.bde.org> <501EC015.3000808@missouri.edu> <20120805191954.GA50379@troutmask.apl.washington.edu> <20120807205725.GA10572@server.rulingia.com> <20120809025220.N4114@besplex.bde.org> <5027F07E.9060409@missouri.edu> <20120814003614.H3692@besplex.bde.org> <50295887.2010608@missouri.edu> <20120814055931.Q4897@besplex.bde.org> <50297468.20902@missouri.edu> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: freebsd-numerics@freebsd.org Subject: Re: Complex arg-trig functions X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 14 Aug 2012 10:09:52 -0000 On Mon, 13 Aug 2012, Stephen Montgomery-Smith wrote: > On 08/13/2012 04:14 PM, Bruce Evans wrote: > ... >> 2. Division tends to be slow, and is slow on x86. On Phenom, division >> has a latency of 20 cycles and a throughput of 1 per 15 cycles in >> scalar double precision (17 in vector[2] double precision). Maybe >> the compiler can optimize division by a power of 2 to a multiplication >> though. > > Does the time taken to perform certain floating point operations depend upon > what the numbers are? I know that when I do long multiplication by hand that > (1) it is much faster than long division, and (2) certain numbers (like 1000) > are very quick to multiply and divide. On x86, mostly not. Denormals can be slow in general, but I've never seen them causing significant slowness on x86. I've only seen underflow (and overflow?) causing significant slowness. The easy underflowing cases for exp*() run about 4 times slower than the normal case although they only execute 1 underflowing instruction instead of 50-100 normal instructions after the classification. This slowness is very machine-dependent (MD): - on identical Intel core2 hardware (ref*.freebsd.org) - i386 underflow is normally slow in all precisions. But when SSE[2] is used, it has no penalty. SSE[2] is the default on core2 for clang in float and double precision, but for gcc it takes unusual CFLAGS to get it. CFLAGS without -march= don't give it even for clang IIRC, since the default is more like -mpentiumpro. - amd64 underflow normally or always has no penalty in float and double precisions, since use of SSE[2] is forced, except possibly with even more unusual CFLAGS. - long doubles always use the i387 and always have a large penalty for underflow. - on AthlonXP and original Athlon64 hardware: - there is no penalty for underflow for any combination of amd64/i386/i387/SSE[2]/float/double/long double.
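A note on the power-of-2 case raised in the quoted text: x/4.0 and x*0.25 denote the same real number, and 0.25 is exactly representable, so under IEEE 754 both forms round to the same double. That is what would make such a compiler transformation safe. The sketch below checks this on a few awkward inputs. The function names are made up for illustration and are not from msun, and the sketch says nothing about speed.

```c
#include <float.h>

/* Divide by a power of two. */
static double div_by_4(double x) { return x / 4.0; }

/* Multiply by the exactly representable reciprocal. */
static double mul_by_quarter(double x) { return x * 0.25; }

/* Return 1 if both forms agree on every sample input, else 0. */
static int forms_agree(void)
{
	static const double samples[] = {
		0.0, 1.0, -3.0, 0.1, DBL_MAX, DBL_MIN,
		4.9406564584124654e-324	/* smallest positive denormal */
	};
	unsigned i;

	for (i = 0; i < sizeof(samples) / sizeof(samples[0]); i++)
		if (div_by_4(samples[i]) != mul_by_quarter(samples[i]))
			return 0;
	return 1;
}
```

The same argument does not carry over to a divisor such as 3, because 1.0/3.0 is itself rounded before the multiplication.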
> Similarly with the computer, is it possible that division by 4 is much faster > than division by (say) 4.4424242389, and that division by 4 is just as fast > as multiplication by 0.25? (And multiplication by 0.25 is faster than > multiplication by 0.235341212412?) Pipelining and large hardware probably inhibit this optimization on modern CPUs. Some x86 CPUs have faster integer multiplication (?) for smaller numbers of bits and/or the position of the bits. I've never seen this for FP on x86. Except, there are special FP instructions for smaller numbers of bits in SSE: reciprocal with 12 (?) bits and reciprocal square root with 12 (?) bits. These have about the same latency as addition and multiplication (4 cycles). Probably lower throughput (not 1 per cycle). This many bits can be done by table lookup. Perhaps the full operations start with this fully in hardware, then use Newton's method in microcode, then magic to round. IIRC, ia64 doesn't even have full division in hardware, but it has reciprocals. >> This was tracked to a known annoying difference between SSE and i387 >> on NaNs. x+y clears the sign bit on i387 but not on SSE. SSE is >> correct. > > Was it your code that was wrong, or mine? And if it is mine, what is the > fix? Neither. It is just that different hardware gives different results, and the differences unfortunately show up on the same machine when SSE and i387 are mixed. Even larger differences also show up on sparc64, since long doubles are normally not in hardware; they are normally emulated by library calls and the library is not careful about NaNs; they can also be emulated by trapping of hardware long double instructions and then running different library functions from the trap handler, and these functions (or maybe the hardware part) are more careful about NaNs. >> % rccos: max_er = 0xb7f10439f3111686 24688206287.5958, avg_er = >> 1415.000, #>=1:0.5 = 3881364:4046644 > >> i387 hardware trig functions are very inaccurate.
> > But also, what about the problem of when the input is close to one of the > non-trivial roots of sin, cos, etc? As a mathematician, I wouldn't be > shocked if sin(M_PI) was 1e-15 or such like. M_PI is a rational approximation to pi, so it would be a serious error if sin() on it were zero. sin() on it should be the infinite-precision sin() on it, correctly rounded to MANT_DIG bits except possibly for an error of less than 1 ulp. But the i387 only has a 66 (68?) bit internal approximation to pi. Subtracting this gives a large cancelation error: - float precision: 66-24 = 42. _At least_ 24 bits cancel. Some of the remaining 42 bits may cancel, but we would have to be very unlucky for pi to be so close to a representable rational that many of them cancel. In practice, most don't, and we are left with about 42 correct and rounding these to 24 bits is perfect provided we are also not unlucky with the trailing bits being near a half-way case. - double precision: 66-53 = 13. 13 bits possibly correct and 40 probably wrong. - long double precision: 66-64 = 2. 2 bits possibly correct and 62 probably wrong. sin(pi) is an easy case. More interesting is sin((double)(DBL_MAX/pi) * pi), where the multiplication is done in infinite precision and rounded to the closest representable rational. Multiplication by DBL_MAX/pi expands the number of certainly-canceling bits by a huge factor. The required number is approximately DBL_MAX_EXP = 1024. Then we have to know how many more bits may be lost to cancelation due to a multiple of pi being very close to a representable rational. Mathematicians showed using the continued fraction expansion of pi that the closest is about 2**-75 for precisions of interest (I forget if this is for double or ld128 long double precision). So about 1024+75 bits are needed in the approximation to pi. fdlibm msun/e_rem_pio2.c provided the necessary number (actually 1584) in 1994.
Most of the extra 500 seem to be unnecessary (perhaps the 2**-75 number was not well known in 1994). das had to expand this to support ld128. The table is now in msun/k_rem_pio2.c and provides 16560 bits. Unfortunately, the comment before it still says 1584 (actually "396 Hex digits"); this is correct but confusing since the table is extended under an ifdef. Approximately LDBL_MAX_EXP + 75+ = 16384 + 75 bits are required for ld128, so there is now less to spare. Later the comment gives the precise number of bits needed for the reduction as (e0-3) + jk*24, where e0 <= 16360 and jk = 6; this is <= 16501, so there are 59 bits to spare. 59 isn't enough, so I don't see why we aren't depending on numerical accidents. I already avoid sign errors for NaNs in many places including hypot*() and atan2*(). This is probably needed for clog() to pass my tests: % Index: e_atan2.c % =================================================================== % RCS file: /home/ncvs/src/lib/msun/src/e_atan2.c,v % retrieving revision 1.14 % diff -u -1 -r1.14 e_atan2.c % --- e_atan2.c 2 Aug 2008 19:17:00 -0000 1.14 % +++ e_atan2.c 12 Jul 2012 18:31:28 -0000 % @@ -72,3 +72,3 @@ % ((iy|((ly|-ly)>>31))>0x7ff00000)) /* x or y is NaN */ % - return x+y; % + return (x+0.0L)/(y+0); /* quieten sNaNs before mixing */ % if((hx-0x3ff00000|lx)==0) return atan(y); /* x=1.0 */ Adding 0 quietens signaling NaNs before mixing. Otherwise, the result can depend on the quiet bit. (The hardware rule for mixing NaNs is sometimes to compare them in bits and select the largest one. The sign and quiet bits should not be primary in this comparison, but they are top bits and are primary on some hardware. SSE differs from i387 here IIRC). This was already done in some places. The additional hack of adding 0.0L instead of 0 makes the calculation done in the same (i387) hardware on x86 for all precisions, so that the result doesn't depend on the precision.
On i386 with ![clang && -msse2], this makes no difference to the object code, since the i387 is always used anyway, but it takes extra code on amd64, as does adding 0. This is only in an exceptional path so it shouldn't cost any speed. Also make the indentation less unusual. % Index: e_atan2f.c % =================================================================== % RCS file: /home/ncvs/src/lib/msun/src/e_atan2f.c,v % retrieving revision 1.12 % diff -u -1 -r1.12 e_atan2f.c % --- e_atan2f.c 3 Aug 2008 17:39:54 -0000 1.12 % +++ e_atan2f.c 12 Jul 2012 18:31:50 -0000 % @@ -43,3 +43,3 @@ % (iy>0x7f800000)) /* x or y is NaN */ % - return x+y; % + return (x+0.0L)/(y+0); /* quieten sNaNs before mixing */ % if(hx==0x3f800000) return atanf(y); /* x=1.0 */ Use the same +0.0L and comment and formatting in all precisions. % Index: e_atan2l.c % =================================================================== % RCS file: /home/ncvs/src/lib/msun/src/e_atan2l.c,v % retrieving revision 1.3 % diff -u -1 -r1.3 e_atan2l.c % --- e_atan2l.c 2 Aug 2008 19:17:00 -0000 1.3 % +++ e_atan2l.c 12 Jul 2012 18:32:55 -0000 % @@ -64,3 +64,3 @@ % ((uy.bits.manh&~LDBL_NBIT)|uy.bits.manl)!=0)) /* y is NaN */ % - return x+y; % + return (x+0.0L)/(y+0); /* quieten sNaNs before mixing */ % if (expsignx==BIAS && ((ux.bits.manh&~LDBL_NBIT)|ux.bits.manl)==0) Formatting was already improved. % Index: e_hypot.c % =================================================================== % RCS file: /home/ncvs/src/lib/msun/src/e_hypot.c,v % retrieving revision 1.14 % diff -u -1 -r1.14 e_hypot.c % --- e_hypot.c 15 Oct 2011 07:00:28 -0000 1.14 % +++ e_hypot.c 3 Jan 2012 18:00:26 -0000 % @@ -72,3 +72,3 @@ % /* Use original arg order iff result is NaN; quieten sNaNs. */ % - w = fabs(x+0.0)-fabs(y+0.0); % + w = fabsl(x+0.0L)-fabs(y+0); % GET_LOW_WORD(low,a); hypot* already added 0.0. Also change the spelling of 0.0 to 0 wherever possible, so that 0 is added in the same precision wherever the i387 hack is not wanted. 
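The reason the spelling of the zero matters is C's usual arithmetic conversions: an integer 0 is converted to the type of the other operand, while 0.0, 0.0F and 0.0L each carry their own type. A minimal C11 sketch follows; the enum and function names are invented for this example, and _Generic requires a C11 compiler.

```c
enum fp_type { T_FLOAT, T_DOUBLE, T_LONG_DOUBLE, T_OTHER };

/* C11: classify the (unevaluated) type of an expression. */
#define FP_TYPE(e) _Generic((e),	\
	float: T_FLOAT,			\
	double: T_DOUBLE,		\
	long double: T_LONG_DOUBLE,	\
	default: T_OTHER)

/* Integer 0 adopts the type of the floating operand. */
static enum fp_type float_plus_int_zero(float x) { return FP_TYPE(x + 0); }
static enum fp_type double_plus_int_zero(double x) { return FP_TYPE(x + 0); }

/* A floating constant can widen the whole expression. */
static enum fp_type float_plus_0_0F(float x) { return FP_TYPE(x + 0.0F); }
static enum fp_type float_plus_0_0(float x) { return FP_TYPE(x + 0.0); }
static enum fp_type float_plus_0_0L(float x) { return FP_TYPE(x + 0.0L); }
static enum fp_type double_plus_0_0L(double x) { return FP_TYPE(x + 0.0L); }
```

So `y+0` stays in y's own precision in every variant of a function, while `x+0.0L` is always a long double operation, which on x86 means the i387.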
% Index: e_hypotf.c % =================================================================== % RCS file: /home/ncvs/src/lib/msun/src/e_hypotf.c,v % retrieving revision 1.14 % diff -u -1 -r1.14 e_hypotf.c % --- e_hypotf.c 15 Oct 2011 07:00:28 -0000 1.14 % +++ e_hypotf.c 3 Jan 2012 18:00:42 -0000 % @@ -39,3 +39,3 @@ % /* Use original arg order iff result is NaN; quieten sNaNs. */ % - w = fabsf(x+0.0F)-fabsf(y+0.0F); % + w = fabsl(x+0.0L)-fabsf(y+0); % if(ha == 0x7f800000) w = a; % Index: e_hypotl.c % =================================================================== % RCS file: /home/ncvs/src/lib/msun/src/e_hypotl.c,v % retrieving revision 1.3 % diff -u -1 -r1.3 e_hypotl.c % --- e_hypotl.c 16 Oct 2011 05:36:39 -0000 1.3 % +++ e_hypotl.c 4 Jan 2012 05:26:52 -0000 % @@ -66,3 +69,3 @@ % /* Use original arg order iff result is NaN; quieten sNaNs. */ % - w = fabsl(x+0.0)-fabsl(y+0.0); % + w = fabsl(x+0.0L)-fabsl(y+0); % GET_LDBL_MAN(manh,manl,a); % Index: e_pow.c % =================================================================== % RCS file: /home/ncvs/src/lib/msun/src/e_pow.c,v % retrieving revision 1.14 % diff -u -1 -r1.14 e_pow.c % --- e_pow.c 21 Oct 2011 06:26:07 -0000 1.14 % +++ e_pow.c 3 Jan 2012 18:02:10 -0000 % @@ -117,3 +117,3 @@ % iy > 0x7ff00000 || ((iy==0x7ff00000)&&(ly!=0))) % - return (x+0.0)+(y+0.0); % + return (x+0.0L)+(y+0); % Propagate the +0.0L hack... 
% Index: e_powf.c % =================================================================== % RCS file: /home/ncvs/src/lib/msun/src/e_powf.c,v % retrieving revision 1.16 % diff -u -1 -r1.16 e_powf.c % --- e_powf.c 21 Oct 2011 06:26:07 -0000 1.16 % +++ e_powf.c 3 Jan 2012 18:02:21 -0000 % @@ -75,3 +75,3 @@ % iy > 0x7f800000) % - return (x+0.0F)+(y+0.0F); % + return (x+0.0L)+(y+0); % % Index: e_remainder.c % =================================================================== % RCS file: /home/ncvs/src/lib/msun/src/e_remainder.c,v % retrieving revision 1.12 % diff -u -1 -r1.12 e_remainder.c % --- e_remainder.c 30 Mar 2008 20:47:42 -0000 1.12 % +++ e_remainder.c 3 Jan 2012 17:43:21 -0000 % @@ -47,7 +47,7 @@ % /* purge off exception values */ % - if((hp|lp)==0) return (x*p)/(x*p); /* p = 0 */ % - if((hx>=0x7ff00000)|| /* x not finite */ % + if(((hp|lp)==0)|| /* p = 0 */ % + (hx>=0x7ff00000)|| /* x not finite */ % ((hp>=0x7ff00000)&& /* p is NaN */ % (((hp-0x7ff00000)|lp)!=0))) % - return ((long double)x*p)/((long double)x*p); % + return ((x+0.0L)*(p+0))/((x+0.0L)*(p+0)); % This already used a different way of extending to long double. Not so good, since it is MD whether extensions quieten NaNs before mixing. Also, rearrange code so as not to have a special case for p = 0. This might be an optimization for p != 0, but I only wanted the NaN handling to be uniform. 
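As a sanity check on the merged exceptional path, the single expression from the patch does produce NaN for each of the three cases it now covers (p = 0, x not finite, p is NaN). The sketch below wraps that expression in a helper whose name is made up for this example; it is not the remainder() code itself.

```c
#include <math.h>

/*
 * The return expression from the patched exceptional path.  It is only
 * meaningful when p == 0, x is not finite, or either argument is a NaN.
 */
static double remainder_exceptional(double x, double p)
{
	return ((x + 0.0L) * (p + 0)) / ((x + 0.0L) * (p + 0));
}
```

For p = 0 this is 0/0, for infinite x it is Inf/Inf, and a NaN argument propagates through both products. For ordinary finite nonzero arguments it would just return 1, so it must stay behind the classification test.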
% Index: e_remainderf.c % =================================================================== % RCS file: /home/ncvs/src/lib/msun/src/e_remainderf.c,v % retrieving revision 1.8 % diff -u -1 -r1.8 e_remainderf.c % --- e_remainderf.c 12 Feb 2008 17:11:36 -0000 1.8 % +++ e_remainderf.c 3 Jan 2012 17:44:06 -0000 % @@ -38,6 +38,6 @@ % /* purge off exception values */ % - if(hp==0) return (x*p)/(x*p); /* p = 0 */ % - if((hx>=0x7f800000)|| /* x not finite */ % + if((hp==0)|| /* p = 0 */ % + (hx>=0x7f800000)|| /* x not finite */ % ((hp>0x7f800000))) /* p is NaN */ % - return ((long double)x*p)/((long double)x*p); % + return ((x+0.0L)*(p+0))/((x+0.0L)*(p+0)); % remainderl() uses remquol(), so e_remainderl.c isn't changed in these patches. % Index: s_csinh.c % =================================================================== % RCS file: /home/ncvs/src/lib/msun/src/s_csinh.c,v % retrieving revision 1.2 % diff -u -1 -r1.2 s_csinh.c % --- s_csinh.c 21 Oct 2011 06:29:32 -0000 1.2 % +++ s_csinh.c 3 Jan 2012 06:20:18 -0000 % @@ -27,2 +27,9 @@ % /* % + * XXX TODO: % + * Change x +-I y to x + I (+-y) or vice versa? We currently use the % + * former former for args and the latter for results. % + * s/the invalid floating-point exception/FE_INVALID/g % + */ % + % +/* % * Hyperbolic sine of a complex argument z = x + i y. % @@ -65,3 +72,3 @@ % return (cpack(sinh(x), y)); % - if (ix < 0x40360000) /* small x: normal case */ % + if (ix < 0x40360000) /* |x| < 22: normal case */ % return (cpack(sinh(x) * cos(y), cosh(x) * sin(y))); % @@ -85,12 +92,12 @@ % /* % - * sinh(+-0 +- I Inf) = sign(d(+-0, dNaN))0 + I dNaN. % - * The sign of 0 in the result is unspecified. Choice = normally % - * the same as dNaN. Raise the invalid floating-point exception. % - * % - * sinh(+-0 +- I NaN) = sign(d(+-0, NaN))0 + I d(NaN). % - * The sign of 0 in the result is unspecified. Choice = normally % - * the same as d(NaN). % + * sinh(+-0 +- I Inf) = +-0 + I dNaN. 
% + * The sign of 0 in the result is unspecified. Choice = same sign % + * as the argument. Raise the invalid floating-point exception. % + * % + * sinh(+-0 +- I NaN) = +-0 + I d(NaN). % + * The sign of 0 in the result is unspecified. Choice = same sign % + * as the argument. % */ % - if ((ix | lx) == 0 && iy >= 0x7ff00000) % - return (cpack(copysign(0, x * (y - y)), y - y)); % + if ((ix | lx) == 0) /* && iy >= 0x7ff00000 */ % + return (cpack(x, y - y)); % % @@ -101,7 +108,4 @@ % */ % - if ((iy | ly) == 0 && ix >= 0x7ff00000) { % - if (((hx & 0xfffff) | lx) == 0) % - return (cpack(x, y)); % - return (cpack(x, copysign(0, y))); % - } % + if ((iy | ly) == 0) /* && ix >= 0x7ff00000 */ % + return (cpack(x + x, y)); % % @@ -115,4 +119,4 @@ % */ % - if (ix < 0x7ff00000 && iy >= 0x7ff00000) % - return (cpack(y - y, x * (y - y))); % + if (ix < 0x7ff00000) /* && iy >= 0x7ff00000 */ % + return (cpack(y - y, y - y)); % % @@ -120,8 +124,8 @@ % * sinh(+-Inf + I NaN) = +-Inf + I d(NaN). % - * The sign of Inf in the result is unspecified. Choice = normally % - * the same as d(NaN). % + * The sign of Inf in the result is unspecified. Choice = same sign % + * as the argument. % * % * sinh(+-Inf +- I Inf) = +Inf + I dNaN. % - * The sign of Inf in the result is unspecified. Choice = always +. % - * Raise the invalid floating-point exception. % + * The sign of Inf in the result is unspecified. Choice = same sign % + * as the argument. Raise the invalid floating-point exception. % * % @@ -129,5 +133,5 @@ % */ % - if (ix >= 0x7ff00000 && ((hx & 0xfffff) | lx) == 0) { % + if (ix == 0x7ff00000 && lx == 0) { % if (iy >= 0x7ff00000) % - return (cpack(x * x, x * (y - y))); % + return (cpack(x, y - y)); % return (cpack(x * cos(y), INFINITY * sin(y))); % @@ -146,3 +150,3 @@ % */ % - return (cpack((x * x) * (y - y), (x + x) * (y - y))); % + return (cpack((x + x) * (y - y), (x * x) * (y - y))); % } % @@ -153,3 +157,3 @@ % % - /* csin(z) = -I * csinh(I * z) */ % + /* csin(z) = -I * csinh(I * z). 
*/ % z = csinh(cpack(-cimag(z), creal(z))); Beginning of cleanups and NaN fixes for committed complex functions. % Index: s_csinhf.c % =================================================================== % RCS file: /home/ncvs/src/lib/msun/src/s_csinhf.c,v % retrieving revision 1.2 % diff -u -1 -r1.2 s_csinhf.c % --- s_csinhf.c 21 Oct 2011 06:29:32 -0000 1.2 % +++ s_csinhf.c 3 Jan 2012 06:20:30 -0000 % @@ -27,3 +27,3 @@ % /* % - * Hyperbolic sine of a complex argument z. See s_csinh.c for details. % + * Float version of csinh(). See s_csinh.c for details. % */ % @@ -58,3 +58,3 @@ % return (cpackf(sinhf(x), y)); % - if (ix < 0x41100000) /* small x: normal case */ % + if (ix < 0x41100000) /* |x| < 9: normal case */ % return (cpackf(sinhf(x) * cosf(y), coshf(x) * sinf(y))); % @@ -64,3 +64,3 @@ % /* x < 88.7: expf(|x|) won't overflow */ % - h = expf(fabsf(x)) * 0.5f; % + h = expf(fabsf(x)) * 0.5F; % return (cpackf(copysignf(h, x) * cosf(y), h * sinf(y))); % @@ -77,17 +77,14 @@ % % - if (ix == 0 && iy >= 0x7f800000) % - return (cpackf(copysignf(0, x * (y - y)), y - y)); % + if (ix == 0) /* && iy >= 0x7f800000 */ % + return (cpackf(x, y - y)); % % - if (iy == 0 && ix >= 0x7f800000) { % - if ((hx & 0x7fffff) == 0) % - return (cpackf(x, y)); % - return (cpackf(x, copysignf(0, y))); % - } % + if (iy == 0) /* && ix >= 0x7f800000 */ % + return (cpackf(x + x , y)); % % - if (ix < 0x7f800000 && iy >= 0x7f800000) % - return (cpackf(y - y, x * (y - y))); % + if (ix < 0x7f800000) /* && iy >= 0x7f800000 */ % + return (cpackf(y - y, y - y)); % % - if (ix >= 0x7f800000 && (hx & 0x7fffff) == 0) { % + if (ix == 0x7f800000) { % if (iy >= 0x7f800000) % - return (cpackf(x * x, x * (y - y))); % + return (cpackf(x, y - y)); % return (cpackf(x * cosf(y), INFINITY * sinf(y))); % @@ -95,3 +92,3 @@ % % - return (cpackf((x * x) * (y - y), (x + x) * (y - y))); % + return (cpackf((x + x) * (y - y), (x * x) * (y - y))); % } % Index: s_csqrt.c % 
=================================================================== % RCS file: /home/ncvs/src/lib/msun/src/s_csqrt.c,v % retrieving revision 1.4 % diff -u -1 -r1.4 s_csqrt.c % --- s_csqrt.c 8 Aug 2008 00:15:16 -0000 1.4 % +++ s_csqrt.c 25 Oct 2011 14:51:10 -0000 % @@ -65,3 +65,3 @@ % t = (b - b) / (b - b); /* raise invalid if b is not a NaN */ % - return (cpack(a, t)); /* return NaN + NaN i */ % + return (cpack(a + a, t)); /* return NaN + NaN i */ % } csqrt*() didn't even quieten NaNs. Many more of the committed complex functions would have failed my tests without fixes like this. The inverse hyperbolic ones somehow have fewer problems with NaNs, perhaps by using atan2() more. % Index: s_csqrtf.c % =================================================================== % RCS file: /home/ncvs/src/lib/msun/src/s_csqrtf.c,v % retrieving revision 1.3 % diff -u -1 -r1.3 s_csqrtf.c % --- s_csqrtf.c 8 Aug 2008 00:15:16 -0000 1.3 % +++ s_csqrtf.c 25 Oct 2011 14:51:21 -0000 % @@ -56,3 +56,3 @@ % t = (b - b) / (b - b); /* raise invalid if b is not a NaN */ % - return (cpackf(a, t)); /* return NaN + NaN i */ % + return (cpackf(a + a, t)); /* return NaN + NaN i */ % } % Index: s_csqrtl.c % =================================================================== % RCS file: /home/ncvs/src/lib/msun/src/s_csqrtl.c,v % retrieving revision 1.2 % diff -u -1 -r1.2 s_csqrtl.c % --- s_csqrtl.c 8 Aug 2008 00:15:16 -0000 1.2 % +++ s_csqrtl.c 25 Oct 2011 14:51:29 -0000 % @@ -65,3 +65,3 @@ % t = (b - b) / (b - b); /* raise invalid if b is not a NaN */ % - return (cpackl(a, t)); /* return NaN + NaN i */ % + return (cpackl(a + a, t)); /* return NaN + NaN i */ % } % Index: s_ctanh.c % =================================================================== % RCS file: /home/ncvs/src/lib/msun/src/s_ctanh.c,v % retrieving revision 1.2 % diff -u -1 -r1.2 s_ctanh.c % --- s_ctanh.c 21 Oct 2011 06:30:16 -0000 1.2 % +++ s_ctanh.c 25 Oct 2011 14:30:18 -0000 % @@ -87,2 +87,13 @@ % /* % + * XXX this is missing the
dNaN/d(NaN) notation, which tells us the % + * following: % + * dNaN is a default NaN unrelated to any NaN args % + * d(NaN) is a unary conversion (usually quieting) of the arg `NaN' % + * % + * XXX everything is missing: % + * d(NaN1, NaN2) and d(NaN, y) % + * which should be used for binary conversions. % + * % + * XXX this misspells I as i. % + * % * ctanh(NaN + i 0) = NaN + i 0 % @@ -104,3 +115,3 @@ % if ((ix & 0xfffff) | lx) /* x is NaN */ % - return (cpack(x, (y == 0 ? y : x * y))); % + return (cpack(x + x, y == 0 ? y : x * y)); % SET_HIGH_WORD(x, hx - 0x40000000); /* x = copysign(1, x) */ ctanh*() needs more work than the others. It also has the largest errors, some above 6 ulps. % Index: s_ctanhf.c % =================================================================== % RCS file: /home/ncvs/src/lib/msun/src/s_ctanhf.c,v % retrieving revision 1.2 % diff -u -1 -r1.2 s_ctanhf.c % --- s_ctanhf.c 21 Oct 2011 06:30:16 -0000 1.2 % +++ s_ctanhf.c 25 Oct 2011 14:30:57 -0000 % @@ -53,3 +53,3 @@ % if (ix & 0x7fffff) % - return (cpackf(x, (y == 0 ? y : x * y))); % + return (cpackf(x + x, y == 0 ? y : x * y)); % SET_FLOAT_WORD(x, hx - 0x40000000); % Index: s_fabsl.c % =================================================================== % RCS file: /home/ncvs/src/lib/msun/src/s_fabsl.c,v % retrieving revision 1.4 % diff -u -1 -r1.4 s_fabsl.c % --- s_fabsl.c 25 Apr 2012 18:07:35 -0000 1.4 % +++ s_fabsl.c 12 Jul 2012 13:02:28 -0000 % @@ -40,4 +40,4 @@ % u.e = x; % - u.bits.sign = 0; % - return (u.e); % + u.xbits.expsign &= 0x7fff; % + return (u.e + 0); % fabsl() probably shouldn't quieten NaNs, since the hardware normally doesn't. All fabs*() functions are usually inline or in asm, so the fdlibm version is rarely used except for testing it. I didn't hack on fabsf() or fabs() similarly because NaN errors in them didn't show up in testing. Also de-pessimize the sign handling a little.
Compilers tend to produce pessimal code for direct bit-field accesses, with u.bits.sign the worst case and u.xbits.expsign not too bad. % Index: s_remquo.c % =================================================================== % RCS file: /home/ncvs/src/lib/msun/src/s_remquo.c,v % retrieving revision 1.3 % diff -u -1 -r1.3 s_remquo.c % --- s_remquo.c 7 Apr 2012 03:59:12 -0000 1.3 % +++ s_remquo.c 12 Jul 2012 13:06:25 -0000 % @@ -46,3 +46,3 @@ % ((hy|((ly|-ly)>>31))>0x7ff00000)) /* or y is NaN */ % - return (x*y)/(x*y); % + return ((x+0.0L)*(y+0))/((x+0.0L)*(y+0)); % if(hx<=hy) { % Index: s_remquof.c % =================================================================== % RCS file: /home/ncvs/src/lib/msun/src/s_remquof.c,v % retrieving revision 1.2 % diff -u -1 -r1.2 s_remquof.c % --- s_remquof.c 7 Apr 2012 03:59:12 -0000 1.2 % +++ s_remquof.c 12 Jul 2012 13:06:32 -0000 % @@ -43,3 +43,3 @@ % if(hy==0||hx>=0x7f800000||hy>0x7f800000) /* y=0,NaN;or x not finite */ % - return (x*y)/(x*y); % + return ((x+0.0L)*(y+0))/((x+0.0L)*(y+0)); % if(hx Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 36F33106566B for ; Tue, 14 Aug 2012 10:46:13 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail08.syd.optusnet.com.au (mail08.syd.optusnet.com.au [211.29.132.189]) by mx1.freebsd.org (Postfix) with ESMTP id 4DD758FC0C for ; Tue, 14 Aug 2012 10:46:11 +0000 (UTC) Received: from c122-106-171-246.carlnfd1.nsw.optusnet.com.au (c122-106-171-246.carlnfd1.nsw.optusnet.com.au [122.106.171.246]) by mail08.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id q7EAk7Qw005407 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Tue, 14 Aug 2012 20:46:08 +1000 Date: Tue, 14 Aug 2012 20:46:07 +1000 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Stephen Montgomery-Smith In-Reply-To: <50297E43.7090309@missouri.edu> Message-ID:
<20120814201105.T934@besplex.bde.org> References: <5017111E.6060003@missouri.edu> <501C361D.4010807@missouri.edu> <20120804165555.X1231@besplex.bde.org> <501D51D7.1020101@missouri.edu> <20120805030609.R3101@besplex.bde.org> <501D9C36.2040207@missouri.edu> <20120805175106.X3574@besplex.bde.org> <501EC015.3000808@missouri.edu> <20120805191954.GA50379@troutmask.apl.washington.edu> <20120807205725.GA10572@server.rulingia.com> <20120809025220.N4114@besplex.bde.org> <5027F07E.9060409@missouri.edu> <20120814003614.H3692@besplex.bde.org> <50295F5C.6010800@missouri.edu> <20120814072946.S5260@besplex.bde.org> <50297CA5.5010900@missouri.edu> <50297E43.7090309@missouri.edu> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: freebsd-numerics@freebsd.org Subject: Re: Complex arg-trig functions X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 14 Aug 2012 10:46:13 -0000 On Mon, 13 Aug 2012, Stephen Montgomery-Smith wrote: > On 08/13/2012 05:16 PM, Stephen Montgomery-Smith wrote: >> On 08/13/2012 04:45 PM, Bruce Evans wrote: >> >>> y can have any sign I think. But the problem only seemed to happen with >>> denormals and/or NaNs. There might be a problem with NaNs not giving one >>> of the canceling negatives. >> >> OK. >> >>>>> @ --- 408,420 ---- >>>>> @ @ if (ISFINITE(bx) && ISFINITE(by) && (x > >>>>> RECIP_SQRT_EPSILON_100 || y > RECIP_SQRT_EPSILON_100)) { >>>>> @ ! /* XXX following can also raise overflow */ >>>> >>>> I don't see how the code could raise an overflow. The output of clog >>>> should always be very much less than DBL_MAX. (Originally I had >>>> clog(2*z), and that could raise an unwarranted overflow.) >>> >>> @ if (ISFINITE(bx) && ISFINITE(by) && (x > RECIP_SQRT_EPSILON_100 >>> || y > RECIP_SQRT_EPSILON_100)) { >>> @ ! 
/* XXX following can also raise overflow */ >>> @ ! if (huge+x+y>one) { /* raise inexact */ >>> @ ! w = clog_for_large_values(z); >>> @ ! /* Can't add M_LN2 to w since it should clobber -0*I. */ >>> @ ! rx = fabs(cimag(w)); >>> @ ! ry = creal(w) + M_LN2; >>> @ if (sy == 0) >>> @ ! ry = -ry; >>> @ ! return (cpack(rx, ry)); >>> @ } >>> @ } >>> >>> clog() won't overflow spuriously, but huge+x+y might. >> >> Yes, I didn't think of that! >> >>> ((int)x == 0)' is a safer method of raising inexact for certain x. >> >> But this only works if x is less than 1. >> >> OK, how about this: >> >> sqrt_huge = 1e150; >> if (sqrt_huge+x>one || sqrt_huge+y>one) ... > > Oops > > if (sqrt_huge+x>one && sqrt_huge+y>one) x and y can be DBL_MAX, giving overflow. I think raising overflow is never correct, since clog() never overflows for large values, and ccacos() apparently reduces to a rearrangement of clog() for large values. BTW, you can probably omit the ISFINITE() tests in: >>> @ if (ISFINITE(bx) && ISFINITE(by) && (x > RECIP_SQRT_EPSILON_100 >>> || y > RECIP_SQRT_EPSILON_100)) { since if bx or by is NaN, then it isn't > RECIP_SQRT_EPSILON_100, and if it is Inf then I think handling it the same as DBL_MAX gives the correct result. NaNs and Infs now fall through to do_hard_work(). Wouldn't it be easier to never pass them to do_hard_work()? For just setting inexact, try an expression using `tiny'. There are many examples to choose from. According to $(grep tiny.*inex *.c): % e_sinh.c: if(shuge+x>one) return x;/* sinh(tiny) = tiny with inexact */ % e_sinhf.c: if(shuge+x>one) return x;/* sinh(tiny) = tiny with inexact */ Ones like you have. % e_sqrt.c: z = one-tiny; /* trigger inexact flag */ % e_sqrtf.c: z = one-tiny; /* trigger inexact flag */ Works generally, modulo compiler bugs and extra precision, provided z is used. 
% s_erf.c: * erf(x) = sign(x) *(1 - tiny) (raise inexact) % s_expm1.c: if(x+tiny<0.0) /* raise inexact */ % s_expm1f.c: if(x+tiny<(float)0.0) /* raise inexact */ % s_tanh.c: if(huge+x>one) return x; /* tanh(tiny) = tiny with inexact */ 3 more that depend too much on x. % s_tanh.c: z = one - tiny; /* raise inexact flag */ % s_tanhf.c: if(huge+x>one) return x; /* tanh(tiny) = tiny with inexact */ % s_tanhf.c: z = one - tiny; /* raise inexact flag */ To get z used, try `if ((int)(1 - tiny) == 1)'. To avoid compiler bugs, it is necessary for `tiny' to be static const volatile (where `tiny' is already static const). Only a few places in msun use a volatile `tiny', so you could not worry about the compiler bugs equally and wait for them to go away or for someone to notice that inexact is not set properly. clang has similar bugs for huge*huge. gcc doesn't evaluate huge*huge at compile time, but clang does. Both evaluate tiny*tiny and 1-tiny at compile time. Spelling 1 as `one' has no effect on the compiler bugs. Note that the expressions that mix in x only do so to avoid setting inexact when x = +-0, or maybe to preserve the sign of x, without using a branch to classify this x. Here we already have branches to classify x as large. 
Bruce From owner-freebsd-numerics@FreeBSD.ORG Tue Aug 14 16:08:46 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 8531A106566C for ; Tue, 14 Aug 2012 16:08:46 +0000 (UTC) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by mx1.freebsd.org (Postfix) with ESMTP id 2A5038FC12 for ; Tue, 14 Aug 2012 16:08:45 +0000 (UTC) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q7EG8hF7027960; Tue, 14 Aug 2012 11:08:43 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <502A780B.2010106@missouri.edu> Date: Tue, 14 Aug 2012 11:08:43 -0500 From: Stephen Montgomery-Smith User-Agent: Mozilla/5.0 (X11; Linux i686; rv:14.0) Gecko/20120714 Thunderbird/14.0 MIME-Version: 1.0 To: Bruce Evans References: <5017111E.6060003@missouri.edu> <501C361D.4010807@missouri.edu> <20120804165555.X1231@besplex.bde.org> <501D51D7.1020101@missouri.edu> <20120805030609.R3101@besplex.bde.org> <501D9C36.2040207@missouri.edu> <20120805175106.X3574@besplex.bde.org> <501EC015.3000808@missouri.edu> <20120805191954.GA50379@troutmask.apl.washington.edu> <20120807205725.GA10572@server.rulingia.com> <20120809025220.N4114@besplex.bde.org> <5027F07E.9060409@missouri.edu> <20120814003614.H3692@besplex.bde.org> <50295F5C.6010800@missouri.edu> <20120814072946.S5260@besplex.bde.org> <50297CA5.5010900@missouri.edu> <50297E43.7090309@missouri.edu> <20120814201105.T934@besplex.bde.org> In-Reply-To: <20120814201105.T934@besplex.bde.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-numerics@freebsd.org Subject: Re: Complex arg-trig functions X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality 
implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 14 Aug 2012 16:08:46 -0000 On 08/14/2012 05:46 AM, Bruce Evans wrote: > On Mon, 13 Aug 2012, Stephen Montgomery-Smith wrote: > >> >> if (sqrt_huge+x>one && sqrt_huge+y>one) > > x and y can be DBL_MAX, giving overflow. Why? When x is DBL_MAX, sqrt_huge is so very much smaller than DBL_MAX that DBL_MAX+sqrt_huge should be DBL_MAX within floating point precision. So no overflow. >>>> @ if (ISFINITE(bx) && ISFINITE(by) && (x > RECIP_SQRT_EPSILON_100 >>>> || y > RECIP_SQRT_EPSILON_100)) { > > since if bx or by is NaN, then it isn't > RECIP_SQRT_EPSILON_100, and > if it is Inf then I think handling it the same as DBL_MAX gives the > correct result. My original code did this. But for some reason, it didn't work in all cases. I didn't take note of which cases failed. > NaNs and Infs now fall through to do_hard_work(). > Wouldn't it be easier to never pass them to do_hard_work()? It seemed to me that there is a logic behind why the infs and nans produce the results they do. I noticed that do_the_hard_work() already got the answers correct for the real part *rx. Getting the imaginary part to work as well seemed to me to be the cleanest way to make it work. (I added all the nan and inf checking after writing the rest of the code.) > > For just setting inexact, try an expression using `tiny'. There are > many examples to choose from. According to $(grep tiny.*inex *.c): If you still judge my solution incorrect, then I will look into these different solutions.
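The DBL_MAX question can be checked by value. The sketch below assumes IEEE 754 doubles in the default round-to-nearest mode; the function names are invented, and huge = 1e300 is an assumed fdlibm-style value since the thread does not state it. The ulp of DBL_MAX is 2**971 (about 2e292), so adding 1e150 rounds back to DBL_MAX, while adding 1e300 is far more than half an ulp and rounds to +Inf, which also raises overflow.

```c
#include <float.h>

/* Returns 1 if sqrt_huge + DBL_MAX rounds back to DBL_MAX. */
static int adding_sqrt_huge_keeps_dbl_max(void)
{
	volatile double x = DBL_MAX;
	volatile double sqrt_huge = 1e150;
	volatile double sum = sqrt_huge + x;	/* volatile: force a double store */

	return (sum == DBL_MAX);
}

/* Returns 1 if huge + DBL_MAX overflows to +Inf. */
static int adding_huge_overflows(void)
{
	volatile double x = DBL_MAX;
	volatile double huge = 1e300;		/* assumed value */
	volatile double sum = huge + x;

	return (sum > DBL_MAX);			/* true only for +Inf */
}
```

Under these assumptions the sqrt_huge form does not overflow even at DBL_MAX, whereas huge+x+y can, both because of huge and because x+y itself may exceed DBL_MAX.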
From owner-freebsd-numerics@FreeBSD.ORG Tue Aug 14 16:17:29 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 16F741065674 for ; Tue, 14 Aug 2012 16:17:29 +0000 (UTC) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by mx1.freebsd.org (Postfix) with ESMTP id B80208FC12 for ; Tue, 14 Aug 2012 16:17:28 +0000 (UTC) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q7EGHRY7028538 for ; Tue, 14 Aug 2012 11:17:27 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <502A7A17.1090803@missouri.edu> Date: Tue, 14 Aug 2012 11:17:27 -0500 From: Stephen Montgomery-Smith User-Agent: Mozilla/5.0 (X11; Linux i686; rv:14.0) Gecko/20120714 Thunderbird/14.0 MIME-Version: 1.0 To: freebsd-numerics@freebsd.org References: <5017111E.6060003@missouri.edu> <501C361D.4010807@missouri.edu> <20120804165555.X1231@besplex.bde.org> <501D51D7.1020101@missouri.edu> <20120805030609.R3101@besplex.bde.org> <501D9C36.2040207@missouri.edu> <20120805175106.X3574@besplex.bde.org> <501EC015.3000808@missouri.edu> <20120805191954.GA50379@troutmask.apl.washington.edu> <20120807205725.GA10572@server.rulingia.com> <20120809025220.N4114@besplex.bde.org> <5027F07E.9060409@missouri.edu> <20120814003614.H3692@besplex.bde.org> <50295F5C.6010800@missouri.edu> <20120814072946.S5260@besplex.bde.org> <50297CA5.5010900@missouri.edu> <50297E43.7090309@missouri.edu> <20120814201105.T934@besplex.bde.org> <502A780B.2010106@missouri.edu> In-Reply-To: <502A780B.2010106@missouri.edu> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: Complex arg-trig functions X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of 
high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 14 Aug 2012 16:17:29 -0000 On 08/14/2012 11:08 AM, Stephen Montgomery-Smith wrote: > On 08/14/2012 05:46 AM, Bruce Evans wrote: >> On Mon, 13 Aug 2012, Stephen Montgomery-Smith wrote: >> >>> >>> if (sqrt_huge+x>one && sqrt_huge+y>one) >> >> x and y can be DBL_MAX, giving overflow. > > Why? When x is DBL_MAX, sqrt_huge is so very much smaller than DBL_MAX > that DBL_MAX+sqrt_huge should be DBL_MAX within floating point > precision. So no overflow. I wrote a short test program. It seems to work. From owner-freebsd-numerics@FreeBSD.ORG Tue Aug 14 16:51:27 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 25BFA1065680 for ; Tue, 14 Aug 2012 16:51:27 +0000 (UTC) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by mx1.freebsd.org (Postfix) with ESMTP id EA62B8FC14 for ; Tue, 14 Aug 2012 16:51:26 +0000 (UTC) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q7EGpOvR030708; Tue, 14 Aug 2012 11:51:24 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <502A820C.6060804@missouri.edu> Date: Tue, 14 Aug 2012 11:51:24 -0500 From: Stephen Montgomery-Smith User-Agent: Mozilla/5.0 (X11; Linux i686; rv:14.0) Gecko/20120714 Thunderbird/14.0 MIME-Version: 1.0 To: Bruce Evans References: <5017111E.6060003@missouri.edu> <501C361D.4010807@missouri.edu> <20120804165555.X1231@besplex.bde.org> <501D51D7.1020101@missouri.edu> <20120805030609.R3101@besplex.bde.org> <501D9C36.2040207@missouri.edu> <20120805175106.X3574@besplex.bde.org> <501EC015.3000808@missouri.edu> <20120805191954.GA50379@troutmask.apl.washington.edu> 
<20120807205725.GA10572@server.rulingia.com> <20120809025220.N4114@besplex.bde.org> <5027F07E.9060409@missouri.edu> <20120814003614.H3692@besplex.bde.org> <50295887.2010608@missouri.edu> <20120814055931.Q4897@besplex.bde.org> <50297468.20902@missouri.edu> <20120814173931.V934@besplex.bde.org> In-Reply-To: <20120814173931.V934@besplex.bde.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-numerics@freebsd.org Subject: Re: Complex arg-trig functions X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 14 Aug 2012 16:51:27 -0000 On 08/14/2012 05:09 AM, Bruce Evans wrote: > On Mon, 13 Aug 2012, Stephen Montgomery-Smith wrote: > >> But also, what about the problem of when the input is close to one of >> the non-trivial roots of sin, cos, etc? As a mathematician, I >> wouldn't be shocked if sin(M_PI) was 1e-15 or such like. > > M_PI is a rational approximation to pi, so it would be a serious error if > sin() on it were zero. I see your point. But anyone who uses M_PI in their code certainly intends for it to represent pi, not its rational approximation in double arithmetic. But ... > sin(pi) is an easy case. More interesting is > sin((double)(DBL_MAX/pi) * pi), getting this to work is more of an intellectual exercise than anything useful. (Also it wasn't clear to me whether in this case pi represents real pi or its rational approximation.) When x is incredibly large (close to DBL_MAX), a mathematician would consider x to represent all the numbers between x-x*DBL_EPSILON and x+x*DBL_EPSILON (approximately), or more precisely, all the numbers that are within 0.5 ULP of x. So as a mathematician I would prefer to think of sin(close_to_DBL_MAX) as undefined. 
(Although as a programmer, I would hate it if it spat out NaN - I would prefer the meaningless answer.) On the other hand, getting sin(x) correct when x is close to, say, 10pi would be useful. It would be nice if sin(x+0.01)-sin(x) were approximately 0.01*cos(x). If the computation of (x mod pi/2) didn't take into account a few extra digits of pi, I can see this failing for certain values of x. > I already avoid sign errors for NaNs in many places including > hypot*() and atan2*(). This is probably needed for clog() to > pass my tests: I see code like "if (y!=y) return (y+y)". Does "y+y" quieten the NaNs as well as "y+0.0"? Is my code compliant in this regard? > It is bad style to re-use a variable. It micro-optimizes for 30 year > old compilers. It makes the code harder to write, read and debug. > Modern compilers will automatically reuse register and stack resources > for variables if this is possible and doesn't interfere too much with > debugging. That makes sense. > Not reusing variables seems to give slightly better object > code more often than slightly worse object code, by simplifying the > lifetime analysis needed for this optimization. That surprises me a bit. (Although now that I think about it, letting the compiler make the decision seems the better thing to do.) > I only try re-using > variables near the start of a function (maybe x = creal(z); use(x); > x = fabs(x)), to try to stop the compiler making so many copies of x. > This sometimes works. (The typical pessimization avoided by this is > when x is passed on the stack. gcc likes to copy it to another place > on the stack, using pessimal methods, and then never use the original > copy. This is good for debugging, but otherwise not very good.) What do you mean by "pessimization"? 
From owner-freebsd-numerics@FreeBSD.ORG Tue Aug 14 17:02:14 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 22A56106566B for ; Tue, 14 Aug 2012 17:02:14 +0000 (UTC) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by mx1.freebsd.org (Postfix) with ESMTP id D505D8FC08 for ; Tue, 14 Aug 2012 17:02:13 +0000 (UTC) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q7EH2CG7031435 for ; Tue, 14 Aug 2012 12:02:13 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <502A8494.2050707@missouri.edu> Date: Tue, 14 Aug 2012 12:02:12 -0500 From: Stephen Montgomery-Smith User-Agent: Mozilla/5.0 (X11; Linux i686; rv:14.0) Gecko/20120714 Thunderbird/14.0 MIME-Version: 1.0 To: freebsd-numerics@freebsd.org References: <5017111E.6060003@missouri.edu> <501C361D.4010807@missouri.edu> <20120804165555.X1231@besplex.bde.org> <501D51D7.1020101@missouri.edu> <20120805030609.R3101@besplex.bde.org> <501D9C36.2040207@missouri.edu> <20120805175106.X3574@besplex.bde.org> <501EC015.3000808@missouri.edu> <20120805191954.GA50379@troutmask.apl.washington.edu> <20120807205725.GA10572@server.rulingia.com> <20120809025220.N4114@besplex.bde.org> <5027F07E.9060409@missouri.edu> <20120814003614.H3692@besplex.bde.org> <50295887.2010608@missouri.edu> <20120814055931.Q4897@besplex.bde.org> <50297468.20902@missouri.edu> <20120814173931.V934@besplex.bde.org> <502A820C.6060804@missouri.edu> In-Reply-To: <502A820C.6060804@missouri.edu> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: Complex arg-trig functions X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm 
functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 14 Aug 2012 17:02:14 -0000 On 08/14/2012 11:51 AM, Stephen Montgomery-Smith wrote: > On the other hand, getting sin(x) correct when x is close to, say, 10pi > would be useful. It would be nice if sin(x+0.01)-sin(x) is > approximately 0.01*cos(x), and if the computation (x mod pi/2) didn't > take into account a few extra digits of pi, and can see this failing for > certain values of x. s/0.01/n*DBL_EPSILON/ for small n. From owner-freebsd-numerics@FreeBSD.ORG Tue Aug 14 17:37:19 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 5C59B106566B for ; Tue, 14 Aug 2012 17:37:19 +0000 (UTC) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by mx1.freebsd.org (Postfix) with ESMTP id 25A958FC08 for ; Tue, 14 Aug 2012 17:37:18 +0000 (UTC) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q7EHbGVp033674 for ; Tue, 14 Aug 2012 12:37:17 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <502A8CCC.5080606@missouri.edu> Date: Tue, 14 Aug 2012 12:37:16 -0500 From: Stephen Montgomery-Smith User-Agent: Mozilla/5.0 (X11; Linux i686; rv:14.0) Gecko/20120714 Thunderbird/14.0 MIME-Version: 1.0 To: freebsd-numerics@freebsd.org Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Subject: Status of expl logl X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 14 Aug 2012 17:37:19 -0000 Are people working on expl, logl and log1pl? 
I was trying to brainstorm ideas. If one had an expl, one could create a logl as follows: long double logl(long double x) { long double y; y = log(x); y -= 1 - x*expl(-y); return y; } Of course you would need prechecks to make sure log(x) doesn't overflow, etc. But my thinking is this. Use log(x) as a starting value for Newton's Method. Since Newton's Method roughly doubles the number of digits of precision, only one iteration is needed. It doesn't work so well if 1/e < x < e. Anyway, just throwing out an idea. Also, I came across this algorithm: http://en.wikipedia.org/wiki/Logarithm#Arithmetic-geometric_mean_approximation It does need a few more bits of precision than you want in the final answer. But it is used by mpfr, and it is surprisingly fast. From owner-freebsd-numerics@FreeBSD.ORG Tue Aug 14 17:52:58 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id E3380106566C for ; Tue, 14 Aug 2012 17:52:58 +0000 (UTC) (envelope-from sgk@troutmask.apl.washington.edu) Received: from troutmask.apl.washington.edu (troutmask.apl.washington.edu [128.95.76.21]) by mx1.freebsd.org (Postfix) with ESMTP id C0BE38FC08 for ; Tue, 14 Aug 2012 17:52:58 +0000 (UTC) Received: from troutmask.apl.washington.edu (localhost.apl.washington.edu [127.0.0.1]) by troutmask.apl.washington.edu (8.14.5/8.14.5) with ESMTP id q7EHqwAW069960; Tue, 14 Aug 2012 10:52:58 -0700 (PDT) (envelope-from sgk@troutmask.apl.washington.edu) Received: (from sgk@localhost) by troutmask.apl.washington.edu (8.14.5/8.14.5/Submit) id q7EHqvGl069959; Tue, 14 Aug 2012 10:52:57 -0700 (PDT) (envelope-from sgk) Date: Tue, 14 Aug 2012 10:52:57 -0700 From: Steve Kargl To: Stephen Montgomery-Smith Message-ID: <20120814175257.GA69865@troutmask.apl.washington.edu> References: <502A8CCC.5080606@missouri.edu> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: 
<502A8CCC.5080606@missouri.edu> User-Agent: Mutt/1.4.2.3i Cc: freebsd-numerics@freebsd.org Subject: Re: Status of expl logl X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 14 Aug 2012 17:52:59 -0000 On Tue, Aug 14, 2012 at 12:37:16PM -0500, Stephen Montgomery-Smith wrote: > Are people working on expl, logl and log1pl? > Yes. I have expl and expm1l written. Need to fix a few issues that Bruce pointed out, and need to write an ld128/expm1l. Bruce has multiple versions of logl and log1pl. I have a copy of one of his ld80/logl, which I've been testing/using for a few years. > I was trying to brainstorm ideas. If one had an expl, one could create > a logl as follows: > > long double logl(long double x) > { > long double y; > > y = log(x); > y -= 1 - x*expl(-y); > return y; > } > > Of course you would need prechecks to make sure log(x) doesn't overflow, > etc. > > But my thinking is this. Use log(x) as a starting value for Newton's > Method. Since Newton's Method roughly doubles the number of digits of > precision, only one iteration is needed. > > It doesn't work so well if 1/e < x < e. > > Anyway, just throwing out an idea. > It is much easier to read Tang's papers, and implement his algorithms. Then, you send the code to Bruce and watch the optimization machine churn over the code. :-) PTP Tang, "Table-Driven Implementation of the Expm1 Function In IEEE Floating-Point Arithmetic," ACM Trans. Math. Soft., 18, 1992, 211-222. PTP Tang, "Table-Driven Implementation of the Exponential Function in IEEE Floating-Point Arithmetic," ACM Trans. Math. Soft., 15, 1989, 144-157. PTP Tang, "Table-Driven Implementation of the Logarithm Function in IEEE Floating-Point Arithmetic," ACM Trans. Math. Soft., 16, 1990, 378-400. 
-- Steve From owner-freebsd-numerics@FreeBSD.ORG Tue Aug 14 18:35:19 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 5FB65106566B for ; Tue, 14 Aug 2012 18:35:19 +0000 (UTC) (envelope-from sgk@troutmask.apl.washington.edu) Received: from troutmask.apl.washington.edu (troutmask.apl.washington.edu [128.95.76.21]) by mx1.freebsd.org (Postfix) with ESMTP id 224578FC12 for ; Tue, 14 Aug 2012 18:35:19 +0000 (UTC) Received: from troutmask.apl.washington.edu (localhost.apl.washington.edu [127.0.0.1]) by troutmask.apl.washington.edu (8.14.5/8.14.5) with ESMTP id q7EIZIQC070203; Tue, 14 Aug 2012 11:35:18 -0700 (PDT) (envelope-from sgk@troutmask.apl.washington.edu) Received: (from sgk@localhost) by troutmask.apl.washington.edu (8.14.5/8.14.5/Submit) id q7EIZIXZ070202; Tue, 14 Aug 2012 11:35:18 -0700 (PDT) (envelope-from sgk) Date: Tue, 14 Aug 2012 11:35:18 -0700 From: Steve Kargl To: Stephen Montgomery-Smith Message-ID: <20120814183518.GA70092@troutmask.apl.washington.edu> References: <502A8CCC.5080606@missouri.edu> <20120814175257.GA69865@troutmask.apl.washington.edu> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20120814175257.GA69865@troutmask.apl.washington.edu> User-Agent: Mutt/1.4.2.3i Cc: freebsd-numerics@freebsd.org Subject: Re: Status of expl logl X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 14 Aug 2012 18:35:19 -0000 On Tue, Aug 14, 2012 at 10:52:57AM -0700, Steve Kargl wrote: > On Tue, Aug 14, 2012 at 12:37:16PM -0500, Stephen Montgomery-Smith wrote: > > Are people working on expl, logl and log1pl? 
> > > I forgot to mention that if you're looking for another function to implement, then AFAIK no one is working on ld80/powl() and ld128/powl(). See the comment in src/e_pow.c for the algorithm used in fdlibm. -- Steve From owner-freebsd-numerics@FreeBSD.ORG Tue Aug 14 18:40:33 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 07575106566C for ; Tue, 14 Aug 2012 18:40:33 +0000 (UTC) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by mx1.freebsd.org (Postfix) with ESMTP id B91168FC17 for ; Tue, 14 Aug 2012 18:40:27 +0000 (UTC) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q7EIePQ4050088 for ; Tue, 14 Aug 2012 13:40:25 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <502A9B99.7090309@missouri.edu> Date: Tue, 14 Aug 2012 13:40:25 -0500 From: Stephen Montgomery-Smith User-Agent: Mozilla/5.0 (X11; Linux i686; rv:14.0) Gecko/20120714 Thunderbird/14.0 MIME-Version: 1.0 To: freebsd-numerics@freebsd.org References: <5017111E.6060003@missouri.edu> <501C361D.4010807@missouri.edu> <20120804165555.X1231@besplex.bde.org> <501D51D7.1020101@missouri.edu> <20120805030609.R3101@besplex.bde.org> <501D9C36.2040207@missouri.edu> <20120805175106.X3574@besplex.bde.org> <501EC015.3000808@missouri.edu> <20120805191954.GA50379@troutmask.apl.washington.edu> <20120807205725.GA10572@server.rulingia.com> <20120809025220.N4114@besplex.bde.org> <5027F07E.9060409@missouri.edu> <20120814003614.H3692@besplex.bde.org> <50295887.2010608@missouri.edu> <20120814055931.Q4897@besplex.bde.org> <50297468.20902@missouri.edu> <20120814173931.V934@besplex.bde.org> <502A820C.6060804@missouri.edu> <502A8494.2050707@missouri.edu> In-Reply-To: <502A8494.2050707@missouri.edu> Content-Type: text/plain; 
charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: Complex arg-trig functions X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 14 Aug 2012 18:40:33 -0000 I was thinking more about the formulas of the type casinhl(z) = clogl(z+csqrtl(z^2+1)). This approach seems to be far less accurate than I originally thought. mpc uses this method, and it tries to adjust the number of bits it uses until it gets the right answer. If you give it something like z = 1L + 1e-3000L*I, mpc takes an extraordinarily large amount of time to do the calculation (it is worse for acos than for asin). I added some printf statements to the code for mpc. To calculate to 100 bits, mpc is sometimes using a precision of 7000 bits for its internal calculations. Also, for its acos and atan functions, it tries it with a certain number of bits, then adds a small number to the precision, and tries again. With asin, it at least multiplies the precision by 1.5 with each retry. I adjusted the code so that acos and atan do the same as asin, and now it goes very much faster. But they would still be better off using the Hull, Fairgrieve and Tang algorithm, which seems to be far superior. And in their case they wouldn't have to worry about underflow and overflow. I also looked at Mathematica. It doesn't seem to use the above formula. But whatever it does use, it is only marginally better. 
From owner-freebsd-numerics@FreeBSD.ORG Tue Aug 14 18:47:17 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 9EB6C106564A for ; Tue, 14 Aug 2012 18:47:17 +0000 (UTC) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by mx1.freebsd.org (Postfix) with ESMTP id 6680F8FC14 for ; Tue, 14 Aug 2012 18:47:16 +0000 (UTC) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q7EIlGht050538; Tue, 14 Aug 2012 13:47:16 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <502A9D34.7070203@missouri.edu> Date: Tue, 14 Aug 2012 13:47:16 -0500 From: Stephen Montgomery-Smith User-Agent: Mozilla/5.0 (X11; Linux i686; rv:14.0) Gecko/20120714 Thunderbird/14.0 MIME-Version: 1.0 To: Steve Kargl References: <502A8CCC.5080606@missouri.edu> <20120814175257.GA69865@troutmask.apl.washington.edu> In-Reply-To: <20120814175257.GA69865@troutmask.apl.washington.edu> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-numerics@freebsd.org Subject: Re: Status of expl logl X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 14 Aug 2012 18:47:17 -0000 On 08/14/2012 12:52 PM, Steve Kargl wrote: > It is much easier to read Tang's papers, and implement his > algorithms. Then, you send the code to Bruce and watch > the optimization machine churn over the code. :-) > > PTP Tang, "Table-Driven Implementation of the Expm1 Function In IEEE > Floating-Point Arithmetic," ACM Trans. Math. Soft., 18, 1992, > 211-222. 
> > PTP Tang, "Table-Driven Implementation of the Exponential Function > in IEEE Floating-Point Arithmetic," ACM Trans. Math. Soft., 15, 1989, > 144-157. > > PTP Tang, "Table-Driven Implementation of the Logarithm Function in IEEE > Floating-Point Arithmetic," ACM Trans. Math. Soft., 16, 1990, 378-400. > That must be the same Tang who co-wrote the paper with Hull and Fairgrieve on the arcsin. I had an email conversation with Fairgrieve a few days ago, because I wanted to know if they had written a paper on the arctan. It turned out that Hull died soon after the paper on arcsine was completed. Fairgrieve sent me the paper on arctan, which they neither completed nor published. He doesn't want it spread widely because I think he wants to publish it one day. But it is very similar to the algorithm I developed myself for catanh. I do have a bad habit of trying to create the mathematics for myself, and internally I consider it cheating to read the literature. This habit has bitten me several times in the past, and I have had papers rejected because I didn't properly cite the existing literature, and had reinvented the wheel. 
From owner-freebsd-numerics@FreeBSD.ORG Tue Aug 14 18:52:52 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 24616106566C for ; Tue, 14 Aug 2012 18:52:51 +0000 (UTC) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by mx1.freebsd.org (Postfix) with ESMTP id A405C8FC08 for ; Tue, 14 Aug 2012 18:52:51 +0000 (UTC) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q7EIqoOk050901; Tue, 14 Aug 2012 13:52:50 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <502A9E82.9090905@missouri.edu> Date: Tue, 14 Aug 2012 13:52:50 -0500 From: Stephen Montgomery-Smith User-Agent: Mozilla/5.0 (X11; Linux i686; rv:14.0) Gecko/20120714 Thunderbird/14.0 MIME-Version: 1.0 To: Steve Kargl References: <502A8CCC.5080606@missouri.edu> <20120814175257.GA69865@troutmask.apl.washington.edu> <20120814183518.GA70092@troutmask.apl.washington.edu> In-Reply-To: <20120814183518.GA70092@troutmask.apl.washington.edu> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-numerics@freebsd.org Subject: Re: Status of expl logl X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 14 Aug 2012 18:52:52 -0000 On 08/14/2012 01:35 PM, Steve Kargl wrote: > On Tue, Aug 14, 2012 at 10:52:57AM -0700, Steve Kargl wrote: >> On Tue, Aug 14, 2012 at 12:37:16PM -0500, Stephen Montgomery-Smith wrote: >>> Are people working on expl, logl and log1pl? 
>>> >> > > I forgot to mention that if you're looking for another > function to implement, then AFAIK no one is working on > ld80/powl() and ld128/powl(). See the comment in > src/e_pow.c for the algorithm used in fdlibm. > What is the logic that dictates the file names in msun/src? Some are s_foo.c, some are e_foo.c, some are k_foo.c, etc. From owner-freebsd-numerics@FreeBSD.ORG Tue Aug 14 19:10:00 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 683AD1065675 for ; Tue, 14 Aug 2012 19:10:00 +0000 (UTC) (envelope-from sgk@troutmask.apl.washington.edu) Received: from troutmask.apl.washington.edu (troutmask.apl.washington.edu [128.95.76.21]) by mx1.freebsd.org (Postfix) with ESMTP id 429E58FC18 for ; Tue, 14 Aug 2012 19:10:00 +0000 (UTC) Received: from troutmask.apl.washington.edu (localhost.apl.washington.edu [127.0.0.1]) by troutmask.apl.washington.edu (8.14.5/8.14.5) with ESMTP id q7EJ9wwf070374; Tue, 14 Aug 2012 12:09:58 -0700 (PDT) (envelope-from sgk@troutmask.apl.washington.edu) Received: (from sgk@localhost) by troutmask.apl.washington.edu (8.14.5/8.14.5/Submit) id q7EJ9w3t070373; Tue, 14 Aug 2012 12:09:58 -0700 (PDT) (envelope-from sgk) Date: Tue, 14 Aug 2012 12:09:58 -0700 From: Steve Kargl To: Stephen Montgomery-Smith Message-ID: <20120814190958.GA70225@troutmask.apl.washington.edu> References: <502A8CCC.5080606@missouri.edu> <20120814175257.GA69865@troutmask.apl.washington.edu> <20120814183518.GA70092@troutmask.apl.washington.edu> <502A9E82.9090905@missouri.edu> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <502A9E82.9090905@missouri.edu> User-Agent: Mutt/1.4.2.3i Cc: freebsd-numerics@freebsd.org Subject: Re: Status of expl logl X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm 
functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 14 Aug 2012 19:10:00 -0000 On Tue, Aug 14, 2012 at 01:52:50PM -0500, Stephen Montgomery-Smith wrote: > On 08/14/2012 01:35 PM, Steve Kargl wrote: > >On Tue, Aug 14, 2012 at 10:52:57AM -0700, Steve Kargl wrote: > >>On Tue, Aug 14, 2012 at 12:37:16PM -0500, Stephen Montgomery-Smith wrote: > >>>Are people working on expl, logl and log1pl? > >>> > >> > > > >I forgot to mention that if you're looking for another > >function to implement, then AFAIK no one is working on > >ld80/powl() and ld128/powl(). See the comment in > >src/e_pow.c for the algorithm used in fdlibm. > > > > > What is the logic that dictates the file names in msun/src? > > Some are s_foo.c, some are e_foo.c, some are k_foo.c, etc. See http://www.netlib.org/fdlibm/readme, section 3. -- Steve From owner-freebsd-numerics@FreeBSD.ORG Tue Aug 14 19:39:30 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 45510106564A for ; Tue, 14 Aug 2012 19:39:30 +0000 (UTC) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by mx1.freebsd.org (Postfix) with ESMTP id 0C04A8FC17 for ; Tue, 14 Aug 2012 19:39:29 +0000 (UTC) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q7EJdToY053930; Tue, 14 Aug 2012 14:39:29 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <502AA971.4010403@missouri.edu> Date: Tue, 14 Aug 2012 14:39:29 -0500 From: Stephen Montgomery-Smith User-Agent: Mozilla/5.0 (X11; Linux i686; rv:14.0) Gecko/20120714 Thunderbird/14.0 MIME-Version: 1.0 To: Steve Kargl References: <502A8CCC.5080606@missouri.edu> <20120814175257.GA69865@troutmask.apl.washington.edu> 
<20120814183518.GA70092@troutmask.apl.washington.edu> In-Reply-To: <20120814183518.GA70092@troutmask.apl.washington.edu> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-numerics@freebsd.org Subject: Re: Status of expl logl X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 14 Aug 2012 19:39:30 -0000 On 08/14/2012 01:35 PM, Steve Kargl wrote: > On Tue, Aug 14, 2012 at 10:52:57AM -0700, Steve Kargl wrote: >> On Tue, Aug 14, 2012 at 12:37:16PM -0500, Stephen Montgomery-Smith wrote: >>> Are people working on expl, logl and log1pl? >>> >> > > I forgot to mention that if you're looking for another > function to implement, then AFAIK no one is working on > ld80/powl() and ld128/powl(). See the comment in > src/e_pow.c for the algorithm used in fdlibm. > So I am looking through src/e_pow.c. It seems to me that the constants L1, L2, L3, etc, are 3/5, 3/7, 3/9, etc, but not exactly these constants. So they must have used some process where they jiggled the constants around, perhaps using trial and error, to get a few extra ulp. Is that right? Also, I am trying to see what P1, P2, P3, etc are. They seem to be related to the factorial (maybe a power series related to exp(x)), but I must admit that I am not getting it. Is there a more detailed reference to how these numbers were obtained? A paper somewhere? Finally, what is ovfl (in the definition of ovt) meant to be? 
Thanks, Stephen From owner-freebsd-numerics@FreeBSD.ORG Tue Aug 14 19:56:59 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id C895F106566B for ; Tue, 14 Aug 2012 19:56:59 +0000 (UTC) (envelope-from sgk@troutmask.apl.washington.edu) Received: from troutmask.apl.washington.edu (troutmask.apl.washington.edu [128.95.76.21]) by mx1.freebsd.org (Postfix) with ESMTP id 85A068FC18 for ; Tue, 14 Aug 2012 19:56:59 +0000 (UTC) Received: from troutmask.apl.washington.edu (localhost.apl.washington.edu [127.0.0.1]) by troutmask.apl.washington.edu (8.14.5/8.14.5) with ESMTP id q7EJuxdm070623; Tue, 14 Aug 2012 12:56:59 -0700 (PDT) (envelope-from sgk@troutmask.apl.washington.edu) Received: (from sgk@localhost) by troutmask.apl.washington.edu (8.14.5/8.14.5/Submit) id q7EJuxiP070622; Tue, 14 Aug 2012 12:56:59 -0700 (PDT) (envelope-from sgk) Date: Tue, 14 Aug 2012 12:56:59 -0700 From: Steve Kargl To: Stephen Montgomery-Smith Message-ID: <20120814195659.GA70571@troutmask.apl.washington.edu> References: <502A8CCC.5080606@missouri.edu> <20120814175257.GA69865@troutmask.apl.washington.edu> <20120814183518.GA70092@troutmask.apl.washington.edu> <502AA971.4010403@missouri.edu> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <502AA971.4010403@missouri.edu> User-Agent: Mutt/1.4.2.3i Cc: freebsd-numerics@freebsd.org Subject: Re: Status of expl logl X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 14 Aug 2012 19:56:59 -0000 On Tue, Aug 14, 2012 at 02:39:29PM -0500, Stephen Montgomery-Smith wrote: > On 08/14/2012 01:35 PM, Steve Kargl wrote: > >On Tue, Aug 14, 2012 at 10:52:57AM -0700, Steve Kargl wrote: > >>On Tue, Aug 14, 2012 at 12:37:16PM -0500, Stephen Montgomery-Smith wrote: > >>>Are people working on expl, logl and log1pl? > >>> > >> > > > >I forgot to mention that if you're looking for another > >function to implement, then AFAIK no one is working on > >ld80/powl() and ld128/powl(). See the comment in > >src/e_pow.c for the algorithm used in fdlibm. > > > > So I am looking through src/e_pow.c. > > It seems to me that the constants L1, L2, L3, etc, are 3/5, 3/7, 3/9, > etc, but not exactly these constants. So they must have used some > process where they jiggled the constants around, perhaps using trial and > error, to get a few extra ulp. Is that right? > > Also, I am trying to see what P1, P2, P3, etc are. They seem to be > related to the factorial (maybe a power series related to exp(x)), but I > must admit that I am not getting it. > > Is there a more detailed reference to how these numbers were obtained? > A paper somewhere? > > Finally, what is ovfl (in the definition of ovt) meant to be?

I haven't looked too closely at the details of pow[fl](). I am not aware of any published paper that gives the details. AFAIK, the comment in e_pow.c is the only detailed description (other than the code). I tried to find a paper about pow() implementations on Sunday with a very cursory google search. Came up empty.

The L and P constants are used in lines 235 and 299,
line 235: r = s2*s2*(L1+s2*(L2+s2*(L3+s2*(L4+s2*(L5+s2*L6)))));
line 299: t1 = z - t*(P1+t*(P2+t*(P3+t*(P4+t*P5))));

These are polynomials that are evaluated via Horner's method. I suspect the jiggling that you mention is actually a result of a Remes minimax procedure.
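
A minimal sketch of the Horner evaluation mentioned above, for readers who have not met it. The function name horner() and the array example_coeffs are hypothetical; the coefficients are a toy polynomial, not the L1..L6 or P1..P5 values from e_pow.c, which evaluates its polynomials inline rather than through a loop like this.

```c
#include <stddef.h>

/*
 * Toy coefficients for 1 + 2*x + 3*x^2 (assumed for illustration only;
 * these are NOT the constants from e_pow.c).
 */
static const double example_coeffs[] = { 1.0, 2.0, 3.0 };

/*
 * Evaluate c[0] + c[1]*x + ... + c[n-1]*x^(n-1) by Horner's method:
 * start from the highest-order coefficient and repeatedly multiply by x
 * and add the next lower coefficient.  This needs n-1 multiplications
 * and n-1 additions, and no explicit powers of x.
 */
static double
horner(const double *c, size_t n, double x)
{
	double r = 0.0;
	size_t i;

	for (i = n; i > 0; i--)
		r = r * x + c[i - 1];
	return (r);
}
```

Calling horner(example_coeffs, 3, 2.0) evaluates 1 + 2*2 + 3*4. The nested parentheses in the two e_pow.c lines quoted above are the same recurrence written out by hand for a fixed degree.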
ovfl looks like it's used to define an overflow threshold (ie, ovt). -- Steve From owner-freebsd-numerics@FreeBSD.ORG Tue Aug 14 20:06:10 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id CF0D1106564A for ; Tue, 14 Aug 2012 20:06:10 +0000 (UTC) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by mx1.freebsd.org (Postfix) with ESMTP id 7B78A8FC0A for ; Tue, 14 Aug 2012 20:06:10 +0000 (UTC) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q7EK63pQ055707; Tue, 14 Aug 2012 15:06:04 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <502AAFAC.4020906@missouri.edu> Date: Tue, 14 Aug 2012 15:06:04 -0500 From: Stephen Montgomery-Smith User-Agent: Mozilla/5.0 (X11; Linux i686; rv:14.0) Gecko/20120714 Thunderbird/14.0 MIME-Version: 1.0 To: Steve Kargl References: <502A8CCC.5080606@missouri.edu> <20120814175257.GA69865@troutmask.apl.washington.edu> <20120814183518.GA70092@troutmask.apl.washington.edu> <502AA971.4010403@missouri.edu> <20120814195659.GA70571@troutmask.apl.washington.edu> In-Reply-To: <20120814195659.GA70571@troutmask.apl.washington.edu> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-numerics@freebsd.org Subject: Re: Status of expl logl X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 14 Aug 2012 20:06:10 -0000 On 08/14/2012 02:56 PM, Steve Kargl wrote: > On Tue, Aug 14, 2012 at 02:39:29PM -0500, Stephen Montgomery-Smith wrote: >> On 08/14/2012 01:35 PM, Steve Kargl wrote: >>> On Tue, Aug 14, 2012 at 10:52:57AM -0700, Steve Kargl wrote: >>>> On Tue, Aug 14, 2012 at 12:37:16PM -0500, Stephen Montgomery-Smith wrote: >>>>> Are people working on expl, logl and log1pl? >>>>> >>>> >>> >>> I forgot to mention that if you're looking for another >>> function to implement, then AFAIK no one is working on >>> ld80/powl() and ld128/powl(). See the comment in >>> src/e_pow.c for the algorithm used in fdlibm. >>> >> >> So I am looking through src/e_pow.c. >> >> It seems to me that the constants L1, L2, L3, etc, are 3/5, 3/7, 3/9, >> etc, but not exactly these constants. So they must have used some >> process where they jiggled the constants around, perhaps using trial and >> error, to get a few extra ulp. Is that right? >> >> Also, I am trying to see what P1, P2, P3, etc are. They seem to be >> related to the factorial (maybe a power series related to exp(x)), but I >> must admit that I am not getting it. >> >> Is there a more detailed reference to how these numbers were obtained? >> A paper somewhere? >> >> Finally, what is ovfl (in the definition of ovt) meant to be? > > I haven't looked too closely at the details of pow[fl](). I am > not aware of any published paper that gives the details. AFAIK, > the comment in e_pow.c is only detailed description (other than > the code). I tried to find a paper about pow() implementations > on Sunday with a very cursory google search. Came up empty. > > The L and P constants are used in lines 235 and 299, > line 235: r = s2*s2*(L1+s2*(L2+s2*(L3+s2*(L4+s2*(L5+s2*L6))))); > line 299: t1 = z - t*(P1+t*(P2+t*(P3+t*(P4+t*P5)))); > > These are polynomials that are evaluated via Horner's method. 
> I suspect the jiggling that you mention is actually a result > of a Remes minimax procedure. > > ovfl looks like it's used to define an overflow threshold (ie, ovt). > OK. I'll look into trying to reverse engineer the code. But the new semester is starting, and I will have to go back to work. So I may put it off for a long time, or not do it. (That is to say, if someone else wants to do it, they will not be treading on my toes.) From owner-freebsd-numerics@FreeBSD.ORG Tue Aug 14 22:16:03 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id BDAE4106564A for ; Tue, 14 Aug 2012 22:16:03 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail15.syd.optusnet.com.au (mail15.syd.optusnet.com.au [211.29.132.196]) by mx1.freebsd.org (Postfix) with ESMTP id 5453C8FC17 for ; Tue, 14 Aug 2012 22:16:02 +0000 (UTC) Received: from c122-106-171-246.carlnfd1.nsw.optusnet.com.au (c122-106-171-246.carlnfd1.nsw.optusnet.com.au [122.106.171.246]) by mail15.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id q7EMFqoL004566 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Wed, 15 Aug 2012 08:15:54 +1000 Date: Wed, 15 Aug 2012 08:15:52 +1000 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Steve Kargl In-Reply-To: <20120814195659.GA70571@troutmask.apl.washington.edu> Message-ID: <20120815070807.B3431@besplex.bde.org> References: <502A8CCC.5080606@missouri.edu> <20120814175257.GA69865@troutmask.apl.washington.edu> <20120814183518.GA70092@troutmask.apl.washington.edu> <502AA971.4010403@missouri.edu> <20120814195659.GA70571@troutmask.apl.washington.edu> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: Stephen Montgomery-Smith , freebsd-numerics@freebsd.org Subject: Re: Status of expl logl X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high 
quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 14 Aug 2012 22:16:03 -0000 On Tue, 14 Aug 2012, Steve Kargl wrote: > On Tue, Aug 14, 2012 at 02:39:29PM -0500, Stephen Montgomery-Smith wrote: >> On 08/14/2012 01:35 PM, Steve Kargl wrote: >>> On Tue, Aug 14, 2012 at 10:52:57AM -0700, Steve Kargl wrote: >>>> On Tue, Aug 14, 2012 at 12:37:16PM -0500, Stephen Montgomery-Smith wrote: >>>>> Are people working on expl, logl and log1pl? -rw-r--r-- 1 bde wheel 37470 Aug 10 00:08 ld128/s_logl.c -rw-r--r-- 1 bde wheel 32946 Aug 10 00:07 ld80/s_logl.c -rw-r--r-- 1 bde wheel 32056 Aug 10 00:06 s_log.c -rw-r--r-- 1 bde wheel 30053 Aug 10 00:05 s_logf.c These are mostly large comments and larger tables. Each file implements log, log10, log2, log1p for the given precision. Internal accuracy is about 7 extra bits of precision. Speed on core2 with optimal CFLAGS is 30-70 cycles depending on the precision. >> ... >> So I am looking through src/e_pow.c. >> >> It seems to me that the constants L1, L2, L3, etc, are 3/5, 3/7, 3/9, >> etc, but not exactly these constants. So they must have used some >> process where they jiggled the constants around, perhaps using trial and >> error, to get a few extra ulp. Is that right? Hopefully not by trial and error. There is the Remes algorithm. I use a 1000 line (plus infrastructure) pari program to implement a form of this algorithm. In general, for nice functions like cos through not so nice functions like tan near 0, the best possible (minimax) polynomial approximation seems to need about 1 term less than a Taylor approximation, and are only about 1 bit better than a Chebyshev approximation. >> Also, I am trying to see what P1, P2, P3, etc are. They seem to be >> related to the factorial (maybe a power series related to exp(x)), but I >> must admit that I am not getting it. These must be for exp(). Indeed, they are identical with the Pn's in e_exp.c. 
They are essentially Bernoulli numbers. One generating function for Bernoulli numbers with a certain normalization is z/(exp(z) - 1) := sum(n = 0, Inf, B[n]/n!*z^n) and e_exp.c transforms exp(x) to essentially x/(exp(x) - 1). Applying Remes to them makes Pn not quite a nice fraction. >> Is there a more detailed reference to how these numbers were obtained? >> A paper somewhere? e_exp.c mentions Remes. >> Finally, what is ovfl (in the definition of ovt) meant to be? Something to do with exp's overflow threshold I think. The comment about ovt doesn't seem to match the value (I don't see why the value is so small), but the comment is very similar to the one that I wrote for exp's overflow threshold in ld80/s_expl.c. (fdlibm e_exp.c only gives the magic number for the threshold.) > I haven't looked too closely at the details of pow[fl](). I am > not aware of any published paper that gives the details. AFAIK, > the comment in e_pow.c is only detailed description (other than > the code). I tried to find a paper about pow() implementations > on Sunday with a very cursory google search. Came up empty. > > The L and P constants are used in lines 235 and 299, > line 235: r = s2*s2*(L1+s2*(L2+s2*(L3+s2*(L4+s2*(L5+s2*L6))))); > line 299: t1 = z - t*(P1+t*(P2+t*(P3+t*(P4+t*P5)))); > > These are polynomials that are evaluated via Horner's method. > I suspect the jiggling that you mention is actually a result > of a Remes minimax procedure. > > ovfl looks like it's used to define an overflow threshold (ie, ovt). I hardly looked at e_pow.c before. It is apparently half about repeating e_log.c and e_exp.c, to get at their extra internal precision. In old BSD libm (which FreeBSD used briefly before fdlibm was imported), pow() and some other functions got their extra precision from __log__D() and __exp__D(). These functions are still in msun/bsdsrc and are used in tgamma(). 
My logl() and Steve's expl() have similar extra precision internally, and run about 4 times faster than the old BSD functions. Bruce From owner-freebsd-numerics@FreeBSD.ORG Tue Aug 14 23:41:44 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 405831065672 for ; Tue, 14 Aug 2012 23:41:44 +0000 (UTC) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by mx1.freebsd.org (Postfix) with ESMTP id DD0928FC20 for ; Tue, 14 Aug 2012 23:41:43 +0000 (UTC) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q7ENffVT077666; Tue, 14 Aug 2012 18:41:41 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <502AE235.6050702@missouri.edu> Date: Tue, 14 Aug 2012 18:41:41 -0500 From: Stephen Montgomery-Smith User-Agent: Mozilla/5.0 (X11; Linux i686; rv:14.0) Gecko/20120714 Thunderbird/14.0 MIME-Version: 1.0 To: Bruce Evans References: <502A8CCC.5080606@missouri.edu> <20120814175257.GA69865@troutmask.apl.washington.edu> <20120814183518.GA70092@troutmask.apl.washington.edu> <502AA971.4010403@missouri.edu> <20120814195659.GA70571@troutmask.apl.washington.edu> <20120815070807.B3431@besplex.bde.org> In-Reply-To: <20120815070807.B3431@besplex.bde.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-numerics@freebsd.org, Steve Kargl Subject: Re: Status of expl logl X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 14 Aug 2012 23:41:44 -0000 On 08/14/2012 05:15 PM, Bruce Evans wrote: > On Tue, 14 Aug 2012, Steve Kargl wrote: > >> On Tue, Aug 14, 2012 at 02:39:29PM -0500, Stephen Montgomery-Smith wrote: >>> On 08/14/2012 01:35 PM, Steve Kargl wrote: >>>> On Tue, Aug 14, 2012 at 10:52:57AM -0700, Steve Kargl wrote: >>>>> On Tue, Aug 14, 2012 at 12:37:16PM -0500, Stephen Montgomery-Smith >>>>> wrote: >>>>>> Are people working on expl, logl and log1pl? > > -rw-r--r-- 1 bde wheel 37470 Aug 10 00:08 ld128/s_logl.c > -rw-r--r-- 1 bde wheel 32946 Aug 10 00:07 ld80/s_logl.c > -rw-r--r-- 1 bde wheel 32056 Aug 10 00:06 s_log.c > -rw-r--r-- 1 bde wheel 30053 Aug 10 00:05 s_logf.c > > These are mostly large comments and larger tables. Each file implements > log, log10, log2, log1p for the given precision. Internal accuracy is > about 7 extra bits of precision. Speed on core2 with optimal CFLAGS is > 30-70 cycles depending on the precision. > >>> ... >>> So I am looking through src/e_pow.c. >>> >>> It seems to me that the constants L1, L2, L3, etc, are 3/5, 3/7, 3/9, >>> etc, but not exactly these constants. So they must have used some >>> process where they jiggled the constants around, perhaps using trial and >>> error, to get a few extra ulp. Is that right? > > Hopefully not by trial and error. There is the Remes algorithm. Just looked it up. It looks very nice. >>> Also, I am trying to see what P1, P2, P3, etc are. They seem to be >>> related to the factorial (maybe a power series related to exp(x)), but I >>> must admit that I am not getting it. > > These must be for exp(). Indeed, they are identical with the Pn's in > e_exp.c. They are essentially Bernoulli numbers. One generating > function for Bernoulli numbers with a certain normalization is > > z/(exp(z) - 1) := sum(n = 0, Inf, B[n]/n!*z^n) > > and e_exp.c transforms exp(x) to essentially x/(exp(x) - 1). 
Applying > Remes to them makes Pn not quite a nice fraction. Yes. They are the coefficients of 2z/(exp(z)-1). Thank you. From owner-freebsd-numerics@FreeBSD.ORG Tue Aug 14 23:58:46 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 791C4106566B for ; Tue, 14 Aug 2012 23:58:46 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id 062598FC0C for ; Tue, 14 Aug 2012 23:58:45 +0000 (UTC) Received: from server.rulingia.com (c220-239-249-137.belrs5.nsw.optusnet.com.au [220.239.249.137]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id q7ENwcnm012582 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK); Wed, 15 Aug 2012 09:58:38 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id q7ENwW8S034599 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Wed, 15 Aug 2012 09:58:32 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id q7ENwWCC034598; Wed, 15 Aug 2012 09:58:32 +1000 (EST) (envelope-from peter) Date: Wed, 15 Aug 2012 09:58:32 +1000 From: Peter Jeremy To: Bruce Evans Message-ID: <20120814235832.GC33399@server.rulingia.com> References: <502A8CCC.5080606@missouri.edu> <20120814175257.GA69865@troutmask.apl.washington.edu> <20120814183518.GA70092@troutmask.apl.washington.edu> <502AA971.4010403@missouri.edu> <20120814195659.GA70571@troutmask.apl.washington.edu> <20120815070807.B3431@besplex.bde.org> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="k1lZvvs/B4yU6o8G" Content-Disposition: inline In-Reply-To: 
<20120815070807.B3431@besplex.bde.org> X-PGP-Key: http://www.rulingia.com/keys/peter.pgp User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-numerics@freebsd.org Subject: Re: Status of expl logl X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 14 Aug 2012 23:58:46 -0000 --k1lZvvs/B4yU6o8G Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On 2012-Aug-15 08:15:52 +1000, Bruce Evans wrote: >I hardly looked at e_pow.c before. It is apparently half about repeating >e_log.c and e_exp.c, to get at their extra internal precision. I expect cpow() will similarly have to copy slabs of code from e_pow.c to avoid losing precision or domain. Is it worth pulling some of this "common" code out so that it can be shared amongst all the different functions that need it? 
--=20 Peter Jeremy --k1lZvvs/B4yU6o8G Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (FreeBSD) iEYEARECAAYFAlAq5igACgkQ/opHv/APuId46wCghPMSIsmybG9yPbkBaXib2iBW O6AAn1pN52NokWGks8qs9tG6/2/csbiN =2gzD -----END PGP SIGNATURE----- --k1lZvvs/B4yU6o8G-- From owner-freebsd-numerics@FreeBSD.ORG Wed Aug 15 03:31:37 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 1897E106564A for ; Wed, 15 Aug 2012 03:31:37 +0000 (UTC) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by mx1.freebsd.org (Postfix) with ESMTP id C9D068FC08 for ; Wed, 15 Aug 2012 03:31:36 +0000 (UTC) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q7F3VZ5D094416 for ; Tue, 14 Aug 2012 22:31:35 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <502B1817.5070401@missouri.edu> Date: Tue, 14 Aug 2012 22:31:35 -0500 From: Stephen Montgomery-Smith User-Agent: Mozilla/5.0 (X11; Linux i686; rv:14.0) Gecko/20120714 Thunderbird/14.0 MIME-Version: 1.0 To: freebsd-numerics@freebsd.org References: <5017111E.6060003@missouri.edu> <501C361D.4010807@missouri.edu> <20120804165555.X1231@besplex.bde.org> <501D51D7.1020101@missouri.edu> <20120805030609.R3101@besplex.bde.org> <501D9C36.2040207@missouri.edu> <20120805175106.X3574@besplex.bde.org> <501EC015.3000808@missouri.edu> <20120805191954.GA50379@troutmask.apl.washington.edu> <20120807205725.GA10572@server.rulingia.com> <20120809025220.N4114@besplex.bde.org> <5027F07E.9060409@missouri.edu> <20120814003614.H3692@besplex.bde.org> <50295887.2010608@missouri.edu> <20120814055931.Q4897@besplex.bde.org> <50297468.20902@missouri.edu> <20120814173931.V934@besplex.bde.org> <502A820C.6060804@missouri.edu> 
<502A8494.2050707@missouri.edu> <502A9B99.7090309@missouri.edu> In-Reply-To: <502A9B99.7090309@missouri.edu> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: Complex arg-trig functions X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 15 Aug 2012 03:31:37 -0000 I was looking through the code e_acosh.c, and it made me realize I could get a small fraction more ULP in catrig.c by making the replacements: 216c216 < *rx = log1p(Am1 + sqrt(Am1*(A+1))); --- > *rx = log1p(Am1 + sqrt(2*Am1 + Am1*Am1)); 282c282 < *sqrt_A2my2 = sqrt(Amy*(A+y)); --- > *sqrt_A2my2 = sqrt(2*y*Amy + Amy*Amy); I'm not quite sure if the second replacement makes much difference, but the first replacement seemed quite effective. From owner-freebsd-numerics@FreeBSD.ORG Wed Aug 15 13:35:55 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id EAD021065674 for ; Wed, 15 Aug 2012 13:35:54 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail14.syd.optusnet.com.au (mail14.syd.optusnet.com.au [211.29.132.195]) by mx1.freebsd.org (Postfix) with ESMTP id 2E1AF8FC17 for ; Wed, 15 Aug 2012 13:35:53 +0000 (UTC) Received: from c122-106-171-246.carlnfd1.nsw.optusnet.com.au (c122-106-171-246.carlnfd1.nsw.optusnet.com.au [122.106.171.246]) by mail14.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id q7FDZidA007438 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Wed, 15 Aug 2012 23:35:46 +1000 Date: Wed, 15 Aug 2012 23:35:44 +1000 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Stephen Montgomery-Smith In-Reply-To: <502A780B.2010106@missouri.edu> Message-ID: 
<20120815223631.N1751@besplex.bde.org> References: <5017111E.6060003@missouri.edu> <501C361D.4010807@missouri.edu> <20120804165555.X1231@besplex.bde.org> <501D51D7.1020101@missouri.edu> <20120805030609.R3101@besplex.bde.org> <501D9C36.2040207@missouri.edu> <20120805175106.X3574@besplex.bde.org> <501EC015.3000808@missouri.edu> <20120805191954.GA50379@troutmask.apl.washington.edu> <20120807205725.GA10572@server.rulingia.com> <20120809025220.N4114@besplex.bde.org> <5027F07E.9060409@missouri.edu> <20120814003614.H3692@besplex.bde.org> <50295F5C.6010800@missouri.edu> <20120814072946.S5260@besplex.bde.org> <50297CA5.5010900@missouri.edu> <50297E43.7090309@missouri.edu> <20120814201105.T934@besplex.bde.org> <502A780B.2010106@missouri.edu> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: freebsd-numerics@freebsd.org, Bruce Evans Subject: Re: Complex arg-trig functions X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 15 Aug 2012 13:35:55 -0000 On Tue, 14 Aug 2012, Stephen Montgomery-Smith wrote: > On 08/14/2012 05:46 AM, Bruce Evans wrote: >> On Mon, 13 Aug 2012, Stephen Montgomery-Smith wrote: >> >>> if (sqrt_huge+x>one && sqrt_huge+y>one) >> >> x and y can be DBL_MAX, giving overflow. > > Why? When x is DBL_MAX, sqrt_huge is so very much smaller than DBL_MAX that > DBL_MAX+sqrt_huge should be DBL_MAX within floating point precision. So no > overflow. Oh, I see now. The only problem is if someone sets the rounding mode to FE_UPWARD, but we depend on it not being changed from the default of FE_TONEAREST in many places. > It seemed to me that there is a logic behind why the the infs and nans > produce the results they do. I noticed that do_the_hard_work() already got > the answers correct for the real part *rx. 
Getting the imaginary part to > work as well seemed to me to be the cleanest way to make it work. (I added > all the nan and inf checking after writing the rest of the code.) An up-front check may still be simpler, and gives more control. In csqrt*(), I needed an explicit check and special expressions to get uniform behaviour. I added this to the NaN mixing in catan[h]*(), and now all my tests pass: % diff -c2 catrig.c~ catrig.c % *** catrig.c~ Sun Aug 12 17:29:18 2012 % --- catrig.c Wed Aug 15 11:57:02 2012 % *************** % *** 605,609 **** % */ % if (ISNAN(x) || ISNAN(y)) % ! return (cpack(x+y, x+y)); % % /* (x,inf) and (inf,y) and (inf,inf) -> (0,PI/2) */ % --- 609,613 ---- % */ % if (ISNAN(x) || ISNAN(y)) % ! return (cpack((x+0.0L)+(y+0), (x+0.0L)+(y+0))); % % /* (x,inf) and (inf,y) and (inf,inf) -> (0,PI/2) */ Use this expression in all precisions. I forgot to comment it. Adding 0 quietens signaling NaNs before mixing NaNs. I should have tried y+y. Adding 0.0L promotes part of the expression to long double together with quietening signaling NaNs. The rest of the expression is promoted to match. I should try the old way again: of (long double)x+x. % diff -c2 catrigf.c~ catrigf.c % *** catrigf.c~ Sun Aug 12 17:00:52 2012 % --- catrigf.c Wed Aug 15 11:57:08 2012 % *************** % *** 349,353 **** % % if (isnan(x) || isnan(y)) % ! return (cpackf(x+y, x+y)); % % if (isinf(x) || isinf(y)) % --- 351,355 ---- % % if (isnan(x) || isnan(y)) % ! return (cpack((x+0.0L)+(y+0), (x+0.0L)+(y+0))); % % if (isinf(x) || isinf(y)) % diff -c2 catrigl.c~ catrigl.c % *** catrigl.c~ Sun Aug 12 06:54:46 2012 % --- catrigl.c Wed Aug 15 11:58:46 2012 % *************** % *** 323,327 **** % % if (isnan(x) || isnan(y)) % ! return (cpackl(x+y, x+y)); % % if (isinf(x) || isinf(y)) % --- 325,329 ---- % % if (isnan(x) || isnan(y)) % ! 
return (cpack((x+0.0L)+(y+0), (x+0.0L)+(y+0)));
%
% if (isinf(x) || isinf(y))
% Index: ../s_csqrt.c
% ===================================================================
% RCS file: /home/ncvs/src/lib/msun/src/s_csqrt.c,v
% retrieving revision 1.4
% diff -u -2 -r1.4 s_csqrt.c
% --- ../s_csqrt.c 8 Aug 2008 00:15:16 -0000 1.4
% +++ ../s_csqrt.c 14 Aug 2012 20:34:07 -0000
% @@ -34,14 +34,5 @@
% #include "math_private.h"
%
% -/*
% - * gcc doesn't implement complex multiplication or division correctly,
% - * so we need to handle infinities specially. We turn on this pragma to
% - * notify conforming c99 compilers that the fast-but-incorrect code that
% - * gcc generates is acceptable, since the special cases have already been
% - * handled.
% - */
% -#pragma STDC CX_LIMITED_RANGE ON

Remove this. There was only 1 complex expression, and it depended on the negation of this pragma to work. Since gcc doesn't support this pragma, the expression only worked accidentally when it was optimized.

% -
% -/* We risk spurious overflow for components >= DBL_MAX / (1 + sqrt(2)). */
% +/* For avoiding overflow for components >= DBL_MAX / (1 + sqrt(2)). */
% #define THRESH 0x1.a827999fcef32p+1022

We only risked this threshold being wrong.

%
% @@ -50,7 +41,5 @@
% {
% double complex result;
% - double a, b;
% - double t;
% - int scale;
% + double a, b, rx, ry, scale, t;
%
% a = creal(z);

`scale' is now a scale factor instead of a flag. New variables to fix the complex expression. Fix style bugs.

% @@ -64,5 +53,5 @@
% if (isnan(a)) {
% t = (b - b) / (b - b); /* raise invalid if b is not a NaN */
% - return (cpack(a, t)); /* return NaN + NaN i */
% + return (cpack(a + a, a + 0.0L + t)); /* return NaN + NaN i */
% }
% if (isinf(a)) {

Old fix for not quietening a. Also, mix the NaNs (if there are 2) in the imaginary part.
This depends on the hardware doing the right thing when t is the default NaN, and it does in all cases tested: when b is NaN, we want t to be b quietened and the imaginary part to be the mix of a quietened and b quietened; when b is not NaN, we want both parts to be a quietened. We don't want t to have an effect if it is just the default NaN. The real part should mix the NaNs too, but I didn't do it right and got some inconsistencies.

% @@ -78,16 +67,25 @@
% return (cpack(a, copysign(b - b, b)));
% }
% - /*
% - * The remaining special case (b is NaN) is handled just fine by
% - * the normal code path below.
% - */
% + if (isnan(b)) {
% + t = (a - a) / (a - a); /* raise invalid */
% + return (cpack(b + b, b + 0.0L + t)); /* return NaN + NaN i */
% + }

It was easier to add a special case than to fall through. Again t is only needed for the side effects of its expression. It would be better to assign it to a volatile variable and return b + b for both parts.

%
% /* Scale to avoid overflow. */
% if (fabs(a) >= THRESH || fabs(b) >= THRESH) {
% - a *= 0.25;
% - b *= 0.25;
% - scale = 1;
% + if (fabs(a) >= 0x1p-1021)
% + a *= 0.25;
% + if (fabs(b) >= 0x1p-1021)
% + b *= 0.25;

While testing the NaNs, I noticed several other bugs. This fixes spurious underflow when one of a or b is tiny. There are too many scattered fabs()'s. It would be better to take fabs()'s up front, or determine exponents and use exponents in the threshold tests.

% + scale = 2;
% } else {
% - scale = 0;
% + scale = 1;
% + }
% +
% + /* Scale to reduce inaccuracies when both components are denormal. */
% + if (fabs(a) <= 0x1p-1023 && fabs(b) <= 0x1p-1023) {
% + a *= 0x1p54;
% + b *= 0x1p54;
% + scale = 0x1p-27;
% }
%

This is like a fix in clog(). hypot() handles denormals OK, but necessarily loses accuracy when it returns a denormal result, so the expression (a + hypot(a, b)) is more inaccurate than necessary.
With scaling, it is accurate to about 1 ulp at first, and then inverse scaling makes it accurate to 1 denormal ulp in even more cases. For example, let a = 0 and b = smallest denormal. Then csqrt(z) should be sqrt(2)/2*(1+I)* rounded = (1+I)*, but t = sqrt(a + hypot(a, b)) * 0.5 is * 0.5 rounded = 0, since the sqrt() has to round down to and then the multiplication has to round down again. Here (a + hypot(a, b)) is exact but the sqrt() and the multiplication aren't. The error without this fix was about 34 ulps on values near 2**36 times the smallest denormal. It was many gulps on smaller values.

% @@ -95,15 +93,13 @@
% if (a >= 0) {
% t = sqrt((a + hypot(a, b)) * 0.5);
% - result = cpack(t, b / (2 * t));
% + rx = t;
% + ry = b / (2 * t);
% } else {
% t = sqrt((-a + hypot(a, b)) * 0.5);
% - result = cpack(fabs(b) / (2 * t), copysign(t, b));
% + rx = fabs(b) / (2 * t);
% + ry = copysign(t, b);
% }
%
% - /* Rescale. */
% - if (scale)
% - return (result * 2);
% - else
% - return (result);
% + return (cpack(rx * scale, ry * scale));
% }
%

Multiplication of the complex value result by either the fixed scale 2 or the variable scale 'scale' should probably clobber the sign of 0 in many cases, since it should probably be equivalent to multiplication by (iscale + I * 0). This is like the sign clobbering for adding ln2. This is not wanted, so we shouldn't do a complex multiplication. When the scale factor was 2, gcc optimized the multiplication to an addition, even at -O0 IIRC, so the sign was accidentally 0. Otherwise, -O always optimized to avoid clobbering. But with the variable scale and -O0, a full complex multiplication was done.
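
The sign clobbering described above can be sketched without relying on any particular compiler's complex multiply. The struct cparts and both function names below are hypothetical and are not part of the patch. scale_as_complex() spells out the textbook product (x + I*y) * (s + I*0), which is an assumption about what an unoptimized complex multiplication amounts to, and scale_by_parts() does what the patched code does with rx * scale and ry * scale.

```c
#include <math.h>

/* Hypothetical pair type, used so that no compiler-generated complex
 * multiplication is involved in the comparison. */
struct cparts {
	double re;
	double im;
};

/*
 * Textbook complex multiplication of (x + I*y) by (s + I*0).
 * The term x * 0.0 is +0 for positive x, and (+0) + (-0) is +0 in the
 * default rounding mode, so a negative zero in y is lost.
 */
static struct cparts
scale_as_complex(double x, double y, double s)
{
	struct cparts r;

	r.re = x * s - y * 0.0;
	r.im = x * 0.0 + y * s;
	return (r);
}

/* Scale each component separately; the sign of a zero survives. */
static struct cparts
scale_by_parts(double x, double y, double s)
{
	struct cparts r;

	r.re = x * s;
	r.im = y * s;
	return (r);
}
```

With x = 1.0, y = -0.0 and s = 2.0, signbit() of the imaginary part is set for scale_by_parts() and clear for scale_as_complex(), while the real part is 2.0 either way.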
% Index: ../s_csqrtf.c % =================================================================== % RCS file: /home/ncvs/src/lib/msun/src/s_csqrtf.c,v % retrieving revision 1.3 % diff -u -2 -r1.3 s_csqrtf.c % --- ../s_csqrtf.c 8 Aug 2008 00:15:16 -0000 1.3 % +++ ../s_csqrtf.c 14 Aug 2012 20:34:21 -0000 % @@ -33,18 +33,12 @@ % #include "math_private.h" % % -/* % - * gcc doesn't implement complex multiplication or division correctly, % - * so we need to handle infinities specially. We turn on this pragma to % - * notify conforming c99 compilers that the fast-but-incorrect code that % - * gcc generates is acceptable, since the special cases have already been % - * handled. % - */ % -#pragma STDC CX_LIMITED_RANGE ON % - This didn't use complex arithmetic before. % float complex % csqrtf(float complex z) % { % - float a = crealf(z), b = cimagf(z); % double t; % + float a, b; % + % + a = creal(z); % + b = cimag(z); % % /* Handle special cases. */ More style fixes. % @@ -55,5 +49,5 @@ % if (isnan(a)) { % t = (b - b) / (b - b); /* raise invalid if b is not a NaN */ % - return (cpackf(a, t)); /* return NaN + NaN i */ % + return (cpackf(a + a, a + 0.0L + t)); /* return NaN + NaN i */ % } % if (isinf(a)) { % @@ -69,8 +63,8 @@ % return (cpackf(a, copysignf(b - b, b))); % } % - /* % - * The remaining special case (b is NaN) is handled just fine by % - * the normal code path below. % - */ % + if (isnan(b)) { % + t = (a - a) / (a - a); /* raise invalid */ % + return (cpack(b + b, b + 0.0L + t)); /* return NaN + NaN i */ % + } % % /* NaN fixes. % @@ -81,8 +75,8 @@ % if (a >= 0) { % t = sqrt((a + hypot(a, b)) * 0.5); % - return (cpackf(t, b / (2.0 * t))); % + return (cpackf(t, b / (2 * t))); % } else { % t = sqrt((-a + hypot(a, b)) * 0.5); % - return (cpackf(fabsf(b) / (2.0 * t), copysignf(t, b))); % + return (cpackf(fabsf(b) / (2 * t), copysignf(t, b))); % } % } Style fixes. Overflow and accuracy fixes are not needed, since we use extra precision. 
% Index: ../s_csqrtl.c % =================================================================== % RCS file: /home/ncvs/src/lib/msun/src/s_csqrtl.c,v % retrieving revision 1.2 % diff -u -2 -r1.2 s_csqrtl.c % --- ../s_csqrtl.c 8 Aug 2008 00:15:16 -0000 1.2 % +++ ../s_csqrtl.c 15 Aug 2012 09:04:11 -0000 % @@ -34,14 +34,5 @@ % #include "math_private.h" % % -/* % - * gcc doesn't implement complex multiplication or division correctly, % - * so we need to handle infinities specially. We turn on this pragma to % - * notify conforming c99 compilers that the fast-but-incorrect code that % - * gcc generates is acceptable, since the special cases have already been % - * handled. % - */ % -#pragma STDC CX_LIMITED_RANGE ON % - % -/* We risk spurious overflow for components >= LDBL_MAX / (1 + sqrt(2)). */ % +/* For avoiding spurious overflow for components >= LDBL_MAX / (1 + sqrt(2)). */ % #define THRESH (LDBL_MAX / 2.414213562373095048801688724209698L) % % @@ -50,7 +41,5 @@ % { % long double complex result; % - long double a, b; % - long double t; % - int scale; % + long double a, b, rx, ry, scale, t; % % a = creall(z); % @@ -64,5 +53,5 @@ % if (isnan(a)) { % t = (b - b) / (b - b); /* raise invalid if b is not a NaN */ % - return (cpackl(a, t)); /* return NaN + NaN i */ % + return (cpackl(a + a, a + 0.0L + t)); /* return NaN + NaN i */ % } % if (isinf(a)) { % @@ -78,16 +67,25 @@ % return (cpackl(a, copysignl(b - b, b))); % } % - /* % - * The remaining special case (b is NaN) is handled just fine by % - * the normal code path below. % - */ % + if (isnan(b)) { % + t = (a - a) / (a - a); /* raise invalid */ % + return (cpack(b + b, b + 0.0L + t)); /* return NaN + NaN i */ % + } % % /* Scale to avoid overflow. 
*/ % if (fabsl(a) >= THRESH || fabsl(b) >= THRESH) { % - a *= 0.25; % - b *= 0.25; % - scale = 1; % + if (fabsl(a) >= 0x1p-16381L) % + a *= 0.25; % + if (fabsl(b) >= 0x1p-16381L) % + b *= 0.25; % + scale = 2; % } else { % - scale = 0; % + scale = 1; % + } % + % + /* Scale to reduce inaccuracies when both components are denormal. */ % + if (fabsl(a) <= 0x1p-16383L && fabsl(b) <= 0x1p-16383L) { % + a *= 0x1p64; % + b *= 0x1p64; % + scale = 0x1p-32; % } % % @@ -95,14 +93,12 @@ % if (a >= 0) { % t = sqrtl((a + hypotl(a, b)) * 0.5); % - result = cpackl(t, b / (2 * t)); % + rx = t; % + ry = b / (2 * t); % } else { % t = sqrtl((-a + hypotl(a, b)) * 0.5); % - result = cpackl(fabsl(b) / (2 * t), copysignl(t, b)); % + rx = fabs(b) / (2 * t); % + ry = copysign(t, b); % } % % - /* Rescale. */ % - if (scale) % - return (result * 2); % - else % - return (result); % + return (cpack(rx * scale, ry * scale)); % } Same changes for long doubles as for doubles, except for magic number. Bruce From owner-freebsd-numerics@FreeBSD.ORG Wed Aug 15 15:20:00 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 1CD791065675 for ; Wed, 15 Aug 2012 15:19:59 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail04.syd.optusnet.com.au (mail04.syd.optusnet.com.au [211.29.132.185]) by mx1.freebsd.org (Postfix) with ESMTP id 3DEBD8FC18 for ; Wed, 15 Aug 2012 15:19:58 +0000 (UTC) Received: from c122-106-171-246.carlnfd1.nsw.optusnet.com.au (c122-106-171-246.carlnfd1.nsw.optusnet.com.au [122.106.171.246]) by mail04.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id q7FFJoS5021129 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Thu, 16 Aug 2012 01:19:51 +1000 Date: Thu, 16 Aug 2012 01:19:50 +1000 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Stephen Montgomery-Smith In-Reply-To: <502A820C.6060804@missouri.edu> Message-ID: 
<20120816010912.Q1751@besplex.bde.org> References: <5017111E.6060003@missouri.edu> <501C361D.4010807@missouri.edu> <20120804165555.X1231@besplex.bde.org> <501D51D7.1020101@missouri.edu> <20120805030609.R3101@besplex.bde.org> <501D9C36.2040207@missouri.edu> <20120805175106.X3574@besplex.bde.org> <501EC015.3000808@missouri.edu> <20120805191954.GA50379@troutmask.apl.washington.edu> <20120807205725.GA10572@server.rulingia.com> <20120809025220.N4114@besplex.bde.org> <5027F07E.9060409@missouri.edu> <20120814003614.H3692@besplex.bde.org> <50295887.2010608@missouri.edu> <20120814055931.Q4897@besplex.bde.org> <50297468.20902@missouri.edu> <20120814173931.V934@besplex.bde.org> <502A820C.6060804@missouri.edu> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: freebsd-numerics@freebsd.org Subject: Re: Complex arg-trig functions X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 15 Aug 2012 15:20:00 -0000 On Tue, 14 Aug 2012, Stephen Montgomery-Smith wrote: > On 08/14/2012 05:09 AM, Bruce Evans wrote: > > When x is incredibly large (close to DBL_MAX), a mathematician would consider > x to represent all the numbers between x-x*DBL_EPSILON to x+x*DBL_EPSILON > (approximately), or more precisely, all the numbers that are within 0.5 ULP > of x. > > So as a Mathematician I would prefer to think of sin(close_to_DBL_MAX) as > undefined. (Although as a programmer, I would hate it if it spat out NaN - I > would prefer the meaningless answer.) Isn't it both as a programmer? Mathematicians don't stop at DBL_MAX, but go to real infinity and then large cardinals :-). Old libraries had a TLOSS error for this. Even fdlibm had this. We seem to have lost a bit by removing this together with historical cruft. 
Grep shows the following lines matching TLOSS in fdlibm-5.3:

% fdlibm.h: * set X_TLOSS = pi*2**52, which is possibly defined in
% fdlibm.h:#define X_TLOSS 1.41484755040568800000e+16
% fdlibm.h:#define TLOSS 5
% k_standard.c: * 34-- j0(|x|>X_TLOSS)
% k_standard.c: * 35-- y0(x>X_TLOSS)
% k_standard.c: * 36-- j1(|x|>X_TLOSS)
% k_standard.c: * 37-- y1(x>X_TLOSS)
% k_standard.c: * 38-- jn(|x|>X_TLOSS, n)
% k_standard.c: * 39-- yn(x>X_TLOSS, n)
% k_standard.c: /* j0(|x|>X_TLOSS) */
% k_standard.c: exc.type = TLOSS;
% k_standard.c: (void) WRITE2(": TLOSS error\n", 14);
% ...

Though sin() doesn't lose precision at large or not so large zeros, Bessel functions probably do. I think no one knows where many of their zeros near DBL_MAX are, since they are not evenly spaced like multiples of pi are, so every one needs an individual calculation. pari becomes incredibly slow at just finding them after the first few hundred, at least using the following perhaps too simple script:

for (i = 0, 999, print(solve(x=i*Pi, (i+1)*Pi, besselj(0,x))))

This gets slower and slower as i increases and takes about 1 second per zero starting at i = 900. But after pari takes a few minutes to generate 1000 zeros, FreeBSD libm j0 and j0f pass checks of them all in 1.5 msec. The check is that there is a sign change on each side of the supposed zero. So FreeBSD j0 and j0f get at least 1 bit right (the sign bit) for the first 1000 zeros.

> I see code like "if (y!=y) return (y+y)". Does "y+y" quieten the NaNs as
> well as "y+0.0"? Is my code compliant in this regard?

Yes, y+y works fine and is probably most efficient if there is only 1 NaN to return, since it doesn't require an extra instruction to load 0.

>> I only try re-using
>> variables near the start of a function (maybe x = creal(z); use(x);
>> x = fabs(x)), to try to stop the compiler making so many copies of x.
>> This sometimes works. (The typical pessimization avoided by this is
>> when x is passed on the stack.
gcc likes to copy it to another place >> on the stack, using pessimal methods, and then never use the original >> copy. This is good for debugging, but otherwise not very good.). > > What do you mean by "pessimization"? Too much copying of data, without understanding of the memory hierarchy. On some arches, there is a large penalty for loads that don't match stores exactly. gcc-4.2 doesn't understand this, and sometimes enlarges the problem by unnecessary copying that creates mismatches where there were none in the original copies. Bruce From owner-freebsd-numerics@FreeBSD.ORG Wed Aug 15 15:39:58 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id C0FEC1065677 for ; Wed, 15 Aug 2012 15:39:58 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail01.syd.optusnet.com.au (mail01.syd.optusnet.com.au [211.29.132.182]) by mx1.freebsd.org (Postfix) with ESMTP id 403C28FC12 for ; Wed, 15 Aug 2012 15:39:57 +0000 (UTC) Received: from c122-106-171-246.carlnfd1.nsw.optusnet.com.au (c122-106-171-246.carlnfd1.nsw.optusnet.com.au [122.106.171.246]) by mail01.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id q7FFdmxP023395 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Thu, 16 Aug 2012 01:39:50 +1000 Date: Thu, 16 Aug 2012 01:39:48 +1000 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Peter Jeremy In-Reply-To: <20120814235832.GC33399@server.rulingia.com> Message-ID: <20120816012258.U2233@besplex.bde.org> References: <502A8CCC.5080606@missouri.edu> <20120814175257.GA69865@troutmask.apl.washington.edu> <20120814183518.GA70092@troutmask.apl.washington.edu> <502AA971.4010403@missouri.edu> <20120814195659.GA70571@troutmask.apl.washington.edu> <20120815070807.B3431@besplex.bde.org> <20120814235832.GC33399@server.rulingia.com> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: 
freebsd-numerics@freebsd.org Subject: Re: Status of expl logl X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 15 Aug 2012 15:39:58 -0000 On Wed, 15 Aug 2012, Peter Jeremy wrote: > On 2012-Aug-15 08:15:52 +1000, Bruce Evans wrote: >> I hardly looked at e_pow.c before. It is apparently half about repeating >> e_log.c and e_exp.c, to get at their extra internal precision. > > I expect cpow() will similarly have to copy slabs of code from e_pow.c > to avoid losing precision or domain. Is it worth pulling some of this > "common" code out so that it can be shared amongst all the different > functions that need it? We already have kernels for exp and log, but it is difficult to find the right interfaces and internals, and these don't have them. There are also __exp__D() and __log__D(). My s_log*.c have yet another interface. With heavyweight inlining and optimization, it works OK to return internals in a big struct, with flags saying what was returned. However, the compiler never seems to do quite as well as manual inlining. 
Bruce From owner-freebsd-numerics@FreeBSD.ORG Wed Aug 15 15:58:23 2012 Return-Path: Delivered-To: freebsd-numerics@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 1F35F1065672 for ; Wed, 15 Aug 2012 15:58:23 +0000 (UTC) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by mx1.freebsd.org (Postfix) with ESMTP id D0D8C8FC0C for ; Wed, 15 Aug 2012 15:58:22 +0000 (UTC) Received: from [128.206.184.213] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q7FFwLbZ042742; Wed, 15 Aug 2012 10:58:21 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <502BC71D.4080401@missouri.edu> Date: Wed, 15 Aug 2012 10:58:21 -0500 From: Stephen Montgomery-Smith User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:14.0) Gecko/20120728 Thunderbird/14.0 MIME-Version: 1.0 To: Bruce Evans , freebsd-numerics@FreeBSD.org References: <5017111E.6060003@missouri.edu> <501C361D.4010807@missouri.edu> <20120804165555.X1231@besplex.bde.org> <501D51D7.1020101@missouri.edu> <20120805030609.R3101@besplex.bde.org> <501D9C36.2040207@missouri.edu> <20120805175106.X3574@besplex.bde.org> <501EC015.3000808@missouri.edu> <20120805191954.GA50379@troutmask.apl.washington.edu> <20120807205725.GA10572@server.rulingia.com> <20120809025220.N4114@besplex.bde.org> <5027F07E.9060409@missouri.edu> <20120814003614.H3692@besplex.bde.org> <50295887.2010608@missouri.edu> <20120814055931.Q4897@besplex.bde.org> <50297468.20902@missouri.edu> <20120814173931.V934@besplex.bde.org> <502A820C.6060804@missouri.edu> <20120816010912.Q1751@besplex.bde.org> In-Reply-To: <20120816010912.Q1751@besplex.bde.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: Subject: Re: Complex arg-trig functions X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 
Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 15 Aug 2012 15:58:23 -0000 On 08/15/12 10:19, Bruce Evans wrote: > Though sin() doesn't lose precision at large or not so large zeros, Bessel > functions probably do. I think no one knows where many of their zeros > near DBL_MAX are, since they are not evenly spaced like multiples of pi > are, so every one needs an individual calculation. pari become incredibly > slow at just finding them after the first few hundred, at least using the > following perhaps too simple script: > > for (i = 0, 999, print(solve(x=i*Pi, (i+1)*Pi, besselj(0,x)))) > > This gets slower and slower as i increases and takes about 1 second > per zero starting at i = 900. But after pari takes a few minutes to > generate 1000 zeros, FreeBSD libm j0 and j0f pass checks of them all > in 1.5 msec. The check is that there is a sign change on each side > of the supposed zero. So FreeBSD j0 and j0f get at least 1 bit right > (the sign bit) for the first 1000 zeros. There are asymptotic formulas for the Bessel Functions for large x. So surely one could get very accurate formulas for the large roots. My guess is that to machine precision, the roots are evenly spaced (pi apart) for roots larger than 1/EPSILON (since I am guessing that the error in the asymptotics is about 1/x^(3/2)). http://en.wikipedia.org/wiki/Bessel_function#Asymptotic_forms Mathematica finds the roots very fast: N[BesselJZero[1/10, 10000], 50] 31415.141141713507985336657384821650558701674510788 I did it for the Bessel function of order 1/10, just to make sure that it wasn't storing huge tables. I know Mathematica is produced by an evil corporation, but I am constantly impressed by how well it performs. 
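As a rough illustration of the asymptotic idea (a sketch only, not something from libm or from this thread), the leading terms of McMahon's expansion for the zeros of J0 can be coded directly. The function name j0_zero_approx() is made up for this example, and stopping after three terms is an arbitrary choice for illustration.

```c
/*
 * Hypothetical helper: approximate the k-th positive zero of J0 (k >= 1)
 * using the first three terms of McMahon's asymptotic expansion,
 *
 *	beta + 1/(8*beta) - 31/(384*beta^3),	beta = (k - 1/4)*pi.
 *
 * The neglected terms shrink like 1/beta^5, so the result is only
 * rough for small k and improves quickly as k grows.
 */
static double
j0_zero_approx(int k)
{
	static const double pi = 3.14159265358979323846;
	double beta;

	beta = (k - 0.25) * pi;
	return (beta + 1 / (8 * beta) - 31 / (384 * beta * beta * beta));
}
```

Successive values of this approximation differ by pi minus a correction of order 1/k^2, which is consistent with the guess that the large roots are spaced almost exactly pi apart. A candidate zero from it could be checked with the same sign-change test on j0() described above.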
Hey - I just performed this experiment: Table[N[BesselJZero[0/10, n] - BesselJZero[0/10, n - 1], 50], {n, 100000, 100100}] The asymptotics are much better than I supposed, since these are Pi to about 18 decimal digits. From owner-freebsd-numerics@FreeBSD.ORG Wed Aug 15 16:45:36 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 53E2A106566C for ; Wed, 15 Aug 2012 16:45:36 +0000 (UTC) (envelope-from sgk@troutmask.apl.washington.edu) Received: from troutmask.apl.washington.edu (troutmask.apl.washington.edu [128.95.76.21]) by mx1.freebsd.org (Postfix) with ESMTP id 2B8C58FC08 for ; Wed, 15 Aug 2012 16:45:36 +0000 (UTC) Received: from troutmask.apl.washington.edu (localhost.apl.washington.edu [127.0.0.1]) by troutmask.apl.washington.edu (8.14.5/8.14.5) with ESMTP id q7FGjUf9075935; Wed, 15 Aug 2012 09:45:30 -0700 (PDT) (envelope-from sgk@troutmask.apl.washington.edu) Received: (from sgk@localhost) by troutmask.apl.washington.edu (8.14.5/8.14.5/Submit) id q7FGjT6x075934; Wed, 15 Aug 2012 09:45:29 -0700 (PDT) (envelope-from sgk) Date: Wed, 15 Aug 2012 09:45:29 -0700 From: Steve Kargl To: Stephen Montgomery-Smith Message-ID: <20120815164529.GA75763@troutmask.apl.washington.edu> References: <20120809025220.N4114@besplex.bde.org> <5027F07E.9060409@missouri.edu> <20120814003614.H3692@besplex.bde.org> <50295887.2010608@missouri.edu> <20120814055931.Q4897@besplex.bde.org> <50297468.20902@missouri.edu> <20120814173931.V934@besplex.bde.org> <502A820C.6060804@missouri.edu> <20120816010912.Q1751@besplex.bde.org> <502BC71D.4080401@missouri.edu> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <502BC71D.4080401@missouri.edu> User-Agent: Mutt/1.4.2.3i Cc: freebsd-numerics@freebsd.org, Bruce Evans Subject: Re: Complex arg-trig functions X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: 
list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 15 Aug 2012 16:45:36 -0000 On Wed, Aug 15, 2012 at 10:58:21AM -0500, Stephen Montgomery-Smith wrote: > On 08/15/12 10:19, Bruce Evans wrote: > > >Though sin() doesn't lose precision at large or not so large zeros, Bessel > >functions probably do. I think no one knows where many of their zeros > >near DBL_MAX are, since they are not evenly spaced like multiples of pi > >are, so every one needs an individual calculation. pari become incredibly > >slow at just finding them after the first few hundred, at least using the > >following perhaps too simple script: > > > > for (i = 0, 999, print(solve(x=i*Pi, (i+1)*Pi, besselj(0,x)))) > > > > > > >This gets slower and slower as i increases and takes about 1 second > >per zero starting at i = 900. But after pari takes a few minutes to > >generate 1000 zeros, FreeBSD libm j0 and j0f pass checks of them all > >in 1.5 msec. The check is that there is a sign change on each side > >of the supposed zero. So FreeBSD j0 and j0f get at least 1 bit right > >(the sign bit) for the first 1000 zeros. > > There are asymptotic formulas for the Bessel Functions for large x. So > surely one could get very accurate formulas for the large roots. 
Well, there is http://dlmf.nist.gov/10.21 and in particular, http://dlmf.nist.gov/10.21#vi -- Steve From owner-freebsd-numerics@FreeBSD.ORG Wed Aug 15 17:13:32 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D048D1065672 for ; Wed, 15 Aug 2012 17:13:31 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail01.syd.optusnet.com.au (mail01.syd.optusnet.com.au [211.29.132.182]) by mx1.freebsd.org (Postfix) with ESMTP id 605C58FC08 for ; Wed, 15 Aug 2012 17:13:31 +0000 (UTC) Received: from c122-106-171-246.carlnfd1.nsw.optusnet.com.au (c122-106-171-246.carlnfd1.nsw.optusnet.com.au [122.106.171.246]) by mail01.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id q7FHDSYo025257 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Thu, 16 Aug 2012 03:13:29 +1000 Date: Thu, 16 Aug 2012 03:13:28 +1000 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Stephen Montgomery-Smith In-Reply-To: <502B1817.5070401@missouri.edu> Message-ID: <20120816030731.A2899@besplex.bde.org> References: <5017111E.6060003@missouri.edu> <501C361D.4010807@missouri.edu> <20120804165555.X1231@besplex.bde.org> <501D51D7.1020101@missouri.edu> <20120805030609.R3101@besplex.bde.org> <501D9C36.2040207@missouri.edu> <20120805175106.X3574@besplex.bde.org> <501EC015.3000808@missouri.edu> <20120805191954.GA50379@troutmask.apl.washington.edu> <20120807205725.GA10572@server.rulingia.com> <20120809025220.N4114@besplex.bde.org> <5027F07E.9060409@missouri.edu> <20120814003614.H3692@besplex.bde.org> <50295887.2010608@missouri.edu> <20120814055931.Q4897@besplex.bde.org> <50297468.20902@missouri.edu> <20120814173931.V934@besplex.bde.org> <502A820C.6060804@missouri.edu> <502A8494.2050707@missouri.edu> <502A9B99.7090309@missouri.edu> <502B1817.5070401@missouri.edu> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: 
freebsd-numerics@freebsd.org Subject: Re: Complex arg-trig functions X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 15 Aug 2012 17:13:32 -0000 On Tue, 14 Aug 2012, Stephen Montgomery-Smith wrote: > I was looking through the code e_acosh.c, and it made me realize I could get > a small fraction more ULP in catrig.c by making the replacements: > > 216c216 > < *rx = log1p(Am1 + sqrt(Am1*(A+1))); > --- >> *rx = log1p(Am1 + sqrt(2*Am1 + Am1*Am1)); > 282c282 > < *sqrt_A2my2 = sqrt(Amy*(A+y)); > --- >> *sqrt_A2my2 = sqrt(2*y*Amy + Amy*Amy); > > I'm not quite sure if the second replacement makes much difference, but the > first replacement seemed quite effective. This seems to be slightly worse. In my tests, it makes little difference to the peak error, but unimproves the number of correctly rounded cases quite often. 
1,5c1,4
< amd64 float prec, on 2**12 x 2**12 args:
< rcacos:max_er = 0x58460841 2.7585, avg_er = 0.317, #>=1:0.5 = 29084:255712
< rcacosh:max_er = 0x5e1e45e6 2.9412, avg_er = 0.262, #>=1:0.5 = 85868:3413684
< rcasin:max_er = 0x631b8183 3.0971, avg_er = 0.209, #>=1:0.5 = 38388:382508
< rcasinh:max_er = 0x5e1e45e6 2.9412, avg_er = 0.262, #>=1:0.5 = 85868:3413684
---
> rcacos:max_er = 0x57352248 2.7252, avg_er = 0.317, #>=1:0.5 = 28694:256172
> rcacosh:max_er = 0x5e1e45e6 2.9412, avg_er = 0.265, #>=1:0.5 = 107904:3459300
> rcasin:max_er = 0x631b8183 3.0971, avg_er = 0.209, #>=1:0.5 = 38332:382056
> rcasinh:max_er = 0x5e1e45e6 2.9412, avg_er = 0.265, #>=1:0.5 = 107904:3459300

['<' is the old version, '>' the new version]

17,20c16,19
< icacos:max_er = 0x5e1e45e6 2.9412, avg_er = 0.262, #>=1:0.5 = 85868:3413684
< icacosh:max_er = 0x58460841 2.7585, avg_er = 0.317, #>=1:0.5 = 29084:255712
< icasin:max_er = 0x5e1e45e6 2.9412, avg_er = 0.262, #>=1:0.5 = 85868:3413684
< icasinh:max_er = 0x631b8183 3.0971, avg_er = 0.209, #>=1:0.5 = 38388:382508
---
> icacos:max_er = 0x5e1e45e6 2.9412, avg_er = 0.265, #>=1:0.5 = 107904:3459300
> icacosh:max_er = 0x57352248 2.7252, avg_er = 0.317, #>=1:0.5 = 28694:256172
> icasin:max_er = 0x5e1e45e6 2.9412, avg_er = 0.265, #>=1:0.5 = 107904:3459300
> icasinh:max_er = 0x631b8183 3.0971, avg_er = 0.209, #>=1:0.5 = 38332:382056
32,37c31,34
<
< amd64 double prec, on 2**12 x 2**12 args:
< rcacos:max_er = 0x1b5a 3.4189, avg_er = 0.228, #>=1:0.5 = 2394:125988
< rcacosh:max_er = 0xf7d 1.9360, avg_er = 0.257, #>=1:0.5 = 612:2741860
< rcasin:max_er = 0x15c5 2.7212, avg_er = 0.113, #>=1:0.5 = 33296:99152
< rcasinh:max_er = 0xf7d 1.9360, avg_er = 0.257, #>=1:0.5 = 612:2741796
---
> rcacos:max_er = 0x1b5a 3.4189, avg_er = 0.228, #>=1:0.5 = 2374:125954
> rcacosh:max_er = 0xf8a 1.9424, avg_er = 0.258, #>=1:0.5 = 8396:2741812
> rcasin:max_er = 0x15c5 2.7212, avg_er = 0.113, #>=1:0.5 = 33312:99184
> rcasinh:max_er = 0xf8a 1.9424, avg_er = 0.258, #>=1:0.5 =
8396:2741748 42,45c39,42 < icacos:max_er = 0xf7d 1.9360, avg_er = 0.257, #>=1:0.5 = 612:2741860 < icacosh:max_er = 0x1b5a 3.4189, avg_er = 0.228, #>=1:0.5 = 2394:125988 < icasin:max_er = 0xf7d 1.9360, avg_er = 0.257, #>=1:0.5 = 612:2741796 < icasinh:max_er = 0x15c5 2.7212, avg_er = 0.113, #>=1:0.5 = 33296:99152 --- > icacos:max_er = 0xf8a 1.9424, avg_er = 0.258, #>=1:0.5 = 8396:2741812 > icacosh:max_er = 0x1b5a 3.4189, avg_er = 0.228, #>=1:0.5 = 2374:125954 > icasin:max_er = 0xf8a 1.9424, avg_er = 0.258, #>=1:0.5 = 8396:2741748 > icasinh:max_er = 0x15c5 2.7212, avg_er = 0.113, #>=1:0.5 = 33312:99184 The unimprovement on i386 is similar. This is surprising for the float case, since the expressions are evaluated in double precision. Bruce From owner-freebsd-numerics@FreeBSD.ORG Wed Aug 15 18:15:57 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 91381106564A for ; Wed, 15 Aug 2012 18:15:57 +0000 (UTC) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by mx1.freebsd.org (Postfix) with ESMTP id 4DA6D8FC08 for ; Wed, 15 Aug 2012 18:15:56 +0000 (UTC) Received: from [128.206.184.213] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q7FIFtfl051501; Wed, 15 Aug 2012 13:15:55 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <502BE75B.6060301@missouri.edu> Date: Wed, 15 Aug 2012 13:15:55 -0500 From: Stephen Montgomery-Smith User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:14.0) Gecko/20120728 Thunderbird/14.0 MIME-Version: 1.0 To: Bruce Evans References: <5017111E.6060003@missouri.edu> <501C361D.4010807@missouri.edu> <20120804165555.X1231@besplex.bde.org> <501D51D7.1020101@missouri.edu> <20120805030609.R3101@besplex.bde.org> <501D9C36.2040207@missouri.edu> <20120805175106.X3574@besplex.bde.org> 
<501EC015.3000808@missouri.edu> <20120805191954.GA50379@troutmask.apl.washington.edu> <20120807205725.GA10572@server.rulingia.com> <20120809025220.N4114@besplex.bde.org> <5027F07E.9060409@missouri.edu> <20120814003614.H3692@besplex.bde.org> <50295887.2010608@missouri.edu> <20120814055931.Q4897@besplex.bde.org> <50297468.20902@missouri.edu> <20120814173931.V934@besplex.bde.org> <502A820C.6060804@missouri.edu> <502A8494.2050707@missouri.edu> <502A9B99.7090309@missouri.edu> <502B1817.5070401@missouri.edu> <20120816030731.A2899@besplex.bde.org> In-Reply-To: <20120816030731.A2899@besplex.bde.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-numerics@freebsd.org Subject: Re: Complex arg-trig functions X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 15 Aug 2012 18:15:57 -0000 On 08/15/12 12:13, Bruce Evans wrote: > On Tue, 14 Aug 2012, Stephen Montgomery-Smith wrote: > >> I was looking through the code e_acosh.c, and it made me realize I >> could get a small fraction more ULP in catrig.c by making the >> replacements: >> >> 216c216 >> < *rx = log1p(Am1 + sqrt(Am1*(A+1))); >> --- >>> *rx = log1p(Am1 + sqrt(2*Am1 + Am1*Am1)); >> 282c282 >> < *sqrt_A2my2 = sqrt(Amy*(A+y)); >> --- >>> *sqrt_A2my2 = sqrt(2*y*Amy + Amy*Amy); >> >> I'm not quite sure if the second replacement makes much difference, >> but the first replacement seemed quite effective. > > This seems to be slightly worse. In my tests, it makes little difference > to the peak error, but unimproves the number of correctly rounded cases > quite often. I ran some all night tests, and I came to the same conclusion, except for the peak error. I'll revert it back. 
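For reference, the two forms compared above are algebraically the same: with Am1 = A - 1, Am1*(A+1) = Am1*(Am1+2) = 2*Am1 + Am1*Am1. A minimal sketch of the two radicands follows. The function names are hypothetical, and this is not the catrig.c code itself, where A and Am1 are computed separately so that Am1 stays accurate when A is close to 1.

```c
/* Hypothetical helper: radicand in the product form, Am1*(A+1). */
static double
radicand_product(double A, double Am1)
{
	return (Am1 * (A + 1));
}

/* Hypothetical helper: radicand in the sum form, 2*Am1 + Am1*Am1. */
static double
radicand_sum(double Am1)
{
	return (2 * Am1 + Am1 * Am1);
}
```

The product form rounds A+1 and then the product. The sum form rounds Am1*Am1 and then the sum, since 2*Am1 is exact barring overflow. Neither is obviously better on paper, which seems to fit the mixed test results.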
From owner-freebsd-numerics@FreeBSD.ORG Wed Aug 15 20:42:08 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 8F028106566C for ; Wed, 15 Aug 2012 20:42:08 +0000 (UTC) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by mx1.freebsd.org (Postfix) with ESMTP id 5DD4A8FC12 for ; Wed, 15 Aug 2012 20:42:08 +0000 (UTC) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q7FKg0Ks061106; Wed, 15 Aug 2012 15:42:01 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <502C0998.7040004@missouri.edu> Date: Wed, 15 Aug 2012 15:42:00 -0500 From: Stephen Montgomery-Smith User-Agent: Mozilla/5.0 (X11; Linux i686; rv:14.0) Gecko/20120714 Thunderbird/14.0 MIME-Version: 1.0 To: Bruce Evans References: <5017111E.6060003@missouri.edu> <501C361D.4010807@missouri.edu> <20120804165555.X1231@besplex.bde.org> <501D51D7.1020101@missouri.edu> <20120805030609.R3101@besplex.bde.org> <501D9C36.2040207@missouri.edu> <20120805175106.X3574@besplex.bde.org> <501EC015.3000808@missouri.edu> <20120805191954.GA50379@troutmask.apl.washington.edu> <20120807205725.GA10572@server.rulingia.com> <20120809025220.N4114@besplex.bde.org> <5027F07E.9060409@missouri.edu> <20120814003614.H3692@besplex.bde.org> <50295F5C.6010800@missouri.edu> <20120814072946.S5260@besplex.bde.org> <50297CA5.5010900@missouri.edu> <50297E43.7090309@missouri.edu> <20120814201105.T934@besplex.bde.org> In-Reply-To: <20120814201105.T934@besplex.bde.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-numerics@freebsd.org Subject: Re: Complex arg-trig functions X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality 
implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 15 Aug 2012 20:42:08 -0000 On 08/14/2012 05:46 AM, Bruce Evans wrote: > On Mon, 13 Aug 2012, Stephen Montgomery-Smith wrote: > >> On 08/13/2012 05:16 PM, Stephen Montgomery-Smith wrote: >>> On 08/13/2012 04:45 PM, Bruce Evans wrote: >>> >>>> y can have any sign I think. But the problem only seemed to happen >>>> with >>>> denormals and/or NaNs. There might be a problem with NaNs not >>>> giving one >>>> of the canceling negatives. >>> >>> OK. >>> >>>>>> @ --- 408,420 ---- >>>>>> @ @ if (ISFINITE(bx) && ISFINITE(by) && (x > >>>>>> RECIP_SQRT_EPSILON_100 || y > RECIP_SQRT_EPSILON_100)) { >>>>>> @ ! /* XXX following can also raise overflow */ >>>>> >>>>> I don't see how the code could raise an overflow. The output of clog >>>>> should always be very much less than DBL_MAX. (Originally I had >>>>> clog(2*z), and that could raise an unwarranted overflow.) >>>> >>>> @ if (ISFINITE(bx) && ISFINITE(by) && (x > RECIP_SQRT_EPSILON_100 >>>> || y > RECIP_SQRT_EPSILON_100)) { >>>> @ ! /* XXX following can also raise overflow */ >>>> @ ! if (huge+x+y>one) { /* raise inexact */ >>>> @ ! w = clog_for_large_values(z); >>>> @ ! /* Can't add M_LN2 to w since it should clobber >>>> -0*I. */ >>>> @ ! rx = fabs(cimag(w)); >>>> @ ! ry = creal(w) + M_LN2; >>>> @ if (sy == 0) >>>> @ ! ry = -ry; >>>> @ ! return (cpack(rx, ry)); >>>> @ } >>>> @ } >>>> >>>> clog() won't overflow spuriously, but huge+x+y might. >>> >>> Yes, I didn't think of that! >>> >>>> ((int)x == 0)' is a safer method of raising inexact for certain x. >>> >>> But this only works if x is less than 1. >>> >>> OK, how about this: >>> >>> sqrt_huge = 1e150; >>> if (sqrt_huge+x>one || sqrt_huge+y>one) ... >> >> Oops >> >> if (sqrt_huge+x>one && sqrt_huge+y>one) > > x and y can be DBL_MAX, giving overflow. 
I think raising overflow is > never correct, since clog() never overflows for large values, and > ccacos() apparently reduces to a rearrangement of clog() for large > values. > > BTW, you can probably omit the ISFINITE() tests in: > >>>> @ if (ISFINITE(bx) && ISFINITE(by) && (x > RECIP_SQRT_EPSILON_100 >>>> || y > RECIP_SQRT_EPSILON_100)) { > > since if bx or by is NaN, then it isn't > RECIP_SQRT_EPSILON_100, and > if it is Inf then I think handling it the same as DBL_MAX gives the > correct result. NaNs and Infs now fall through to do_hard_work(). > Wouldn't it be easier to never pass them to do_hard_work()? > > For just setting inexact, try an expression using `tiny'. There are > many examples to choose from. According to $(grep tiny.*inex *.c): > > % e_sinh.c: if(shuge+x>one) return x;/* sinh(tiny) = tiny with > inexact */ > % e_sinhf.c: if(shuge+x>one) return x;/* sinh(tiny) = tiny with > inexact */ > > Ones like you have. > > % e_sqrt.c: z = one-tiny; /* trigger inexact flag */ > % e_sqrtf.c: z = one-tiny; /* trigger inexact flag */ > > Works generally, modulo compiler bugs and extra precision, provided z is > used. > > % s_erf.c: * erf(x) = sign(x) *(1 - tiny) (raise inexact) > % s_expm1.c: if(x+tiny<0.0) /* raise inexact */ > % s_expm1f.c: if(x+tiny<(float)0.0) /* raise inexact */ > % s_tanh.c: if(huge+x>one) return x; /* tanh(tiny) = tiny with > inexact */ > > 3 more that depend too much on x. > > % s_tanh.c: z = one - tiny; /* raise inexact flag */ > % s_tanhf.c: if(huge+x>one) return x; /* tanh(tiny) = tiny with > inexact */ > % s_tanhf.c: z = one - tiny; /* raise inexact flag */ > > To get z used, try `if ((int)(1 - tiny) == 1)'. To avoid compiler > bugs, it is necessary for `tiny' to be static const volatile (where > `tiny' is already static const). Only a few places in msun use a > volatile `tiny', so you could not worry about the compiler bugs equally > and wait for them to go away or for someone to notice that inexact is > not set properly. 
clang has similar bugs for huge*huge. gcc doesn't > evaluate huge*huge at compile time, but clang does. Both evaluate > tiny*tiny and 1-tiny at compile time. Spelling 1 as `one' has no > effect on the compiler bugs. > > Note that the expressions that mix in x only do so to avoid setting > inexact when x = +-0, or maybe to preserve the sign of x, without using > a branch to classify this x. Here we already have branches to classify > x as large. > > Bruce > > All your solutions depend upon using (1-tiny) with the result being used. But what if FE_DOWNWARD is set? Then 1-tiny becomes 1-DBL_EPSILON. And then if the result is used, everything is off by 1 ulp. And if ((int)(1 - tiny) == 1) will fail. From owner-freebsd-numerics@FreeBSD.ORG Wed Aug 15 20:56:25 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id E5383106566C for ; Wed, 15 Aug 2012 20:56:25 +0000 (UTC) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by mx1.freebsd.org (Postfix) with ESMTP id B33788FC14 for ; Wed, 15 Aug 2012 20:56:25 +0000 (UTC) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q7FKuOwp062046; Wed, 15 Aug 2012 15:56:24 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <502C0CF8.8040003@missouri.edu> Date: Wed, 15 Aug 2012 15:56:24 -0500 From: Stephen Montgomery-Smith User-Agent: Mozilla/5.0 (X11; Linux i686; rv:14.0) Gecko/20120714 Thunderbird/14.0 MIME-Version: 1.0 To: Bruce Evans References: <5017111E.6060003@missouri.edu> <501C361D.4010807@missouri.edu> <20120804165555.X1231@besplex.bde.org> <501D51D7.1020101@missouri.edu> <20120805030609.R3101@besplex.bde.org> <501D9C36.2040207@missouri.edu> <20120805175106.X3574@besplex.bde.org> <501EC015.3000808@missouri.edu> 
<20120805191954.GA50379@troutmask.apl.washington.edu> <20120807205725.GA10572@server.rulingia.com> <20120809025220.N4114@besplex.bde.org> <5027F07E.9060409@missouri.edu> <20120814003614.H3692@besplex.bde.org> <50295F5C.6010800@missouri.edu> <20120814072946.S5260@besplex.bde.org> <50297CA5.5010900@missouri.edu> <50297E43.7090309@missouri.edu> <20120814201105.T934@besplex.bde.org> <502A780B.2010106@missouri.edu> <20120815223631.N1751@besplex.bde.org>
In-Reply-To: <20120815223631.N1751@besplex.bde.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: freebsd-numerics@freebsd.org
Subject: Re: Complex arg-trig functions
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: ,
X-List-Received-Date: Wed, 15 Aug 2012 20:56:26 -0000

On 08/15/2012 08:35 AM, Bruce Evans wrote:
> On Tue, 14 Aug 2012, Stephen Montgomery-Smith wrote:
>> It seemed to me that there is a logic behind why the infs and nans
>> produce the results they do. I noticed that do_the_hard_work()
>> already got the answers correct for the real part *rx. Getting the
>> imaginary part to work as well seemed to me to be the cleanest way to
>> make it work. (I added all the nan and inf checking after writing the
>> rest of the code.)
>
> An up-front check may still be simpler, and gives more control. In
> csqrt*(), I needed an explicit check and special expressions to get
> uniform behaviour.

I still like it the way I have it. There is a definite logic in the way infs and nans come out of casinh, etc.
There is only one place I disagree with C99: catanh(1) = Inf + 0*I I think mpc gets it correct: atanh(1) = Inf + nan*I > I added this to the NaN mixing in catan[h]*(), > and now all my tests pass: > > % diff -c2 catrig.c~ catrig.c > % *** catrig.c~ Sun Aug 12 17:29:18 2012 > % --- catrig.c Wed Aug 15 11:57:02 2012 > % *************** > % *** 605,609 **** > % */ > % if (ISNAN(x) || ISNAN(y)) > % ! return (cpack(x+y, x+y)); > % % /* (x,inf) and (inf,y) and (inf,inf) -> (0,PI/2) */ > % --- 609,613 ---- > % */ > % if (ISNAN(x) || ISNAN(y)) > % ! return (cpack((x+0.0L)+(y+0), (x+0.0L)+(y+0))); > % % /* (x,inf) and (inf,y) and (inf,inf) -> (0,PI/2) */ > > Use this expression in all precisions. Would this work? if (ISNAN(x) || ISNAN(y)) return (cpack((x+x)+(y+y), (x+x)+(y+y))); > > I forgot to comment it. Adding 0 quietens signaling NaNs before mixing > NaNs. I should have tried y+y. Adding 0.0L promotes part of the > expression to long double together with quietening signaling NaNs. > The rest of the expression is promoted to match. I should try the > old way again: of (long double)x+x. > > % diff -c2 catrigf.c~ catrigf.c > % *** catrigf.c~ Sun Aug 12 17:00:52 2012 > % --- catrigf.c Wed Aug 15 11:57:08 2012 > % *************** > % *** 349,353 **** > % % if (isnan(x) || isnan(y)) > % ! return (cpackf(x+y, x+y)); > % % if (isinf(x) || isinf(y)) > % --- 351,355 ---- > % % if (isnan(x) || isnan(y)) > % ! return (cpack((x+0.0L)+(y+0), (x+0.0L)+(y+0))); > % % if (isinf(x) || isinf(y)) > % diff -c2 catrigl.c~ catrigl.c > % *** catrigl.c~ Sun Aug 12 06:54:46 2012 > % --- catrigl.c Wed Aug 15 11:58:46 2012 > % *************** > % *** 323,327 **** > % % if (isnan(x) || isnan(y)) > % ! return (cpackl(x+y, x+y)); > % % if (isinf(x) || isinf(y)) > % --- 325,329 ---- > % % if (isnan(x) || isnan(y)) > % ! 
return (cpack((x+0.0L)+(y+0), (x+0.0L)+(y+0)));
> %
> % if (isinf(x) || isinf(y))
> % Index: ../s_csqrt.c
> % ===================================================================
> % RCS file: /home/ncvs/src/lib/msun/src/s_csqrt.c,v
> % retrieving revision 1.4
> % diff -u -2 -r1.4 s_csqrt.c
> % --- ../s_csqrt.c 8 Aug 2008 00:15:16 -0000 1.4
> % +++ ../s_csqrt.c 14 Aug 2012 20:34:07 -0000
> % @@ -34,14 +34,5 @@
> % #include "math_private.h"
> %
> % -/*
> % - * gcc doesn't implement complex multiplication or division correctly,
> % - * so we need to handle infinities specially. We turn on this pragma to
> % - * notify conforming c99 compilers that the fast-but-incorrect code that
> % - * gcc generates is acceptable, since the special cases have already been
> % - * handled.
> % - */
> % -#pragma STDC CX_LIMITED_RANGE ON
>
> Remove this. There was only 1 complex expression, and it depended on the
> negation of this pragma to work. Since gcc doesn't support this pragma,
> the expression only worked accidentally when it was optimized.

I removed it. (I copied it verbatim from csqrt without really understanding it.)

The part that follows - is this all referencing csqrt?

>
> % -
> % -/* We risk spurious overflow for components >= DBL_MAX / (1 + sqrt(2)). */
> % +/* For avoiding overflow for components >= DBL_MAX / (1 + sqrt(2)). */
> % #define THRESH 0x1.a827999fcef32p+1022
>
>
> .............. snip
> This is like a fix in clog(). hypot() handles denormals OK, but
> necessarily loses accuracy when it returns a denormal result, so
> the expression (a + hypot(a, b)) is more inaccurate than necessary.

Which code is being referenced here? I use expressions like this in catrig. Although I think when I use it, I am somewhat certain that neither a nor b is denormal.
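
For readers following the NaN-mixing diff quoted in this message, here is a minimal C sketch of the two forms of the NaN branch. It is only an illustration under stated assumptions: pack() is a hypothetical stand-in for msun's internal cpack(), and nan_branch_old() and nan_branch_new() are made-up names, not functions in catrig.c. Both forms return NaN in both parts when either input is NaN. What the diff changes (quieting signaling NaNs before the mix, and doing the mix in long double) is hardware and compiler dependent and is not demonstrated here.

```c
#include <complex.h>
#include <math.h>

/*
 * Hypothetical stand-in for msun's internal cpack(): builds x + I*y
 * without doing any arithmetic on either part.
 */
static double complex
pack(double x, double y)
{
	union {
		double complex z;
		double part[2];	/* part[0] = real, part[1] = imaginary */
	} u;

	u.part[0] = x;
	u.part[1] = y;
	return (u.z);
}

/* Original form: mixes x and y directly in double. */
static double complex
nan_branch_old(double x, double y)
{
	return (pack(x + y, x + y));
}

/* Form from the diff: add zero to each operand, then mix in long double. */
static double complex
nan_branch_new(double x, double y)
{
	long double m = (x + 0.0L) + (y + 0);

	/* Converted back to double, as passing m to cpack() would do. */
	return (pack((double)m, (double)m));
}
```

Either function is meant to be called only after the caller has checked isnan(x) || isnan(y), as in the quoted code.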
From owner-freebsd-numerics@FreeBSD.ORG Wed Aug 15 21:49:02 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id CC528106564A for ; Wed, 15 Aug 2012 21:49:02 +0000 (UTC) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by mx1.freebsd.org (Postfix) with ESMTP id 8A4A98FC0C for ; Wed, 15 Aug 2012 21:49:02 +0000 (UTC) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q7FLn1C0065447 for ; Wed, 15 Aug 2012 16:49:01 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <502C194D.50903@missouri.edu> Date: Wed, 15 Aug 2012 16:49:01 -0500 From: Stephen Montgomery-Smith User-Agent: Mozilla/5.0 (X11; Linux i686; rv:14.0) Gecko/20120714 Thunderbird/14.0 MIME-Version: 1.0 To: freebsd-numerics@freebsd.org References: <5017111E.6060003@missouri.edu> <501C361D.4010807@missouri.edu> <20120804165555.X1231@besplex.bde.org> <501D51D7.1020101@missouri.edu> <20120805030609.R3101@besplex.bde.org> <501D9C36.2040207@missouri.edu> <20120805175106.X3574@besplex.bde.org> <501EC015.3000808@missouri.edu> <20120805191954.GA50379@troutmask.apl.washington.edu> <20120807205725.GA10572@server.rulingia.com> <20120809025220.N4114@besplex.bde.org> <5027F07E.9060409@missouri.edu> <20120814003614.H3692@besplex.bde.org> <50295F5C.6010800@missouri.edu> <20120814072946.S5260@besplex.bde.org> <50297CA5.5010900@missouri.edu> <50297E43.7090309@missouri.edu> <20120814201105.T934@besplex.bde.org> <502C0998.7040004@missouri.edu> In-Reply-To: <502C0998.7040004@missouri.edu> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: Complex arg-trig functions X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussions of high quality 
implementation of libm functions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 15 Aug 2012 21:49:02 -0000 On 08/15/2012 03:42 PM, Stephen Montgomery-Smith wrote: > > All your solutions depend upon using (1-tiny) with the result being > used. But what if FE_DOWNWARD is set? Then 1-tiny becomes > 1-DBL_EPSILON. And then if the result is used, everything is off by 1 ulp. > > And > if ((int)(1 - tiny) == 1) > will fail. How about replacing if (huge+ax>one && huge+bx>one) .... with if ((int)(1/ax)==0 || (int)(1/bx)==0) .... (We know that one of ax or bx is larger than 1.) From owner-freebsd-numerics@FreeBSD.ORG Wed Aug 15 22:11:24 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 89F5A106564A for ; Wed, 15 Aug 2012 22:11:24 +0000 (UTC) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by mx1.freebsd.org (Postfix) with ESMTP id 487628FC18 for ; Wed, 15 Aug 2012 22:11:22 +0000 (UTC) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q7FMBLNF066952 for ; Wed, 15 Aug 2012 17:11:21 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <502C1E89.9070408@missouri.edu> Date: Wed, 15 Aug 2012 17:11:21 -0500 From: Stephen Montgomery-Smith User-Agent: Mozilla/5.0 (X11; Linux i686; rv:14.0) Gecko/20120714 Thunderbird/14.0 MIME-Version: 1.0 To: freebsd-numerics@freebsd.org References: <5017111E.6060003@missouri.edu> <501C361D.4010807@missouri.edu> <20120804165555.X1231@besplex.bde.org> <501D51D7.1020101@missouri.edu> <20120805030609.R3101@besplex.bde.org> <501D9C36.2040207@missouri.edu> <20120805175106.X3574@besplex.bde.org> <501EC015.3000808@missouri.edu> <20120805191954.GA50379@troutmask.apl.washington.edu> 
<20120807205725.GA10572@server.rulingia.com> <20120809025220.N4114@besplex.bde.org> <5027F07E.9060409@missouri.edu> <20120814003614.H3692@besplex.bde.org> <50295F5C.6010800@missouri.edu> <20120814072946.S5260@besplex.bde.org> <50297CA5.5010900@missouri.edu> <50297E43.7090309@missouri.edu> <20120814201105.T934@besplex.bde.org> <502C0998.7040004@missouri.edu> <502C194D.50903@missouri.edu>
In-Reply-To: <502C194D.50903@missouri.edu>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: Complex arg-trig functions
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: ,
X-List-Received-Date: Wed, 15 Aug 2012 22:11:24 -0000

On 08/15/2012 04:49 PM, Stephen Montgomery-Smith wrote:
> On 08/15/2012 03:42 PM, Stephen Montgomery-Smith wrote:
>
>>
>> All your solutions depend upon using (1-tiny) with the result being
>> used. But what if FE_DOWNWARD is set? Then 1-tiny becomes
>> 1-DBL_EPSILON. And then if the result is used, everything is off by 1
>> ulp.
>>
>> And
>> if ((int)(1 - tiny) == 1)
>> will fail.
>
> How about replacing
>
> if (huge+ax>one && huge+bx>one) ....
>
> with
>
> if ((int)(1/ax)==0 || (int)(1/bx)==0) ....
>
> (We know that one of ax or bx is larger than 1.)

if ((int)(1/(2+ax))==0) ....

(because one of ax or bx might be 0).
From owner-freebsd-numerics@FreeBSD.ORG Wed Aug 15 22:34:25 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id E8947106564A for ; Wed, 15 Aug 2012 22:34:25 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail01.syd.optusnet.com.au (mail01.syd.optusnet.com.au [211.29.132.182]) by mx1.freebsd.org (Postfix) with ESMTP id 795B28FC08 for ; Wed, 15 Aug 2012 22:34:25 +0000 (UTC) Received: from c122-106-171-246.carlnfd1.nsw.optusnet.com.au (c122-106-171-246.carlnfd1.nsw.optusnet.com.au [122.106.171.246]) by mail01.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id q7FMYL5l026171 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Thu, 16 Aug 2012 08:34:23 +1000 Date: Thu, 16 Aug 2012 08:34:21 +1000 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Stephen Montgomery-Smith In-Reply-To: <502C1E89.9070408@missouri.edu> Message-ID: <20120816081911.W3938@besplex.bde.org> References: <5017111E.6060003@missouri.edu> <501C361D.4010807@missouri.edu> <20120804165555.X1231@besplex.bde.org> <501D51D7.1020101@missouri.edu> <20120805030609.R3101@besplex.bde.org> <501D9C36.2040207@missouri.edu> <20120805175106.X3574@besplex.bde.org> <501EC015.3000808@missouri.edu> <20120805191954.GA50379@troutmask.apl.washington.edu> <20120807205725.GA10572@server.rulingia.com> <20120809025220.N4114@besplex.bde.org> <5027F07E.9060409@missouri.edu> <20120814003614.H3692@besplex.bde.org> <50295F5C.6010800@missouri.edu> <20120814072946.S5260@besplex.bde.org> <50297CA5.5010900@missouri.edu> <50297E43.7090309@missouri.edu> <20120814201105.T934@besplex.bde.org> <502C0998.7040004@missouri.edu> <502C194D.50903@missouri.edu> <502C1E89.9070408@missouri.edu> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: freebsd-numerics@freebsd.org Subject: Re: Complex arg-trig functions X-BeenThere: freebsd-numerics@freebsd.org 
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: ,
X-List-Received-Date: Wed, 15 Aug 2012 22:34:26 -0000

I will be away for a couple of weeks. Hope nothing happens :-).

On Wed, 15 Aug 2012, Stephen Montgomery-Smith wrote:

> On 08/15/2012 04:49 PM, Stephen Montgomery-Smith wrote:
>> On 08/15/2012 03:42 PM, Stephen Montgomery-Smith wrote:
>>>
>>> All your solutions depend upon using (1-tiny) with the result being
>>> used. But what if FE_DOWNWARD is set? Then 1-tiny becomes
>>> 1-DBL_EPSILON. And then if the result is used, everything is off by 1
>>> ulp.

Yes, this is another example that msun depends a lot on the default
rounding mode.

>>> And
>>> if ((int)(1 - tiny) == 1)
>>> will fail.

But this can be fixed to ((int)(1 + tiny) == 0). I was originally going
to use addition, but then grep showed fdlibm mostly using subtraction.
At least some uses of (one - tiny) seem to be intentional. E.g., s_tanh.c
returns +-(one - tiny) for the asymptotes to value +-1. Although 1 would
be more accurate, if the rounding mode is unusual then the caller must
actually want the less accurate value that results from not rounding to
nearest. (one - tiny) is always <= 1. This is consistent with the
asymptote always being < 1.

>> How about replacing
>>
>> if (huge+ax>one && huge+bx>one) ....
>>
>> with
>>
>> if ((int)(1/ax)==0 || (int)(1/bx)==0) ....
>>
>> (We know that one of ax or bx is larger than 1.)
>
> if ((int)(1/(2+ax))==0) ....
>
> (because one of ax or bx might be 0).

Now it is getting too heavyweight. Although this is not a fast path, a
division in it makes it really slow.
Bruce From owner-freebsd-numerics@FreeBSD.ORG Wed Aug 15 22:39:20 2012 Return-Path: Delivered-To: freebsd-numerics@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id DC05C1065672 for ; Wed, 15 Aug 2012 22:39:20 +0000 (UTC) (envelope-from stephen@missouri.edu) Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by mx1.freebsd.org (Postfix) with ESMTP id 997158FC0A for ; Wed, 15 Aug 2012 22:39:20 +0000 (UTC) Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q7FMdJYd068899; Wed, 15 Aug 2012 17:39:19 -0500 (CDT) (envelope-from stephen@missouri.edu) Message-ID: <502C2517.6010302@missouri.edu> Date: Wed, 15 Aug 2012 17:39:19 -0500 From: Stephen Montgomery-Smith User-Agent: Mozilla/5.0 (X11; Linux i686; rv:14.0) Gecko/20120714 Thunderbird/14.0 MIME-Version: 1.0 To: Bruce Evans References: <5017111E.6060003@missouri.edu> <501C361D.4010807@missouri.edu> <20120804165555.X1231@besplex.bde.org> <501D51D7.1020101@missouri.edu> <20120805030609.R3101@besplex.bde.org> <501D9C36.2040207@missouri.edu> <20120805175106.X3574@besplex.bde.org> <501EC015.3000808@missouri.edu> <20120805191954.GA50379@troutmask.apl.washington.edu> <20120807205725.GA10572@server.rulingia.com> <20120809025220.N4114@besplex.bde.org> <5027F07E.9060409@missouri.edu> <20120814003614.H3692@besplex.bde.org> <50295F5C.6010800@missouri.edu> <20120814072946.S5260@besplex.bde.org> <50297CA5.5010900@missouri.edu> <50297E43.7090309@missouri.edu> <20120814201105.T934@besplex.bde.org> <502C0998.7040004@missouri.edu> <502C194D.50903@missouri.edu> <502C1E89.9070408@missouri.edu> <20120816081911.W3938@besplex.bde.org> In-Reply-To: <20120816081911.W3938@besplex.bde.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-numerics@freebsd.org Subject: Re: Complex 
arg-trig functions
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: ,
X-List-Received-Date: Wed, 15 Aug 2012 22:39:20 -0000

On 08/15/2012 05:34 PM, Bruce Evans wrote:

> ((int)(1 + tiny) == 0)

OK. ((int)(1 + tiny) == 1) it is. I tested it, and it works.

From owner-freebsd-numerics@FreeBSD.ORG Thu Aug 16 04:12:44 2012
Return-Path:
Delivered-To: freebsd-numerics@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id D4D97106566C for ; Thu, 16 Aug 2012 04:12:44 +0000 (UTC) (envelope-from stephen@missouri.edu)
Received: from wilberforce.math.missouri.edu (wilberforce.math.missouri.edu [128.206.184.213]) by mx1.freebsd.org (Postfix) with ESMTP id 917ED8FC08 for ; Thu, 16 Aug 2012 04:12:44 +0000 (UTC)
Received: from [127.0.0.1] (wilberforce.math.missouri.edu [128.206.184.213]) by wilberforce.math.missouri.edu (8.14.5/8.14.5) with ESMTP id q7G4CgxD005104 for ; Wed, 15 Aug 2012 23:12:43 -0500 (CDT) (envelope-from stephen@missouri.edu)
Message-ID: <502C733B.1080306@missouri.edu>
Date: Wed, 15 Aug 2012 23:12:43 -0500
From: Stephen Montgomery-Smith
User-Agent: Mozilla/5.0 (X11; Linux i686; rv:14.0) Gecko/20120714 Thunderbird/14.0
MIME-Version: 1.0
To: freebsd-numerics@freebsd.org
References: <5017111E.6060003@missouri.edu> <501C361D.4010807@missouri.edu> <20120804165555.X1231@besplex.bde.org> <501D51D7.1020101@missouri.edu> <20120805030609.R3101@besplex.bde.org> <501D9C36.2040207@missouri.edu> <20120805175106.X3574@besplex.bde.org> <501EC015.3000808@missouri.edu> <20120805191954.GA50379@troutmask.apl.washington.edu> <20120807205725.GA10572@server.rulingia.com> <20120809025220.N4114@besplex.bde.org> <5027F07E.9060409@missouri.edu> <20120814003614.H3692@besplex.bde.org> <50295F5C.6010800@missouri.edu> <20120814072946.S5260@besplex.bde.org>
<50297CA5.5010900@missouri.edu> <50297E43.7090309@missouri.edu> <20120814201105.T934@besplex.bde.org> <502A780B.2010106@missouri.edu> <20120815223631.N1751@besplex.bde.org> <502C0CF8.8040003@missouri.edu>
In-Reply-To: <502C0CF8.8040003@missouri.edu>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: Complex arg-trig functions
X-BeenThere: freebsd-numerics@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Discussions of high quality implementation of libm functions."
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: ,
X-List-Received-Date: Thu, 16 Aug 2012 04:12:44 -0000

On 08/15/2012 03:56 PM, Stephen Montgomery-Smith wrote:
> On 08/15/2012 08:35 AM, Bruce Evans wrote:
>> On Tue, 14 Aug 2012, Stephen Montgomery-Smith wrote:
>> I added this to the NaN mixing in catan[h]*(),
>> and now all my tests pass:
>>
>> % diff -c2 catrig.c~ catrig.c
>> % *** catrig.c~ Sun Aug 12 17:29:18 2012
>> % --- catrig.c Wed Aug 15 11:57:02 2012
>> % ***************
>> % *** 605,609 ****
>> % */
>> % if (ISNAN(x) || ISNAN(y))
>> % ! return (cpack(x+y, x+y));
>> %
>> % /* (x,inf) and (inf,y) and (inf,inf) -> (0,PI/2) */
>> % --- 609,613 ----
>> % */
>> % if (ISNAN(x) || ISNAN(y))
>> % ! return (cpack((x+0.0L)+(y+0), (x+0.0L)+(y+0)));
>> %
>> % /* (x,inf) and (inf,y) and (inf,inf) -> (0,PI/2) */
>>
>> Use this expression in all precisions.
>
>
> Would this work?
>
> if (ISNAN(x) || ISNAN(y))
> return (cpack((x+x)+(y+y), (x+x)+(y+y)));
>

I know Bruce is gone for a couple of weeks, but can someone else answer these questions?

I decided to start reading a bit about nans:

http://en.wikipedia.org/wiki/NaN

I don't understand why my original code:

if (ISNAN(x) || ISNAN(y))
return (cpack(x+y, x+y));

doesn't return quiet nans, and generally do everything else it should do (like raise invalid if one or both are signaling nans).
This is what I read on the web page:

"Signaling NaNs, or sNaNs, are special forms of a NaN that when consumed by most operations should raise an invalid exception and then, if appropriate, be "quieted" into a qNaN that may then propagate."

This is Bruce's code:

return (cpack((x+0.0L)+(y+0), (x+0.0L)+(y+0)))

and he says he does this for all precisions. Why does he add 0.0L to x, but only add 0 to y?
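
One part of this question can be checked from the C conversion rules alone, and Bruce's quoted explanation already points at it: adding 0.0L makes the left operand long double, and the usual arithmetic conversions then promote the outer addition, including (y+0), to match. A minimal C sketch follows, using float operands as in the float version. The function names are made up for this illustration, and sizeof is used only as a C99-portable way to observe the type of an expression.

```c
#include <math.h>

/* Bruce's expression with float operands, as in the float version. */
static long double
mix_float(float x, float y)
{
	return ((x + 0.0L) + (y + 0));
}

/* 1 if the whole expression has a type the size of long double. */
static int
mix_is_long_double(void)
{
	float x = 1.0f, y = 2.0f;

	return (sizeof((x + 0.0L) + (y + 0)) == sizeof(long double));
}

/* 1 if (y + 0) on its own still has type float: int 0 converts to float. */
static int
y_plus_zero_is_float(void)
{
	float y = 2.0f;

	return (sizeof(y + 0) == sizeof(float));
}

/* 1 if the mix is NaN, which it is whenever either operand is NaN. */
static int
mix_is_nan(float x, float y)
{
	return (isnan(mix_float(x, y)) != 0);
}
```

So a single 0.0L is enough to make the mixing addition a long double operation in every precision, and (y+0) only has to be an arithmetic operation on y. This sketch does not address why x+y alone was unsatisfactory in Bruce's tests.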