Date: Thu, 9 Nov 2017 01:33:00 +1100 (EST) From: Bruce Evans <brde@optusnet.com.au> To: "Montgomery-Smith, Stephen" <stephen@missouri.edu> Cc: "freebsd-numerics@freebsd.org" <freebsd-numerics@freebsd.org> Subject: Re: cpow and clog Message-ID: <20171108224608.B1089@besplex.bde.org> In-Reply-To: <4330a817-ae31-6336-7aa6-3e9981b02ab7@missouri.edu> References: <20171106194937.GA87725@freebird> <20171106204121.GB37361@troutmask.apl.washington.edu> <4330a817-ae31-6336-7aa6-3e9981b02ab7@missouri.edu>
next in thread | previous in thread | raw e-mail | index | archive | help
On Wed, 8 Nov 2017, Montgomery-Smith, Stephen wrote: > On 11/06/2017 02:41 PM, Steve Kargl wrote: >> On Mon, Nov 06, 2017 at 08:49:43PM +0100, Michael Danilov wrote: >>> Hello, >>> >>> I would like to have some feedback on my attempt to import OpenBSD >>> code for cpow and clog: >>> >>> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=3D221341 >>> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=3D187693 > > The problem with this implementation of clog is that they will produce > very inaccurate results when |z| is close to 1. A proper implementation > is given in the paper: > T. E. Hull, Thomas F. Fairgrieve, Ping Tak Peter Tang, =E2=80=9CImplement= ing > Complex Elementary Functions Using Exception Handling=E2=80=9D, ACM Trans= actions > on Mathematical Software, Volume 20 (1994), pp 215=E2=80=93244, Corrigend= a, p553. >=20 > They use a precision which is twice that of the required precision. For > example, if you want log(sqrt(x^2+y^2)) accurate to about 17 digits, you > will need to compute it to 34 digits. I don't see how that can be enough. I found that about 3 times as much precision is needed for the worst case in double precision. Closer to 160 bits than 150, and 2 times as much much is only 106 bits. This defeated my intuition, but it was easy to find the worst case by searching a few billion cases near z =3D 1 to improve my intuition, then a few more in a smaller search area to find the worst case. The authors of the paper were/are real numerical analysts, but CPUs were too slow in 1004 for them to routinely search billions of cases to check their algorithms. > If all you care about is getting a a good relative error for the total > answer (rather than insisting on a good relative error for the real and > imaginary parts separately), then for |x+I*y| close to one, compute the > real part using > > 0.5*logp1((x-1)*(x+1)+y*y) z =3D 1 is a very non-exceptional value, so clog() should be very accurate near it. I just noticed that draft C99 (n869) Annex F (IEEE754 binding) is bad enough to say that cpow(z, c) is implicitly specified by the simplistic formula: cpow(z, c) =3D cexp(c * clog(z)) (this is in a section with similar specifications like: ccos(z) =3D ccosh(I * z)) which are not incorrect since they don't lose precision or give spurious exceptions (provided I * z is not evaluated simplistically)). This bug is missing without Annex F, by allowing any quality of implementation. This bug is fixed in C99 through draft C11 (n1570) Annex by not requiring anything extra except raising all relevant exception and saying in a footnote that the above low quality implementation conforms since it raises all relevant exceptions (it might raise spurious exceptions but this conforms). Anyway, if you use the simplistic formula for cpow(), then clog() should be very accurate to avoid exponentiating a large error. Even the best possible minimal error of 0.5 ulps is not good. >>> What happened to the alternative implementation mentioned in the thread= below? >> >> bde has an implementation of clog[fl]. He may someday >> commit it. I don't know if anyone ever worked on cpow[fl]. >> I stopped working on powl and tgammal when I returned my >> commit bit due to differences with "higher-ranking" committers. > > bde's implementation was fantastic. He managed to use two double > variables to represent a single quadruple precision real (using a quad-double > technique which I believe is standard, but it was new to me). Thanks. I forgot the details, but just checked some. It uses more like 4 double variables to represent a single tri-double precision variable. Full quad-double precision would require something like 3 ripple carries per addition and I managed to reduce this something like only 1 ripple carry per addition by doing bi-double additions in a delicate order. This loses some precision but retains enough. This is least delicate near z =3D 1, but it is more accurate when it works, and not much slower, so it is used for approx. 0.5 < |z|^2 < 3. I don't know if this is standard. Other parts of the algorithm are standar= d. > Why he never committed it, I don't know. People on this list take way > too long to commit excellent software. > > For example, I had code for cacos, catan, etc, ready for a long time > before someone else stepped in and committed it. And I tested it to > death. I even found a bug in > T. E. Hull, Thomas F. Fairgrieve, Ping Tak Peter Tang, =E2=80=9CImplement= ing the > complex arcsin and arccosine functions using exception handling=E2=80=9D,= ACM > Transactions on Mathematical Software, Volume 23 (1997) pp 299=E2=80=9333= 5. We were waiting for you to commit your code :-). It is still not complete. It is missing many engineering details: - file names not in a normal namespace (ca* instead of normal e_c* or s_c*)= =2E East to fix, especially before committing. - multiple functions per file. Not so easy to fix. I didn't manage to fix it for my log*l(). I tried many not so good splittings using #include's of a .c file with ifdefs, and different kernel functions, and inline functions. Inline functions are now optimized well enough by clang (but static ones waste space and extern ones are hard to manage). My clog*() file has the same problem. Steve split it, but I haven't got around to even testing that. - switching the rounding precision for long double functions on i386. I couldn't see a clean way to do this. It can be done by sprinkling ENTERIT() and RETURNI() macros. ENTERIT() (ENTERI() with the return type specified) is also needed for clogl(), but hasn't been committed yet. - comment stripping and/or generating source files from a template. You used this for development. The scripts to do this were not very suitabl= e for committing, but perhaps better than nothing. IIRC, the template was the double version with a few extras. The double version has full comments, and all comments are stripped in the float and double versions= , including in the committed versions. But the normal style is only strip large comments and adjust small comments in the float and double versions. My clog*() handles this problem in the opposite way not strip= ping any comments and minimizing differences in comments. It also minimizes difference in code by manual inspection/rewrite instead of a template. Bruce From owner-freebsd-numerics@freebsd.org Wed Nov 8 15:03:05 2017 Return-Path: <owner-freebsd-numerics@freebsd.org> Delivered-To: freebsd-numerics@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 596F7E55DA8 for <freebsd-numerics@mailman.ysv.freebsd.org>; Wed, 8 Nov 2017 15:03:05 +0000 (UTC) (envelope-from sgk@troutmask.apl.washington.edu) Received: from troutmask.apl.washington.edu (troutmask.apl.washington.edu [128.95.76.21]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "troutmask", Issuer "troutmask" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 396C37E227 for <freebsd-numerics@freebsd.org>; Wed, 8 Nov 2017 15:03:05 +0000 (UTC) (envelope-from sgk@troutmask.apl.washington.edu) Received: from troutmask.apl.washington.edu (localhost [127.0.0.1]) by troutmask.apl.washington.edu (8.15.2/8.15.2) with ESMTPS id vA8F337b053970 (version=TLSv1.2 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO); Wed, 8 Nov 2017 07:03:03 -0800 (PST) (envelope-from sgk@troutmask.apl.washington.edu) Received: (from sgk@localhost) by troutmask.apl.washington.edu (8.15.2/8.15.2/Submit) id vA8F325h053969; Wed, 8 Nov 2017 07:03:02 -0800 (PST) (envelope-from sgk) Date: Wed, 8 Nov 2017 07:03:02 -0800 From: Steve Kargl <sgk@troutmask.apl.washington.edu> To: Dima Pasechnik <dimpase@gmail.com> Cc: Eitan Adler <lists@eitanadler.com>, "Montgomery-Smith, Stephen" <stephen@missouri.edu>, "freebsd-numerics@freebsd.org" <freebsd-numerics@freebsd.org> Subject: Re: cpow and clog Message-ID: <20171108150302.GA53567@troutmask.apl.washington.edu> Reply-To: sgk@troutmask.apl.washington.edu References: <20171106194937.GA87725@freebird> <20171106204121.GB37361@troutmask.apl.washington.edu> <20171106213308.GA73787@troutmask.apl.washington.edu> <c4448deb-8ccf-60c2-f839-cc17ca047969@missouri.edu> <CAF6rxgnWg+m2fpAjy1kF6K5YkpzWF_pAC+qAHhu8oysEppibdw@mail.gmail.com> <20171108061501.GA51059@troutmask.apl.washington.edu> <CAAWYfq2VHBjrczYE4cmAHgaTPqb_kKC13rut9L1isa2ZLL0WYg@mail.gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <CAAWYfq2VHBjrczYE4cmAHgaTPqb_kKC13rut9L1isa2ZLL0WYg@mail.gmail.com> User-Agent: Mutt/1.7.2 (2016-11-26) X-BeenThere: freebsd-numerics@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "Discussions of high quality implementation of libm functions." <freebsd-numerics.freebsd.org> List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-numerics>, <mailto:freebsd-numerics-request@freebsd.org?subject=unsubscribe> List-Archive: <http://lists.freebsd.org/pipermail/freebsd-numerics/> List-Post: <mailto:freebsd-numerics@freebsd.org> List-Help: <mailto:freebsd-numerics-request@freebsd.org?subject=help> List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-numerics>, <mailto:freebsd-numerics-request@freebsd.org?subject=subscribe> X-List-Received-Date: Wed, 08 Nov 2017 15:03:05 -0000 On Wed, Nov 08, 2017 at 11:00:40AM +0000, Dima Pasechnik wrote: > > A patch not tagged by a version control system surely rots very quickly, > as noone except the submitter knows the exact state of the source tree it > is made against, especially if the tree is also not under a revision > control system. > And surely the submitter's memory is not perfect, either. The patch is against trunk on the day it was submitted. The patch was created under the officical revision control system used by FreeBSD. -- Steve
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20171108224608.B1089>