Date: Mon, 10 Dec 2007 22:22:11 +1100 (EST) From: Bruce Evans <brde@optusnet.com.au> To: Steve Kargl <sgk@troutmask.apl.washington.edu> Cc: freebsd-standards@FreeBSD.org Subject: Re: Implementation of expl() Message-ID: <20071210213513.Q30644@besplex.bde.org> In-Reply-To: <20071209213450.GA95726@troutmask.apl.washington.edu> References: <20071103001305.GA82775@troutmask.apl.washington.edu> <20071209212505.GA9698@VARK.MIT.EDU> <20071209213450.GA95726@troutmask.apl.washington.edu>
next in thread | previous in thread | raw e-mail | index | archive | help
On Sun, 9 Dec 2007, Steve Kargl wrote: > On Sun, Dec 09, 2007 at 04:25:05PM -0500, David Schultz wrote: >> On Fri, Nov 02, 2007, Steve Kargl wrote: >>> With all the success of developing several other missing >>> C99 math routines, I decided to tackle expl() (which I >>> need to tackle tanhl()). >> >> Hmm, great, but where's the patch? :) Maybe the mailing list >> software ate it. > > This is the current version. I need to revise how I computed > the ploynomial coefficient. In short, I mapped r in > [-ln(2)/2:ln(2)/2] into the range x in [-1,1] for the Chebyshev > interpolation, but I never scaled x back into r. This is the > reason why the lines "r = r * TOLN2;" exists. > > I don't remember if bde sent me comments on this code. I sure > he has plenty. :) I don't think I commented about expl() before. Sorry, it's further from what I would like than the trig functions. Here are some better polynomial approximations for it. ---------------------------------------------------------------------------- (1) 64-bit precision, approximating exp(), saves a term and should be more accurate. Note the difference in the plots between the Chebyshev approximation and the final approximation. The uglyness and extra error in the latter are due to rounding the very-high-precision coeffs used in the former to 64 bits. The final approximation uses a Remez algorithm on rounded coeffs to try to choose the best set of rounded coeffs (!= the best set of coeffs, rounded). Its plot is ugly because my algorithm is unstable for functions that aren't even, so I had to stop it after 1 significant iteration (adjusting the P3 term from 0xaaaaaaaaaaaaaaab.0p-66 to ...aa8.0p-66). If it worked properly then it would push all the extrema back to the top and bottom of the plot, and be slightly more accurate than the Chebyshev approximation. Remez gains very little over Chebyshev in infinite precision (about a factor of 2), but can compensate for huge errors (a factor of 100[0]) due to unfortunate rounding of coeffs. ---------------------------------------------------------------------------- --- Simple Lagrange interpolation at Chebyshev points --- 5.975e-24 |'''''x'''''''''x'''''''''''x''''''''''''x''''''''''_x''''''_''" | : x " " :: : : |_ : : : _ " : " : : : :: : |: : x : : : : : : : : :: : |: " : : : : " : " : x : x : |: : : " : _ : : : " : : : : |: : : : : : : " : : : : : : |: : : : " : : : : : : : : : |: : : : : : x : : : : _ : : |: : : : : : : : " : : : : : |:: : x _ : _ : : : : : : ::| `::`:```:````:`````:`````:``````:`````"`````:`````"````_```:`::` : : : : : _ : : : : : : : :| : : " : : : : x : : : : : :| : : : : : : : : : _ : : : :| : : : : : : _ : : : : : : :| : : : : _ : : : " : _ : : :| : x: " : x : _ : : : : : :| : :: : : : : : : : : _ " :| : : :: : x : _ x : : : _| : _ :_ _ _ _ : | -5.95e-24 x........."...........x............x...........x........._.....| -0.34658 0.34658 --- Lagrange/Chebyshev with coeff adjusted to 64-bit precision --- adjn n = 64 k = 3 j = 120 toter 0 --- rounded coeff --- P0 = 1.000000000000000000000000000000000000000, /* 0x8000000000000000.0p-63 */ P1 = 1.000000000000000000000000000000000000000, /* 0x8000000000000000.0p-63 */ P2 = 0.5000000000000000000000000000000000000000, /* 0x8000000000000000.0p-64 */ P3 = 0.1666666666666666666310000000000000000000, /* 0xaaaaaaaaaaaaaaa8.0p-66 */ P4 = 0.04166666666666666666100000000000000000000, /* 0xaaaaaaaaaaaaaaa9.0p-68 */ P5 = 0.008333333333333338066950000000000000000000, /* 0x8888888888889e5d.0p-70 */ P6 = 0.001388888888888889569550000000000000000000, /* 0xb60b60b60b60cf28.0p-73 */ P7 = 0.0001984126984124767297770000000000000000000, /* 0xd00d00d00c013acd.0p-76 */ P8 = 0.00002480158730156209466100000000000000000000, /* 0xd00d00d00c1851e1.0p-79 */ P9 = 0.000002755731927361941412700000000000000000000, /* 0xb8ef1d304cd05f90.0p-82 */ P10 = 0.0000002755731927069719307580000000000000000000, /* 0x93f27dbffa11c00d.0p-85 */ P11 = 0.00000002505205082467052704390000000000000000000, /* 0xd7320ad84885e745.0p-89 */ P12 = 0.000000002087671071533106249630000000000000000000, /* 0x8f76b2a8ea87f9b3.0p-92 */ P13 = 1.60924147227467723196999999999999999999 E-10, /* 0xb0f01edf2f5fac20.0p-96 */ P14 = 1.14941936278242808142000000000000000000 E-11, /* 0xca353f02fa6b7741.0p-100 */ 7.252e-24 |''''''''''''''''_''''''''''''''''''''_""_'''''''''''''''''''''| | " x x | | x _ x _" | | " x " | --------------_---------------_xxx"--------x------_------------- | _x : " _" _ | | : : _ x x _ : | | : " _ x : | | _ : "__x x _ : x | | : _ : x_ x : | | : " : " : | |_ : xx : : x | |: : _ : : x |: x " : : |:: : x : : |::: :: : _: :| : _ :| : :| : x| : | -2.79e-23 _..............................................................| -0.34658 0.34658 adjn n = 64 k = 3 toter 0 ---------------------------------------------------------------------------- (2) 113-bit precision, approximating exp(). ---------------------------------------------------------------------------- --- Simple Lagrange interpolation at Chebyshev points --- 3.239e-38 x''''x'''''x''''''_''''''''x''''''''"''''''''_''''''"'''''"''''" : : : :x : : x: : : | |: : :: :: : _ _ : :: :: : _| |" : _ : : : " : : " : : : _ : :| |: : : : : : : : : : : : : : : : : :| |: : : : x : : : : : : : : x : : : :| |: : : : : " : : : : : : " : : : : :| | : : _ : : : " : : : : " : : : x : : | | : _ : : : : : : : : : : : : : : _ : | | : : : : : : : _ " " _ : : : : : : : | | : : : : : : : : : : : : : : : : : : | ``x`:`:``:```:``:````:```:```:````:```:```:````:``:```:``:`:`x`` | : : : : : : : : : : : : : : : : : : | | : : : x : : : : : : : : : : x : : : | | : : : : x x : : : : : : x _ : : : : | | :: : : : : " : : : : " : : : : :: | | : : : : : : : x x : : : : : : : | | : :: : : : " : : " : : : :: : | | : _: :: : : : : : : :: :: : | | : :: :: :: :: :: :: :" : | | : : :_ _: :: :_ :: : : | -3.13e-38 |..x....x....."........x.......xx......._......."".....x...._..| -0.34658 0.34658 --- Lagrange/Chebyshev with coeff adjusted to 113-bit precision --- adjn n = 113 k = 1 j = 120 toter 0 --- rounded coeff --- P0 = 1.000000000000000000000000000000000000000, /* 0x10000000000000000000000000000.0p-112 */ P1 = 1.000000000000000000000000000000000000000, /* 0x10000000000000000000000000000.0p-112 */ P2 = 0.4999999999999999999999999999999999500000, /* 0x1ffffffffffffffffffffffffffff.0p-114 */ P3 = 0.1666666666666666666666666666666666600000, /* 0x15555555555555555555555555555.0p-115 */ P4 = 0.04166666666666666666666666666668415500000, /* 0x155555555555555555555555560af.0p-117 */ P5 = 0.008333333333333333333333333333336938300000, /* 0x11111111111111111111111111a6d.0p-119 */ P6 = 0.001388888888888888888888888886437923800000, /* 0x16c16c16c16c16c16c16c15fa9389.0p-122 */ P7 = 0.0001984126984126984126984126980627689100000, /* 0x1a01a01a01a01a01a01a0191e8217.0p-125 */ P8 = 0.00002480158730158730158730175735685579700000, /* 0x1a01a01a01a01a01a01ad93230fd1.0p-128 */ P9 = 0.000002755731922398589065255750604381149600000, /* 0x171de3a556c7338faac28603831fb.0p-131 */ P10 = 0.0000002755731922398589065187949851278033800000, /* 0x127e4fb7789f5c72e69d53c0ac1d3.0p-134 */ P11 = 0.00000002505210838544171877444493611961585000000, /* 0x1ae64567f544e38fdb412a0eb7de0.0p-138 */ P12 = 0.000000002087675698786810064988525091623603400000, /* 0x11eed8eff8d8981cadc22e13e3610.0p-141 */ P13 = 1.60590438368216158655710381729946340000 E-10, /* 0x16124613a86d09fa08f48ca8c4d5e.0p-145 */ P14 = 1.14707455977270929588227069467205860000 E-11, /* 0x193974a8c07640267fec271f370a0.0p-149 */ P15 = 7.6471637318180850420027983995938208999 E-13, /* 0x1ae7f3e733b16c573381f7ce23115.0p-153 */ P16 = 4.77947733504221136080232269675803310000 E-14, /* 0x1ae7f3e773e8b2e2b7a23c9a525ff.0p-157 */ P17 = 2.81145725589067097361019490883767970000 E-15, /* 0x1952c7706c73b79a1577d9ebcf121.0p-161 */ P18 = 1.56191903751316458500805670560905000000 E-16, /* 0x168276d284fb449ca1ff48bfbb891.0p-165 */ P19 = 8.2206265778415557385251561200634072999 E-18, /* 0x12f499f725c9f790053993c3f2b17.0p-169 */ P20 = 4.11616915419431123065482372671396870000 E-19, /* 0x1e5f394471e23048e2a82e8f22b34.0p-174 */ P21 = 1.9600698694144411728713343986 E-20, /* 0x1723f29b62196e8abacb3c5c40000.0p-178 */ 9.874e-37 "''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''| |_ | | | | _ | | : | | : | | "x | | " | | _ | | | | _ | | | | "x_ | | x | | x | | _ _x | x_ x | | "_ xx _" x | | """x " "x | | x x"x___" | | " __ __ _" | 0 ,,,,,,,,,,,,,,,,,,,,,,"xxx",,"x__x",,"x__x,,,,,,,,,,,,,,,,,,,,,, -0.34658 0.34658 adjn n = 113 k = 1 toter 0 ---------------------------------------------------------------------------- (3) 64-bit precision, approximating nqexp() := x * (exp(x) + 1) / (exp(x) - 1) exp(x) can be recovered from nqexp(), and the latter has much better numerical behaviour. E.g., it needs only 7 coeffs instead of 15. The Remez search completed and equalized all the extrema. It's interesting that The Chevyshev approximation equalized the extrema for exp() but not for nqexp(). ---------------------------------------------------------------------------- --- Simple Lagrange interpolation at Chebyshev points --- 2.327e-21 "''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''" : : : _" "_ : : :: :: : : :: :: : : : : "" "" : : : : : : x _ _ x : : : : : x : : : : x : : : : : : : : : : : : : : : _ " __ __ " _ : : : : : : : " "x_ _x" " : : : : `:_```:```:``````x``````"``````""``````"``````x``````:```:```_:` |:: : : x x : : ::| |:: _ _ x _ _ x _ _ ::| |:: : : _ _ _ _ : : ::| |:: : : " " : : ::| |: : : : : :| |: : x x : :| |: x x :| |: x x :| |: :| -2.23e-21 |_............................................................_| -0.34658 0.34658 --- Lagrange/Chebyshev with coeff adjusted to 64-bit precision --- adjn n = 64 k = 13 toter 0 C-1 = 2.000000000000000000000000000000000000000, /* 0x8000000000000000.0p-62 */ C0 = 0.1666666666666666660071100000000000000000, /* 0xaaaaaaaaaaaaaa7a.0p-66 */ C1 = -0.002777777777777671015759280000000000000000, /* -0xb60b60b60b5904a2.0p-72 */ C2 = 0.00006613756613181145420083840000000000000000, /* 0x8ab355dfd4d5cfe3.0p-77 */ C3 = -0.000001653439010798086927524850000000000000000, /* -0xddebbb58747e58c9.0p-83 */ C4 = 0.00000004175172569487852482302880000000000000000, /* 0xb35282048125b16f.0p-88 */ C5 = -0.000000001045797728772232364397340000000000000000, /* -0x8fbbbc85f0e61575.0p-93 */ 1.275e-21 "''"''''''''"''''''''''''"x''''''''''x"''''''''''''"''''''''"''" : : x _ _ x : : : :: " : : " " : : " :: : : :x : : : : : : x: : : : : : : : x x : : : : : : : : : : x " : : " x : : : : : : : _ : : : : : : _ : : : : : : : : : " " : : : : : : : " : : : : : : : : " : : : : : : x x x x : : : : : |:: : : _ : __ : _ : : ::| `::``_```:`````:``````:``````````````````:``````:`````:```_``::` |:: : : : : : : : : ::| |:: : x : : : : x : ::| |: : : : " " : : : :| |: : : : : : : : : :| |: : : " : : " : : :| |: : : : : : : : : :| |: : : : " " : : : :| |: " " : : : : " " :| |_ " : : " _| -1.28e-21 |......_.........._"........................"_.........._......| -0.34658 0.34658 adjn n = 64 ---------------------------------------------------------------------------- (4) 113-bit precision, approximating nqexp(). The Remez search was stable but made negative progress after the C3 (?) term so I stopped it early. It had only almost equalized the extrema when stopped. ---------------------------------------------------------------------------- --- Simple Lagrange interpolation at Chebyshev points --- 4.567e-37 |''''''"''''''''''''''''''''''''''''''''''''''''''''''''"''''''| | " : : " | | : : _ _ : : | | :: : : :_ _: : : :: | | :: : : :: :: : : :: | | x: : " : : x x : : " : :x | | :: _ : : : x x : : : _ :: | | :: : : " : : " " : : " : : :: | | :: : : : _ : __ __ : _ : : : :: | --:-:-:-:---:--:---x---x---_--"xx"--_---x---x---:--:---:-:-:-:-- | : : : : : : : : : : : : : : | | : : : : : : : x " " x : : : : : : : | | : :: : : : : " " : : : : :: : | |: :: : : x " " x : : :: :| |: :: " " " " :: :| |: :: : : " " : : :: :| |: _: : : :_ :| |: : _ _ : :| |: " " :| |: :| |" "| -5.88e-37 _.............................................................._ -0.34658 0.34658 --- Lagrange/Chebyshev with coeff adjusted to 113-bit precision --- adjn n = 113 k = 3 j = 120 toter 0 --- rounded coeff --- C-1 = 2.000000000000000000000000000000000000000, /* 0x10000000000000000000000000000.0p-111 */ C0 = 0.1666666666666666666666666666666662000000, /* 0x15555555555555555555555555542.0p-115 */ C1 = -0.002777777777777777777777777777553326200000, /* -0x16c16c16c16c16c16c16c16b85140.0p-121 */ C2 = 0.00006613756613756613756613752850992610100000, /* 0x11566abc011566abc0114a7e047ee.0p-126 */ C3 = -0.000001653439153439153439150292458494169500000, /* -0x1bbd779334ef0aac6588d54fb28a1.0p-132 */ C4 = 0.00000004175351397573619780544017497215914800000, /* 0x166a8f2bf70ebd9cab06da3dfcbe9.0p-137 */ C5 = -0.000000001056838027737493956831265550261534200000, /* -0x122805d6442668558fc0ca2fcc585.0p-142 */ C6 = 2.67650730612754829770358952948658760000 E-11, /* 0x1d6db2c4e01fe52c6eb7674cf86f8.0p-148 */ C7 = -6.7793605801133740547903998874808967000 E-13, /* -0x17da4e1ebc3556f34306a709a13c6.0p-153 */ C8 = 1.71721130775699965239461146049180330000 E-14, /* 0x1355864cf343fe6e71f6644193ab0.0p-158 */ C9 = -4.34912151080995129543227548203712180000 E-16, /* -0x1f56b690b4b95dd505b8a8e6666b1.0p-164 */ C10 = 1.08203893915267000322982006799182099999 E-17, /* 0x18f33b03a36ff97329d6990c65e9c.0p-169 */ 3.741e-37 |''''''"''''''''''''''''''''''''''''''''''''''''''''''''"''''''| | : _ _ : | | :: : _ " " _ : :: | | :: :_ ": :" _: : : | | _ x : : : :: " " " " :: : : : x _ | | : : : : : : : : : : : : : : : : : : | | :" : : _ : : : : : : : : : : x _ : ": | | :: : " : : : x : : : : x : : : : : :: | | :: : : : : : : : " " : : : : : : : :: | | :: : : : : " : x : : x : " : : : : :: | | :: : : : : : : : :: : : : : :: : :: | --::-:---::---x---:---:---:----""----:---:---:---x---::---:-::-- | :: : :: : : : : : : : : :: : :: | |: : : :" : : : : : : : : "_ : : :| |: :: " : : _ : : _ : : :: :| |: :: : : : : : : : : :: :| |: :: : x : " " : x : :: :| |: :_ : : : : : : : : _: :| |: :: _: :: :: :_ :: :| |: : : _: :_ : : :| |: : _ x x _ : :| -3.37e-37 _x.."......................................................"..x_ -0.34658 0.34658 adjn n = 113 k = 3 toter 0 --- Attempts to use 64-bit coeffs for the 113-bit approximation were unsuccessful -- the error was about 5e-24, but about 2^-(113+5) = 3e-36 is needed. The interval is just too wide for this to work. Probably only the first couple of coeffs need to have full precision. Bruce
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20071210213513.Q30644>