Date: Mon, 14 Jan 2019 10:05:00 -0500 From: Pedro Giffuni <pfg@FreeBSD.org> To: Bruce Evans <brde@optusnet.com.au> Cc: sgk@troutmask.apl.washington.edu, freebsd-numerics@freebsd.org Subject: Re: Undefined Behavior in lib/msun/src/e_pow.c (was Re: New math library from ARM) Message-ID: <6de27877-aad0-71f3-e118-25174a69b9b1@FreeBSD.org> In-Reply-To: <20190114231414.K1062@besplex.bde.org> References: <797a7755-db93-1b9c-f3b9-8850d948e098@FreeBSD.org> <20181231151904.GB823@troutmask.apl.washington.edu> <20181231152230.GC823@troutmask.apl.washington.edu> <06c8b6a2-ed26-f255-3947-c79b593a9dea@FreeBSD.org> <20190101045425.GA5767@troutmask.apl.washington.edu> <d01c8770-5431-0634-4ecc-37069ff76458@FreeBSD.org> <20190114231414.K1062@besplex.bde.org>
next in thread | previous in thread | raw e-mail | index | archive | help
On 1/14/19 7:52 AM, Bruce Evans wrote: > On Tue, 1 Jan 2019, Pedro Giffuni wrote: > >> On 31/12/2018 23:54, Steve Kargl wrote: >>> On Mon, Dec 31, 2018 at 10:32:06PM -0500, Pedro Giffuni wrote: >>>> Hmm ... >>>> >>>> Looking at the changes in musl's libc I found this issue which seems >>>> real (although somewhat theoretical): >>>> >>>> https://git.musl-libc.org/cgit/musl/commit/src/math?id=688d3da0f1730daddbc954bbc2d27cc96ceee04c >>>> >>>> >>>> Is the attached patch acceptable? >>>> >>>> Also, their code is bit different here: >>>> >>>> https://git.musl-libc.org/cgit/musl/commit/src/math?id=282b1cd26649d69de038111f5876853df6ddc345 >>>> >>>> >>>> but we may also have to check fmaf(-0x1.26524ep-54, -0x1.cb7868p+11, >>>> 0x1.d10f5ep-29). >>>> >>>> Cheers, >>>> >>>> Pedro. >>>> >>>> Index: lib/msun/src/e_pow.c >>>> =================================================================== >>>> --- lib/msun/src/e_pow.c (revision 342665) >>>> +++ lib/msun/src/e_pow.c (working copy) >>>> @@ -130,6 +130,7 @@ >>>> if(hx<0) { >>>> if(iy>=0x43400000) yisint = 2; /* even integer y */ >>>> else if(iy>=0x3ff00000) { >>>> + uint32_t j; /* Avoid UB in bit operations below. */ >>>> k = (iy>>20)-0x3ff; /* exponent */ >>>> if(k>20) { >>>> j = ly>>(52-k); > > I was away. > > I see you committed a better version (with a cast). > Yes, I did what seemed to be the least invasive change, but then the problem with duct tape is that it may work well enough to forget about the issue. I haven't MFC'd it while you guys decided on a better solution. > The correct fix is more like the changing the file scope i and j from > int32_t > to u[_]int32_t. j is mostly used as a temporary to hold pure bits. > Its top > bit is usually not set, so plain int32_t works for it. The only other > place > where its top bit can obviously be set is in an EXTRACT_WORDS(j,i,z). > Then > both j and i can have their top bit set. EXTRACT_WORDS() into h[xy] and > l[xy] is also sloppy. There l[xy] is unsigned, but h[xy] is signed. The > top bit in the high word is soon discarded using ix = hx&0x7fffffff, etc. > Later, the sign is tested using hx<0 and this is the only excuse for hx > being signed. fdlibm does this quite often. It is a manual optimization > for ~30-40 year old compilers that couldn't turn (hx&8000000) != 0 into a > sign test iff the sign test is optimal. > The duct-tape cast still looks good enough ;). > Other style bugs in the above: > >>> I'll defer to Bruce on this. My only comments are >>> 1) declarations belong at the top of the file where all declarations >>> occur > > 1a) missing blank line after declaration. > >> Modern standards let you declare variables within blocks. For style >> we do prefer to start the function with them but in this case we can't. > > K&R 1978 C allows declaring variables within blocks. > > You can't declare j again in function scope since it is already there > with > a different type. > >>> 2) j is already declared as int32_t >> And it has to be a signed integer: we later check for negative values. > > I think that is only when j is misused later for EXTRACT_WORDS(). > > 2a,2b) Redeclaring j with a different type in an inner block is an even > larger style bug/obfuscation than declaring a new variable or redeclaring > j with the same type there. Redeclaring shadows it and this bug is > detected by Wshadow. Redeclaring it with a different type (as needed to > work) is especially obscure shadowing. > >>> 3) uint32_t should be written as u_int32_t. >> >> No, uint32_t is the standard type, u_int32_t is a BSDism. Also, the >> file consistently uses uint32_t. > > No, see previous replies. 3) is correct. > Yes, I acknowledged this. I was looking at the musl version which replaced all the u_int32_t's. >>> Are you sure that UB occurs? Or, is this an attempt to >>> placate a static analysis tool that only see shifting of >>> a signed type? Do you need to make a similar change to >>> e_powf.c? >> >> The point of the Undefined Behavior is precisely that it doesn't >> occur: we are lucky and the compiler behaves as it always has but the >> standard will let a different compiler behave differently and then we >> will have surprises. > > No, UB does occur, but is harmless. ly is unsigned and can have its top > bit set. j=ly>>(52-k) shifts right a few bits so doesn't overflow on > assignment to signed j. But then j<<(52-k) gives UB attempting to > recover > the top bit of ly gives UB in all cases where the top bit is nonzero. > > The else clause doesn't have this problem, since it shifts iy instead > of ly > and iy never has its top bit set. > > e_powf.c doesn't have this problem, since there is only 1 32-bit word for > float precision and it is handled like the top word for double precision > (only its mantissa bits are shifted). > > However, it is easier to use unsigned shifts than to see that signed > shifts > have defined behaviour. musl did well to only detect and/or fix the 1 > case > where UB behaviour clearly occurs. > Thanks for the feedback! For now I would prefer we focus more on whatever the ARM lib does better. Pedro.
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?6de27877-aad0-71f3-e118-25174a69b9b1>