Date: Mon, 30 May 2016 09:36:34 +0300 From: Andrey Chernov <ache@freebsd.org> To: Bruce Evans <brde@optusnet.com.au>, Conrad Meyer <cem@freebsd.org> Cc: src-committers <src-committers@freebsd.org>, svn-src-all@freebsd.org, svn-src-head@freebsd.org Subject: Re: svn commit: r300965 - head/lib/libc/stdlib Message-ID: <5985bdc1-b821-f352-0bc5-c45c600c5318@freebsd.org> In-Reply-To: <20160530122100.X924@besplex.bde.org> References: <201605291639.u4TGdSwq032144@repo.freebsd.org> <CAG6CVpXuoetY2GvV7Zonueb0TvQfRcMAHQYLXhd6yab5Mi%2BR0Q@mail.gmail.com> <20160530122100.X924@besplex.bde.org>
next in thread | previous in thread | raw e-mail | index | archive | help
On 30.05.2016 6:09, Bruce Evans wrote: > On Sun, 29 May 2016, Conrad Meyer wrote: > >> Does clang actually generate different code with this change? > > It should, on exotic arches. > >> On Sun, May 29, 2016 at 9:39 AM, Andrey A. Chernov <ache@freebsd.org> >> wrote: >>> Log: >>> Micro optimize: C standard guarantees that right shift for unsigned >>> value >>> fills left bits with zero, and we have exact 32bit unsigned value >>> (uint32_t), so there is no reason to add "& 0x7fffffff" here. > > Using uint32_t at all is an unportable pessimization. On exotic arches > that don't have native uint32_t registers, a theoretically perfect > implementation wouldn't implement uint32_t. This would expose the > brokenness of broken code, so a practical implementation would emulate > uint32_t so that the broken code would just run slower for arithmetic > and much slower for locked memory operations. > > uint32_t might be implemented not very slowly using 128-bit integer > registers, or more slowly using the 53-bit mantissa part of an IEEE > double precision floating point register. > > If uint32_t is emulated, then the compiler is forced to act as if the > code uses a longer type and does "& 0xffffffff" after every operation. > Thes extra operations can only be combined sometimes. More careful > code can use a minimal number of this or similar "&" operations. In > checksum calculations, one "&" at the end is usually enough. > >>> Modified: head/lib/libc/stdlib/random.c > > random.c was mostly written before uint32_t was standard, so it used > u_long and long. Perhaps it wasn't careful enough with the "&"s to > actually work unless u_long is precisely uint32_t and long is normal > 2's complement with benign overflow. > > Anyway, it was "fixed" (unimproved) using s/u_long/uint32_/ in most > places where the API/ABI doesn't require longs (there is 1 dubious > long left in a comment). The correct fix is s/u_long/uint_fast32_t > in most places and s/u_long/uint_least32_t/ in some places and then > fix any missing "&"'s. The "fast" and "least" types always exist, > unlike the fixed-width types, and using them asks for time/space > efficiency instead of emulated fixed-width. > > On non-exotic arches, fast == least == fixed-width, so the correct > substitution works as a quick fix even with missing "&"s. > > It is not necessary to use the newfangled standard integer types to > fix this here, since correct use of long types would work (they give > 32 bits), but long is wasteful if it actually 64 bits or longer. > > Even larger problems are looming with uintmax_t. Any code that is > careful enough to use it is likely to break or be bloated if it is > expanded. This is just like using u_long in old random(). > >>> ============================================================================== >>> >>> --- head/lib/libc/stdlib/random.c Sun May 29 16:32:56 >>> 2016 (r300964) >>> +++ head/lib/libc/stdlib/random.c Sun May 29 16:39:28 >>> 2016 (r300965) >>> @@ -430,7 +430,7 @@ random(void) >>> */ >>> f = fptr; r = rptr; >>> *f += *r; >>> - i = (*f >> 1) & 0x7fffffff; /* chucking least >>> random bit */ >>> + i = *f >> 1; /* chucking least random bit */ > > This gives an "&" to restore in the version with correct substitutions. > > It also breaks the indentation. (This file mostly indents comments to the > right of code to column 40, but column 48 was used here and now column 32 > is used.) > >>> if (++f >= end_ptr) { >>> f = state; >>> ++r; > > Bruce > I don't introduce uint32_t and int32_t here and don't have a slightest idea of which types will be better to change them. F.e. *f += *r; suppose unsigned 32bit overflow which don't naturally happens for large types. Assigning uint32_t to some large type then clip it to smaller after calculation - all of that can produce more code than save for calculation itself.
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?5985bdc1-b821-f352-0bc5-c45c600c5318>