Date: Tue, 31 May 2016 13:42:23 +1000 (EST)
From: Bruce Evans
To: Andrey Chernov
cc: Bruce Evans, Conrad Meyer, src-committers, svn-src-all@freebsd.org,
    svn-src-head@freebsd.org
Subject: Re: svn commit: r300965 - head/lib/libc/stdlib
In-Reply-To: <5985bdc1-b821-f352-0bc5-c45c600c5318@freebsd.org>
Message-ID: <20160531130326.G1052@besplex.bde.org>
References: <201605291639.u4TGdSwq032144@repo.freebsd.org>
    <20160530122100.X924@besplex.bde.org>
    <5985bdc1-b821-f352-0bc5-c45c600c5318@freebsd.org>

On Mon, 30 May 2016, Andrey Chernov wrote:

> On 30.05.2016 6:09, Bruce Evans wrote:
>> ... The correct fix is s/u_long/uint_fast32_t
>> in most places and s/u_long/uint_least32_t/ in some places and then
>> fix any missing "&"'s.  The "fast" and "least" types always exist,
>> unlike the fixed-width types, and using them asks for time/space
>> efficiency instead of emulated fixed-width.
>> ... [That was the correct fix for longs long ago, not your change here.]
>>>> ==============================================================================
>>>>
>>>> --- head/lib/libc/stdlib/random.c    Sun May 29 16:32:56 2016    (r300964)
>>>> +++ head/lib/libc/stdlib/random.c    Sun May 29 16:39:28 2016    (r300965)
>>>> @@ -430,7 +430,7 @@ random(void)
>>>>           */
>>>>          f = fptr; r = rptr;
>>>>          *f += *r;
>>>> -        i = (*f >> 1) & 0x7fffffff;    /* chucking least random bit */
>>>> +        i = *f >> 1;    /* chucking least random bit */
>>
>> This gives an "&" to restore in the version with correct substitutions.
>>
>> It also breaks the indentation.  (This file mostly indents comments to the
>> right of code to column 40, but column 48 was used here and now column 32
>> is used.)
>>
>>>>          if (++f >= end_ptr) {
>>>>              f = state;
>>>>              ++r;
>
> I don't introduce uint32_t and int32_t here and don't have a slightest
> idea of which types will be better to change them. F.e. *f += *r;
> suppose unsigned 32bit overflow which don't naturally happens for large
> types. Assigning uint32_t to some large type then clip it to smaller
> after calculation - all of that can produce more code than save for
> calculation itself.

Er, I already said which types are better -- [u]int_fast32_t here.  For
*f += *r, it is then quite possible that clipping doesn't occur.  The
calculations should be done as much as possible in the natural register
width and clipped only once at the end if possible.  Here I think the
addition gives only 1 extra bit and the right shift in the next step
immediately removes 1 bit and that is all the calculation does, so it is
not possible to combine masking steps.
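
As a rough illustration of the substitution described above, here is a
minimal sketch, not the code in random.c.  The function name step() and
the choice of uint_least32_t for the stored state are assumptions made for
this example only.  The arithmetic is done in uint_fast32_t, which may be
wider than 32 bits, so the 32-bit wraparound has to be requested with an
explicit "&".

```c
#include <stdint.h>

/*
 * Hypothetical helper (not from random.c): one additive-feedback step.
 * State words are stored as uint_least32_t; the sum is formed in
 * uint_fast32_t and masked once to emulate 32-bit wraparound.
 */
static uint_fast32_t
step(uint_least32_t *f, const uint_least32_t *r)
{
	uint_fast32_t sum;

	/* The sum may carry into bit 32 if uint_fast32_t is wider. */
	sum = ((uint_fast32_t)*f + *r) & UINT32_C(0xffffffff);
	*f = (uint_least32_t)sum;

	/* Chuck the least random bit; the result fits in 31 bits. */
	return (sum >> 1);
}
```

With the sum already masked to 32 bits, the shifted result needs no
further mask in this sketch.  If the stored word were left unmasked in a
wider type, the "& 0x7fffffff" on the shifted value would be needed again.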

I have considerable experience using wide registers optimally in i386
(i387) FP code in libm.  Without SSE, FP calculations can only be done in
the i387.  Clipping the extra precision after every step was only about 5
times slower on old CPUs with 0 or 1 pipelines, but it is several times
slower than that with more pipelines.

C has poor bindings related to this.  It requires clipping after every
cast and assignment.  This is too slow, so gcc and clang don't do it.  To
get code that is both fast and correct, it is best to use float_t and
double_t a lot, so that almost all calculations are done in the wide
registers.  This corresponds to using int_fastN_t instead of intN_t, int
or long.  Clipping steps are still unfortunately necessary to match APIs
and ABIs, and very rarely to discard extra bits because they are really
not wanted.

With SSE, clipping after every step is only 2-3 times slower, but it is
not necessary to widen for any step.  However, not widening gives less
accuracy in most cases.

Bruce
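
The float_t/double_t advice above can be sketched roughly as follows.
This is an illustrative example only; sum_squares() is a hypothetical
function, not anything in libm.  Intermediates are kept in double_t, which
<math.h> defines as the type used to evaluate double expressions (it may
be wider than double), and the result is narrowed to double once, at the
API boundary.

```c
#include <math.h>

/*
 * Hypothetical example (not from libm): accumulate in double_t so that
 * no clipping to double is requested inside the loop.
 */
static double
sum_squares(const double *x, int n)
{
	double_t acc = 0;
	int i;

	for (i = 0; i < n; i++)
		acc += (double_t)x[i] * x[i];

	/* Single clipping step, needed only to match the return type. */
	return ((double)acc);
}
```

On targets where double_t is the same as double, this compiles to the
same code as the plain double version, so nothing is lost by writing it
this way.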