Date: Tue, 30 Sep 2014 17:18:45 +1000 (EST) From: Bruce Evans <brde@optusnet.com.au> To: John Baldwin <jhb@freebsd.org> Cc: svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org, Colin Percival <cperciva@freebsd.org> Subject: Re: svn commit: r272207 - in head/games: factor primes Message-ID: <20140930161447.S1804@besplex.bde.org> In-Reply-To: <5561525.GofWoxIbJB@ralph.baldwin.cx> References: <201409270900.s8R90dWl029070@svn.freebsd.org> <1576403.4iOOFWFkUs@ralph.baldwin.cx> <5426B4EC.9040102@freebsd.org> <5561525.GofWoxIbJB@ralph.baldwin.cx>
next in thread | previous in thread | raw e-mail | index | archive | help
On Mon, 29 Sep 2014, John Baldwin wrote: > On Saturday, September 27, 2014 06:00:28 AM Colin Percival wrote: >> On 09/27/14 05:52, John Baldwin wrote: >>> On Saturday, September 27, 2014 09:00:39 AM Colin Percival wrote: >>>> #define BIG ULONG_MAX /* largest value will sieve */ >>> >>> Should this be UINT64_MAX (or however that is spelled) instead of >>> ULONG_MAX >>> now? (or is it even still used? I know your change removed its use in at >>> least one place.) >> >> It's not used any more. I was resisting the urge to spend time going >> through and removing ancient history (which is why I kept ubig instead of >> changing it to uint64_t everywhere). > > It's kind of misleading though as the value is wrong and the comment for it no > longer applies. How about this: > > Index: primes.c > =================================================================== > --- primes.c (revision 272281) > +++ primes.c (working copy) > @@ -169,7 +169,7 @@ > > /* > * read_num_buf -- > - * This routine returns a number n, where 0 <= n && n <= BIG. > + * This routine returns a non-negative number n > */ Syntax error (lost punctuation). > static ubig > read_num_buf(void) The comment is almost completely useless now. It is fairly obvious that the function returns a non-negative number. It cannot return anything else, since its return type is an unsigned integer type. The main action of this function is not commented on. It reads a number from stdin. The '_buf' in the name of this function is garbage. The only buffers involved are a local buffer in the function and any buffer for stdin. These are implementation details. It looks like an old version did the reading from stdin and passed the buffer as an arg. 'n' in the description is unused now. > @@ -214,7 +214,7 @@ > /* > * A number of systems can not convert double values into unsigned > * longs when the values are larger than the largest signed value. > - * We don't have this problem, so we can go all the way to BIG. > + * We don't have this problem, so we can go all the way. > */ > if (start < 3) { > start = (ubig)2; All the way where? This comment, and the code, is much more broken than the above. First, it is machine-dependent. Some systems, including ones supported by FreeBSD but probably not ones supported by 4.4BSD when the comment was written, do have a problem. Second, the problem is with conversion in the opposite direction. When this comment was written, most systems has only 32-bit unsigned longs and it was just impossible to convert large 53-bit doubles values to them without producing garbage. (I think the behaviour is on overflow of such conversions is undefined. gcc at least used to produce very machine-dependent garbage.) However, this program only supports numbers up to ULONG_MAX. It converts these to doubles and adds a few, and needs these calculations to be exact. Since 53 is much larger than 32, they used to be exact. It never converts in the opposite direction. So the comment was sort of backwards. After fixing only the backwardness, the comment remains machine-dependent. It became false with 64-bit ulongs (since 64 > 53). However, the limit is now SIEVE_MAX (or something near the square of this?), not UBIG = UINT64_MAX. Provided SIEVE_MAX (squared?) fits in strictly less than 53 bits, the comment corrected to give the limit of SIEVE_MAX (squared?) becomes correct again, and also not machine-dependent, provided no one expands SIEVE_MAX. Here is some (all?) of the buggy code that uses doubles: very old version: % /* % * sieve for primes 17 and higher % */ % /* note highest useful factor and sieve spot */ % if (stop-start > TABSIZE+TABSIZE) { % tab_lim = &table[TABSIZE]; /* sieve it all */ % fact_lim = (int)sqrt( % (double)(start)+TABSIZE+TABSIZE+1.0); % } else { % tab_lim = &table[(stop-start)/2]; /* partial sieve */ % fact_lim = (int)sqrt((double)(stop)+1.0); % } This has excessive casts and other style bugs, but the casts to double make it clear where the doubles are. When 'start' is larger than 2**53, converting it to double loses precision and adding TABSIZE, etc., to it might have no effect. Thus the value of sqrt() might be wrong. Casting this value to signed int is unecessarily broken, but works in most cases where the double calculation isn't already broken (the int cast breaks at 2**31, but the sqrt() calculation breaks at 2**26.5). Similarly for 'stop', except only 1 is added so the addition is even more likely to have no effect. The previous version has the bogus casts removed and some line-splitting style bugs fixed. It still has the precision bugs and no spaces around binary operators: % /* % * sieve for primes 17 and higher % */ % /* note highest useful factor and sieve spot */ % if (stop-start > TABSIZE+TABSIZE) { % tab_lim = &table[TABSIZE]; /* sieve it all */ % fact_lim = sqrt(start+1.0+TABSIZE+TABSIZE); % } else { % tab_lim = &table[(stop-start)/2]; /* partial sieve */ % fact_lim = sqrt(stop+1.0); % } I don't have the latest version to quote. Perhaps it modified these sqrt() calculations. Using floating point is still the easiest way to determine the limit. To actually find the correct limit for 64-bit numbers, this could use sqrt() and then add a few for safety, or square the approximate square root and increase it by 1 at a time until it is large enough. > Index: primes.h > =================================================================== > --- primes.h (revision 272281) > +++ primes.h (working copy) > @@ -45,7 +45,6 @@ > > /* ubig is the type that holds a large unsigned value */ It would be more useful to say that it holds the largest supported unsigned value, not just 1 large value. > typedef uint64_t ubig; /* must be >=32 bit unsigned value */ This comment was last correct when no machine had 64-bit longs. Except it was nonsense then too (ubig is a type, not a value). The previous comment says the same thing more clearly and without any machine-dependent limites, unless there is some magic requiring the type being at least 32 bits even when "a large unsigned value" takes less than 32 bits. > -#define BIG ULONG_MAX /* largest value will sieve */ > > /* bytes in sieve table (must be > 3*5*7*11) */ > #define TABSIZE 256*1024 Bruce
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20140930161447.S1804>