Date: Thu, 4 Jul 2013 12:47:54 +1000 (EST) From: Bruce Evans <brde@optusnet.com.au> To: "Andrey A. Chernov" <ache@FreeBSD.org> Cc: svn-src-head@FreeBSD.org, svn-src-all@FreeBSD.org, src-committers@FreeBSD.org Subject: Re: svn commit: r252608 - in head: include lib/libc/stdlib Message-ID: <20130704120336.G1176@besplex.bde.org> In-Reply-To: <201307032121.r63LLtkk022011@svn.freebsd.org> References: <201307032121.r63LLtkk022011@svn.freebsd.org>
next in thread | previous in thread | raw e-mail | index | archive | help
On Wed, 3 Jul 2013, Andrey A. Chernov wrote: > Log: > 1) POSIX requires rand(3) return values to be in the [0, RAND_MAX] range, > but ACM formula we use have internal state (and return value) in the > [1, 0x7ffffffe] range, so our RAND_MAX (0x7fffffff) is never reached > because it is off by one, zero is not reached too. > > Correct both RAND_MAX and rand(3) return value, shifting last one > to the 0 by 1 subtracted, resulting POSIXed [0, 0x7ffffffd(=new RAND_MAX)] > range. > > 2) Add a checks for not overflowing on too big seeds. It may happens on > the machines, where sizeof(unsigned int) > 32 bits. > > Reviewed by: bde [1] > MFC after: 2 weeks Er, I think it is too dangerous to change either RAND_MAX or the offset without more preparation: - increasing the range returned (and increasing RAND_MAX to match) would obviously be binary-incompatible. Old binaries may have the old RAND_MAX built in to them, but will call the new rand(). They would be broken if the new rand() returned a value larger than the old RAND_MAX. But this change only reduces RAND_MAX. RAND_MAX was already 1 higher than could be returned. This change expands the range at the low end, so that 0 is now returned, but returning it was always possible and should have occurred. - changing the offset is more of a problem. It changes the sequence generated by each fixed seed. Of course, the sequence is not guaranteed to be repeated after all system changes. The C90/C99 specification is actually unusably fuzzy about this. C99 (n869.txt) says: %%% [#2] The srand function uses the argument as a seed for a new sequence of pseudo-random numbers to be returned by subsequent calls to rand. If srand is then called with the same seed value, the sequence of pseudo-random numbers shall be repeated. If rand is called before any calls to srand have been made, the same sequence shall be generated as when srand is first called with a seed value of 1. %%% Perahps this only says that the sequence shall be repeated within each "execution" of a C program, but that is not very useful. This is not fixed in POSIX, at least in old drafts. POSIX should at least say that the sequence shall be repeated across the lifetime of a single process (it can be more specific about "execution"). But to be useful, the repeatability must be much more than that (certainly across reboots, which is already more than POSIX can explicitly specify). > Modified: head/lib/libc/stdlib/rand.c > ============================================================================== > --- head/lib/libc/stdlib/rand.c Wed Jul 3 21:14:57 2013 (r252607) > +++ head/lib/libc/stdlib/rand.c Wed Jul 3 21:21:54 2013 (r252608) > @@ -84,6 +84,10 @@ int > rand_r(unsigned int *ctx) > { > u_long val = (u_long) *ctx; > +#ifndef USE_WEAK_SEEDING > + /* Transform to [1, 0x7ffffffe] range. */ > + val = (val % 0x7ffffffe) + 1; > +#endif > int r = do_rand(&val); > > *ctx = (unsigned int) val; Style bugs: - old: initializations in declarations. The one for 'r' is especially bad. Almost all of the work for the function is done in the initialization. - new: initialization not even in declarations. There is now a statement in the middle of the declarations. This wouldn't even compile in C90. This is collateral with the old style bugs -- when almost the whole function is written in initializers, it is difficult to insert statements into it without increasing the mess. - old: bogus mix of spellings of "unsigned". Here u_ is used in one place and "unsigned" in another place. rand.c uses "unsigned" with "long" in most places, but it is the "unsigned" form that is the style bug -- in rev.1.1, u_int and u_long were used consistently. > @@ -125,6 +133,10 @@ sranddev() > mib[0] = CTL_KERN; > mib[1] = KERN_ARND; > sysctl(mib, 2, (void *)&next, &len, NULL, 0); > +#ifndef USE_WEAK_SEEDING > + /* Transform to [1, 0x7ffffffe] range. */ > + next = (next % 0x7ffffffe) + 1; > +#endif > } Previous breakage of this is still not fixed. Old versions used /dev/random and had error handling involving use of the current time when _read() failed. They also spelled read() correctly, so that the Standard library function rand() is not broken is the application is a pure C90 one that supplies its own read(). The above has doesn't even have error checking, and is broken if the application supplies its own sysctl(). The old versions had all these bugs nested in their error handling, however -- they used gettimeofday() with an unsafe spelling and had no error checking for it. Stack garbage for the failing syscalls is not too bad for a random number. Style bug: the API name 'sranddev()' is bogus for a function that doesn't used /dev/random like it used to. Bruce
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20130704120336.G1176>