From owner-freebsd-current  Sat Oct  5 04:25:07 1996
Return-Path: owner-current
Received: (from root@localhost)
          by freefall.freebsd.org (8.7.5/8.7.3) id EAA09123
          for current-outgoing; Sat, 5 Oct 1996 04:25:07 -0700 (PDT)
Received: from godzilla.zeta.org.au (godzilla.zeta.org.au [203.2.228.19])
          by freefall.freebsd.org (8.7.5/8.7.3) with ESMTP id EAA09052;
          Sat, 5 Oct 1996 04:24:41 -0700 (PDT)
Received: (from bde@localhost) by godzilla.zeta.org.au (8.7.6/8.6.9) id VAA21012; Sat, 5 Oct 1996 21:15:05 +1000
Date: Sat, 5 Oct 1996 21:15:05 +1000
From: Bruce Evans <bde@zeta.org.au>
Message-Id: <199610051115.VAA21012@godzilla.zeta.org.au>
To: ache@nagual.ru, joerg_wunsch@uriah.heep.sax.de
Subject: Re: I plan to change random() for -current (was Re: rand() and random())
Cc: current@FreeBSD.org, freebsd-hackers@FreeBSD.org
Sender: owner-current@FreeBSD.org
X-Loop: FreeBSD.org
Precedence: bulk

>> I don't think so.  But they are much better with random().  See xmine,
>> if you wanna get a nice example.  It generates totally predictable
>> layouts when using rand().
>
>Totally predictable layouts not rand() illness only but random() too.
>It not depends well on different initial state, producing the same
>sequence. I finally dig out initial posting (below).

It says that the 0th bit is the most non-random.  In fact, it is obvious
from the formula for rand() that

	(rand() & 1) == previous_value_returned_by_rand ^ 1

:-(.

The posting also says that the bad values for rand() make the seeding
for random() bad.  This seems reasonable.

>IMHO we need to change our random() as suggested.

How do you know that the suggested method is better?

---

>From: warwick@cs.uq.edu.au (Warwick Allison)
>...
>warner@rmcs.cranfield.ac.uk (Alistair (Joe) Warner) writes:
>
>>    BSD:    state[i] = 1103515245 * state[i - 1] + 12345;
>>    Linux:  state[i] = 1103515145 * state[i - 1] + 12345;

ISO C example:    next = 1103515245 * next         + 12345;
                  return (unsigned int)(next / 65536) % 32768;

I.e., it returns bits 16-31 of the current state (right shifted 16).  This
is said to be better.  Folklore says that someone broke rand() by not
discarding the low bits when ints became 32 bits.

>value will make NO DIFFERENCE.  The change I give below is of the type
>required for success.
>
>
>#include <stdlib.h>
>
>#define LOOP 200
>#define ITER 200
>
>main()
>{
>        int s=time(0);
>        int i,l,m,j;
>
>        int seed=0;
>
>        printf("P1\n%d %d\n",ITER,LOOP);
>        for (l=0; l<LOOP; l++) {
>                srandom(seed); seed+=1;
>                for (i=0; i<ITER; i++) {
>                        int b=(random()>>8)&1; /* remove >>8 for WORSE */
>                        printf("%d\n",b);
>                }
>        }
>}

Similarly for rand().  You can throw away the lowest 16 bits yourself
to get a rand() no worse than the example in the ISO C standard.  This
is said to be bad, but I think the badness is more in the brokenness as
specified (the number of starting states is limited by the srand()
interface) and in the limited period (which is inherent in the LCRNG
implementation) than in the coefficients for the LCRNG.

>>To: bug-glibc@prep.ai.mit.edu
>>cc:
>>Subject: PARTIAL FIX FOR: srandom() in libc
>--------
>...
>Nothing can be done to change the EXTREME uniformity of the resulting
>image, as the code in srandom() makes a fundamental mistake: it uses a
>power of two as the modulo which is mathematically a `poor' choice. 
>Code I suggest is better uses the largest 31-bit prime (2^31-1), which
>is a `good' choice mentioned in the literature (I can find references,
>if that would be useful) - the image resulting from the above is then

According to Knuth, moduli of the form 2^n+1 and 2^n-1 have the advantage
that the low order bits are just as random as the high order ones.
However, "In most applications, the low-order bits are insignificant,
and the choice m=w [modulus = 2^n] is quite satisfactory - provided the
programmer using the random numbers does so wisely".  In the ISO example,
it is actually an advantage to have randomness concentrated in the high
order bits (is it?).  ISO only guarantees 16 bit integers, so a portably
example can only return 15 bits of randomness, so the low bits are not
useful; OTOH, longs have to be used to get a reasonably large period -
16 bit ints are too small.

>! 	  /*
>! 	   * Implements the following, without overflowing 31 bits:
>! 	   *
>! 	   *     state[i] = (16807 * state[i - 1]) % 2147483647;
>! 	   *
>! 	   *     2^31-1 (prime) = 2147483647 = 127773*16807+2836
>! 	   */
>! 	  long int hi = state[i-1] / 127773;
>! 	  long int lo = state[i-1] % 127773;
>! 	  long int test = 16807*lo - 2836*hi;
>! 	  state[i] = test + (test<0 ? 2147483647 : 0);

This method is also found in the BSD4.4Lite libkern/rand.c.  I guess it
can be trusted (as much as the BSD rand.c :-).

Bruce