From owner-freebsd-current  Sun Oct  6 17:59:36 1996
Return-Path: owner-current
Received: (from root@localhost)
          by freefall.freebsd.org (8.7.5/8.7.3) id RAA23181
          for current-outgoing; Sun, 6 Oct 1996 17:59:36 -0700 (PDT)
Received: from phaeton.artisoft.com (phaeton.Artisoft.COM [198.17.250.211])
          by freefall.freebsd.org (8.7.5/8.7.3) with SMTP id RAA23175;
          Sun, 6 Oct 1996 17:59:32 -0700 (PDT)
Received: (from terry@localhost) by phaeton.artisoft.com (8.6.11/8.6.9) id RAA13128; Sun, 6 Oct 1996 17:56:37 -0700
From: Terry Lambert <terry@lambert.org>
Message-Id: <199610070056.RAA13128@phaeton.artisoft.com>
Subject: Re: I plan to change random() for -current (was Re: rand() and random())
To: ache@nagual.ru (=?KOI8-R?Q?=E1=CE=C4=D2=C5=CA_=FE=C5=D2=CE=CF=D7?=)
Date: Sun, 6 Oct 1996 17:56:37 -0700 (MST)
Cc: terry@lambert.org, joerg_wunsch@uriah.heep.sax.de,
        freebsd-hackers@freebsd.org, current@freebsd.org
In-Reply-To: <199610052204.CAA07197@nagual.ru> from "=?KOI8-R?Q?=E1=CE=C4=D2=C5=CA_=FE=C5=D2=CE=CF=D7?=" at Oct 6, 96 02:04:17 am
X-Mailer: ELM [version 2.4 PL24]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-current@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

> > There is a historical dependence of much physics code on the
> > repeatability of identical seeding for the linear congruential
> > generator as a "randomness" base for repeatable Monte Carlo based
> > testing of relativistically invariant P-P, N-P, and N-N pair production
> > collisions.
> 
> The fix _not_ breaks repeatability of identical seeding.

Repeatability means identical results compared to historical values
for the same interface.

> > If you *do* change the random algorithms, then you should *leave the
> > rand48() code alone*.  I can not stress this enough.  You will damage
> > repeatability of experiments for which source code is unavailable, and
> > only the results remain.
> 
> I don't understand your statement well, random() already have different
> implementations in different OSes. If you mean that previous FreeBSD
> dynamic-linked binaries can produce different results, yes, it is
> any upgrade cost. Make static binaries if source code is unavailable.

Random is not random.  Random is pseudo-random.  I think what is being
forgotton is that pseudo-randomness is useful because of its repeatability
in many, many circumstances.

> Depending on predictable system function results which claimed to
> be 'random' is bad idea in general (and mans/docs/standards
> not declare such possibility too). They only say that "this function
> [not all possible versions of this function]
> gives the same sequence for the same seed". Real practice when
> rand() and random() functions changes between different OSes
> and inside one OS too confirms it. I remember that Unix v6 rand()
> was different with what we have currently, so we must return
> to Unix v6 variant according to your logic.

The code in question is from the Berkeley Physics package, in FORTRAN,
for generation of relativistically invariant pair production events.

I would be happy if you would keep BSD compatibility, since BSD UNIX is
where the code was written to run.

The point is not repeatability, per se.  It is that the event stream
will be identical for a given set of N events for a given physics.  The
intent of doing this is to ensure that there is no statistical variance
introduced by the period of the generator.

The particular code in question uses the 48 bit linear congruential
method.  However, it is reasonable to presume that similar code exists
for any given interface dependency.
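
As a rough illustration of what "the 48 bit linear congruential method"
means here, the sketch below follows the recurrence POSIX documents for
the drand48() family: X' = (a * X + c) mod 2^48, with a = 0x5DEECE66D and
c = 0xB, seeded srand48()-style.  The names Lcg48 and stream are made up
for this example, and this is not the generator code from the physics
package itself; it only shows why a fixed seed pins down the whole stream.

```python
# Sketch of a 48-bit linear congruential generator using the constants
# POSIX documents for drand48().  Lcg48 and stream are hypothetical names.
A = 0x5DEECE66D          # multiplier
C = 0xB                  # increment
MASK = (1 << 48) - 1     # reduce modulo 2**48


class Lcg48:
    def __init__(self, seed):
        # srand48() convention: seed fills the high 32 bits of the state,
        # the low 16 bits are set to 0x330E.
        self.state = ((seed & 0xFFFFFFFF) << 16) | 0x330E

    def next_float(self):
        # Advance the state and scale it into [0.0, 1.0).
        self.state = (A * self.state + C) & MASK
        return self.state / float(1 << 48)


def stream(seed, n):
    """Return the first n values produced for a given seed."""
    gen = Lcg48(seed)
    return [gen.next_float() for _ in range(n)]
```

Two calls such as stream(1996, 5) return identical lists, today or in
15 years, as long as nobody changes the constants or the seeding rule.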

The point is that in 15 years, I can rerun the same event set with a
different physics and get the same event data, to which I then apply the
physics I am testing in order to constrain the allowable events.

It is statistically *important* to know how many events, out of 100
million events, are disallowed by a given constraint.
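
A toy version of that comparison, assuming a 48-bit linear congruential
generator with the POSIX drand48() constants and two entirely invented
constraints (model_a and model_b stand in for "a given physics"; the
function names are hypothetical too).  Because both runs consume the
same event stream, any difference in the counts comes from the
constraint and not from the generator:

```python
# Sketch: apply two different constraints to the same pseudo-random
# event stream and count the rejected events.  All names are hypothetical.

def events(seed, n):
    """Yield n pseudo-random 'events' in [0, 1) from a 48-bit LCG."""
    state = ((seed & 0xFFFFFFFF) << 16) | 0x330E   # srand48()-style seeding
    for _ in range(n):
        state = (0x5DEECE66D * state + 0xB) & ((1 << 48) - 1)
        yield state / float(1 << 48)


def count_disallowed(seed, n, allowed):
    """Count events in the stream that the constraint rejects."""
    return sum(1 for e in events(seed, n) if not allowed(e))


def model_a(event):
    # Invented constraint standing in for one physics model.
    return event < 0.9


def model_b(event):
    # Invented, stricter constraint standing in for another model.
    return event < 0.8
```

Calling count_disallowed(42, 1000, model_a) and then
count_disallowed(42, 1000, model_b) compares the two models against
identical events; change the generator underneath and the old counts
can no longer be reproduced.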

As an example, a recent run of the code with a set of "Dion" physics
constraints checked some laboratory experiments dealing with identifying
the energy range of the carrier of the weak force to three decimal
places.  This gives the theoretical model a very high probability of
being a correct model (as it happens, the same model predicted the
W particle more than 8 years before it was experimentally discovered).


For any pseudo-random generator, code is equally likely to depend on
the "pseudo" as it is to depend on the "random".  Any change to either
bears a great deal of consideration.

I personally have optics code that depends on the pseudo-randomness of
the generator to create point origin vectors for testing theoretical
chromatic aberration, and correcting aberration in real optical systems
with CCD collectors.  Among other things, it's used to remove aberration
effects from the lens correction package when processing raw Hubble
telescope data to look for extrasolar planets.


In point of fact, you are suggesting "correcting" a "problem" because
you want a different (not better) random distribution than what you
currently get.

I respectfully suggest that you should consider packing around your
own random number generator with the code that needs the different
distribution, rather than munging the existing code.  Historical
behaviour of pseudo-random library services is a topic requiring a
*lot* of care before changes are introduced.  I really haven't seen
what I would consider enough thought or discussion to merit a change.

As always: my opinions.


					Regards,
					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.