Date: Wed, 26 Jul 2000 03:50:26 -0700 (PDT)
From: Kris Kennaway <kris@FreeBSD.org>
To: Mark Murray <mark@grondar.za>
Cc: arch@FreeBSD.org
Subject: Re: Estimating entropy
Message-ID: <Pine.BSF.4.21.0007260333360.95417-100000@freefall.freebsd.org>
In-Reply-To: <200007261019.MAA00605@grimreaper.grondar.za>
On Wed, 26 Jul 2000, Mark Murray wrote:
> > 2) Keep a frequency table and calculate or estimate the shannon entropy
> > periodically. This may be feasible if we treat the samples as 8-bit
> > sources, as you only have to loop over 256 values and calculate a log_2 of
> > the probabilities (although lack of FP in the kernel would complicate
> > this)
>
> I have been looking for articles on Shannon entropy; all I can find
> is a theorem that covers ergodic systems. Do you have any online
> references?
It's pretty simple: if you have an information source which outputs data
with a given probability distribution p(x) (technically speaking a
statistical ensemble of sources, in order to define the probability) then
the entropy content is
S = - \Int_x dx p(x) log_2 p(x)
Where \Int_x is an integral over x.
If x is a discrete variable (as it is in most/all of the cases we'll be
looking at) then replace \Int_x dx with \Sigma_x (i.e. sum over all
possible output values)
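For the 8-bit samples we were talking about, that's just a 256-entry frequency
table. Roughly something like this in userland C (the names are mine, and the
kernel version would need a fixed-point approximation of log2, since there's
no FP available there):

#include <math.h>
#include <stddef.h>

/*
 * Sketch: estimate the Shannon entropy (in bits per byte) of a buffer
 * by treating each byte as a sample from an 8-bit source.
 * Userland only; link with -lm.
 */
double
entropy8(const unsigned char *buf, size_t len)
{
        unsigned long freq[256] = { 0 };
        double S = 0.0, p;
        size_t i;

        for (i = 0; i < len; i++)
                freq[buf[i]]++;

        for (i = 0; i < 256; i++) {
                if (freq[i] == 0)
                        continue;
                p = (double)freq[i] / (double)len;
                S -= p * log(p) / M_LN2;  /* S = -sum p(x) log_2 p(x) */
        }
        return (S);                       /* 0 <= S <= 8 bits per byte */
}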
Basically, (Shannon) entropy is a measure (in bits) of how much
information you learn by performing the measurement, given that some
outcomes are more likely than others and are therefore less of a
"surprise" (you probably knew this, but for the benefit of anyone else who
might not...)
It's also the absolute minimum length a given piece of data can be
compressed to by a general-purpose, "perfect" compression algorithm, which
is why gzip is a decent metric for it.
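As a rough illustration of that metric, something like the following
(a userland sketch of my own using zlib's compressBound()/compress2(), not
anything in the kernel) gives an upper bound in bits per input byte:

#include <stdlib.h>
#include <zlib.h>

/*
 * Sketch of the "gzip metric": deflate the buffer and treat the
 * compressed size as an upper bound on its information content.
 */
double
compressed_bits_per_byte(const unsigned char *buf, uLong len)
{
        uLongf outlen = compressBound(len);
        unsigned char *out = malloc(outlen);
        double bits;

        if (out == NULL)
                return (8.0);   /* be conservative on failure */
        if (compress2(out, &outlen, buf, len, Z_BEST_COMPRESSION) != Z_OK)
                bits = 8.0;
        else
                bits = 8.0 * (double)outlen / (double)len;
        free(out);
        return (bits);
}

Because no real compressor is "perfect", this only ever overestimates the
entropy, which is why it's a bound rather than an exact figure.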
The requirement about ergodic systems basically means that the way a
single system behaves when you watch it for a long period of time is the
same as the way a large number of identical systems (in different states)
behave when observed at a single instant (the aforementioned statistical
ensemble assumption). There's also another implicit assumption of a
"stationary" distribution, which basically means it isn't time-varying: if
the probability distribution changes over time, then Shannon entropy will
not be a true measure of the information content of the source, because it
doesn't take into account temporal patterns which may decrease the actual
information content.
See the code I posted a few days ago for a C implementation that
calculates the entropy of the PCM input; it should make the above
clearer.
Kris
--
In God we Trust -- all others must submit an X.509 certificate.
-- Charles Forsythe <forsythe@alum.mit.edu>
