Date: Wed, 9 May 2007 11:30:32 -0500
From: "illoai@gmail.com" <illoai@gmail.com>
To: "Ted Mittelstaedt" <tedm@toybox.placo.com>
Cc: Gary Kline <kline@tao.thought.org>, FreeBSD Mailing List <freebsd-questions@freebsd.org>, usleepless@gmail.com
Subject: Re: Another slightly OT q...
Message-ID: <d7195cff0705090930s1ea5a180t6a93d619d870da1a@mail.gmail.com>
In-Reply-To: <BMEDLGAENEKCJFGODFOCKEAOCAAA.tedm@toybox.placo.com>
References: <20070509021840.GA41793@thought.org> <BMEDLGAENEKCJFGODFOCKEAOCAAA.tedm@toybox.placo.com>
On 09/05/07, Ted Mittelstaedt <tedm@toybox.placo.com> wrote:
>
> > -----Original Message-----
> > From: owner-freebsd-questions@freebsd.org
> > [mailto:owner-freebsd-questions@freebsd.org]On Behalf Of Gary Kline
> > Sent: Tuesday, May 08, 2007 7:19 PM
> > To: usleepless@gmail.com
> > Cc: Gary Kline; FreeBSD Mailing List
> > Subject: Re: Another slightly OT q...
> >
> > So it *was* a hoax? Rats. Some weeks ago on Public
> > Broadcasting, a few sentences were spoken on the potential of
> > fractal geometry to achieve [I'm guessing] data-compression on
> > the order of what Sloot was claiming. So far, no one has figured
> > it out. It may be a dream... .
> >
>
> There's some cool math out there that explains all of this but I never liked
> math, but it isn't necessary to know the math to understand the issue. Just
> consider the problem for a while and you will realize that the compression
> ratio of a specific data stream varies dependent on the amount of repetition
> in the input datastream. A perfectly unrandom datastream, like a constant
> series of logical 1's, carries no information, but has a compression ratio
> that is infinite. A perfectly random datastream, on the other hand,
> also carries no information, but has a compression ratio that is zero.
> I believe that a datastream that is 50% of the way between either extreme
> carries the most information, and I believe your typical datastream is much
> closer to the perfectly unrandom side than the perfectly random side,
> compression is merely the process of pushing the randomness of the stream
> closer to the random side.

Actually, the more information (as such) the closer the
data stream is to perfectly random. The relationship
might be asymptotic, but I am no maths major.

> Thus, if the input datastream is very close to the perfectly unrandom side -
> meaning it has a very high amount of repetition in it, you can get some
> pretty spectacular compression ratios. But as you move closer to unrandom,
> you carry less data. So, the better applications emit datastreams that
> are less unrandom, therefore compression does not work as well on them.

I suppose this leads to the discussion about what "data"
and "information" really are.
Imagine a can. The can is data.
Imagine the can is full of worms.

> This of course is completely ignoring the other data issue, is the
> application data efficient to begin with? For example, you can transfer
> about a page of information in ASCII that consumes about 1K of data, that
> same page of information in a MS Word file consumes a hundred times that
> amount of space - Word is therefore extremely inefficient with data.

In this case, since Word "has to" replace typesetting,
layout, and formatting software, in addition to being
a word processor, the header and meta information tend
to bloat the files quite a lot.

Every few years someone comes along who makes some mad
claims about some new buzzword-enhanced compression
technology. Obviously, if there is ever a radical leap forward
in that area the theory will have to follow, since modern
theory cannot accommodate (lossless) compression past
the point of randomness (generally less than 16:1 even
for Danielle Steel).

mp3, avi, real media, mpeg, et al. are a different story
entirely, since they are lossy and optimised for their
respective information.

-rw-r--r-- 1 1705 1705 7826420 May 9 10:58 ssion_i_really_fuckin_care_about_you.rm
-rw-r--r-- 1 1705 1705 7791691 May 9 10:58 ssion_i_really_fuckin_care_about_you.rm.bz2

In this case, very slightly compressible: with some data
your resulting file will be slightly larger, yet the raw
datastream (and it looks like it was filmed from a cameraphone
here (though most likely an 8mm digicam (these, I believe,
compress on the fly, so the raw datastream never touches
tape))) would probably have been many tens, if not several
hundreds, of megabytes.

Remember life before the tweel?

--
--
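A rough way to see the point about repetition versus randomness is to run bz2 over a few synthetic inputs. This is only a sketch: the sample size, the three sample inputs, and the helper name compression_ratio are assumptions made for illustration, not anything taken from the thread.

```python
import bz2
import os


def compression_ratio(data: bytes) -> float:
    """Return original size divided by bz2-compressed size (hypothetical helper)."""
    return len(data) / len(bz2.compress(data))


SIZE = 100_000  # arbitrary sample size, chosen only for this illustration

line = b"the quick brown fox jumps over the lazy dog\n"

samples = {
    # One byte value repeated: as repetitive as input can get.
    "constant": b"\x01" * SIZE,
    # One sentence repeated until SIZE bytes are filled.
    "repeated": (line * (SIZE // len(line) + 1))[:SIZE],
    # Bytes from the OS random source: effectively incompressible.
    "random": os.urandom(SIZE),
}

ratios = {name: compression_ratio(data) for name, data in samples.items()}

for name, ratio in ratios.items():
    print(f"{name:>8}: {ratio:10.2f} : 1")
```

On a typical run both repetitive inputs shrink by a very large factor, while the random bytes come out at roughly 1:1, usually a little larger than the input once the container overhead is counted. That is the same effect visible in the .rm versus .rm.bz2 listing: already-compressed media looks close to random to a lossless compressor.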