From: "illoai@gmail.com"
Date: Wed, 9 May 2007 11:30:32 -0500
To: "Ted Mittelstaedt"
References: <20070509021840.GA41793@thought.org>
Cc: Gary Kline, FreeBSD Mailing List,
    usleepless@gmail.com
Subject: Re: Another slightly OT q...
List-Id: User questions

On 09/05/07, Ted Mittelstaedt wrote:
>
> > -----Original Message-----
> > From: owner-freebsd-questions@freebsd.org
> > [mailto:owner-freebsd-questions@freebsd.org]On Behalf Of Gary Kline
> > Sent: Tuesday, May 08, 2007 7:19 PM
> > To: usleepless@gmail.com
> > Cc: Gary Kline; FreeBSD Mailing List
> > Subject: Re: Another slightly OT q...
> >
> > So it *was* a hoax? Rats. Some weeks ago on Public
> > Broadcasting, a few sentences were spoken on the potential of
> > fractal geometry to achieve [I'm guessing] data-compression on
> > the order of what Sloot was claiming. So far, no one has figured
> > it out. It may be a dream... .
> >
>
> There's some cool math out there that explains all of this but I never liked
> math, but it isn't necessary to know the math to understand the issue. Just
> consider the problem for a while and you will realize that the compression
> ratio of a specific data stream varies dependent on the amount of repetition
> in the input datastream. A perfectly unrandom datastream, like a constant
> series of logical 1's, carries no information, but has a compression ratio
> that is infinite. A perfectly random datastream, on the other hand,
> also carries no information, but has a compression ratio that is zero.
> I believe that a datastream that is 50% of the way between either extreme
> carries the most information, and I believe your typical datastream is much
> closer to the perfectly unrandom side than the perfectly random side,
> compression is merely the process of pushing the randomness of the stream
> closer to the random side.

Actually, the more information (as such) a data stream
carries, the closer it is to perfectly random.
The relationship might be asymptotic, but I am no maths major.

> Thus, if the input datastream is very close to the perfectly unrandom side -
> meaning it has a very high amount of repetition in it, you can get some
> pretty spectacular compression ratios. But as you move closer to unrandom,
> you carry less data. So, the better applications emit datastreams that
> are less unrandom, therefore compression does not work as well on them.

I suppose this leads to the discussion about what "data"
and "information" really are. Imagine a can. The can is
data. Imagine the can is full of worms.

> This of course is completely ignoring the other data issue, is the
> application data efficient to begin with? For example, you can transfer
> about a page of information in ASCII that consumes about 1K of data, that
> same page of information in a MS Word file consumes a hundred times that
> amount of space - Word is therefore extremely inefficient with data.

In this case, since Word "has to" replace typesetting,
layout, and formatting software, in addition to being a
word processor, the header and meta information tend to
bloat the files quite a lot.

Every few years someone comes along who makes some mad
claims about some new buzzword-enhanced compression
technology. Obviously, if there is ever a radical leap
forward in that area the theory will have to follow, since
modern theory cannot accommodate (lossless) compression past
the point of randomness (generally less than 16:1 even for
Danielle Steel).

mp3, avi, Real Media, mpeg, et al. are a different story
entirely, since they are lossy and optimised for their
respective information.
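
For anyone who wants to poke at the repetition-versus-randomness
point, here is a minimal sketch, assuming Python 3 and its standard
bz2 module. The names ratio, SIZE, constant, text and random_bytes are
made up for illustration; nothing here comes from the thread itself.
It compresses three inputs of about the same size and prints
original size over compressed size for each.

```python
import bz2
import os


def ratio(data: bytes) -> float:
    """Return original size divided by bz2-compressed size."""
    return len(data) / len(bz2.compress(data))


SIZE = 100_000  # arbitrary sample size in bytes

# "Perfectly unrandom": every bit set to 1.
constant = b"\xff" * SIZE

# Highly repetitive text: one short line repeated to roughly SIZE bytes.
line = b"the quick brown fox jumps over the lazy dog\n"
text = line * (SIZE // len(line))

# As close to random as the operating system will hand out.
random_bytes = os.urandom(SIZE)

if __name__ == "__main__":
    samples = [("constant", constant), ("text", text), ("random", random_bytes)]
    for name, data in samples:
        print(f"{name:10s} {ratio(data):10.2f}:1")
```

The exact figures depend on the run and the bzip2 build, so none are
quoted here. What should hold is that the two repetitive inputs
shrink enormously while the random one stays at about 1:1.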

-rw-r--r--  1 1705  1705  7826420 May  9 10:58 ssion_i_really_fuckin_care_about_you.rm
-rw-r--r--  1 1705  1705  7791691 May  9 10:58 ssion_i_really_fuckin_care_about_you.rm.bz2

In this case, very slightly compressible (about 0.4% smaller);
with some data your resulting file will be slightly larger.
Yet the raw datastream (and it looks like it was filmed from
a cameraphone here (though most likely an 8mm digicam (these,
I believe, compress on the fly, so the raw datastream never
touches tape))) would probably have been many tens, if not
several hundreds, of megabytes.

Remember life before the tweel?

--
--