From owner-freebsd-hackers Thu Sep 26 06:48:45 1996
Return-Path: owner-hackers
Received: (from root@localhost)
          by freefall.freebsd.org (8.7.5/8.7.3) id GAA18889
          for hackers-outgoing; Thu, 26 Sep 1996 06:48:45 -0700 (PDT)
Received: from pdx1.world.net (pdx1.world.net [192.243.32.18])
          by freefall.freebsd.org (8.7.5/8.7.3) with ESMTP id GAA18866
          for ; Thu, 26 Sep 1996 06:48:43 -0700 (PDT)
Received: from suburbia.net (suburbia.net [203.4.184.1])
          by pdx1.world.net (8.7.5/8.7.3) with ESMTP id GAA02983;
          Thu, 26 Sep 1996 06:48:28 -0700 (PDT)
Received: (proff@localhost)
          by suburbia.net (8.7.4/Proff-950810) id XAA02455;
          Thu, 26 Sep 1996 23:47:45 +1000
From: Julian Assange
Message-Id: <199609261347.XAA02455@suburbia.net>
Subject: bzip vs gzip
To: meditation@gnu.ai.mit.edu, hackers@freebsd.org
Date: Thu, 26 Sep 1996 23:47:44 +1000 (EST)
X-Mailer: ELM [version 2.4 PL23]
Content-Type: text
Sender: owner-hackers@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

[http://www.cs.man.ac.uk/arch/people/j-seward/index.html]

BZIP compresses the usual 14 files from the Calgary Corpus to an
average of 2.340 bits per byte, which is within about 5% of the best
known results, and considerably better than the more widespread
LZ77/LZ78-based compressors [of which Gzip seems to be amongst the
best].

Memory consumption is controllable, never exceeding 8,100 k for
compression, and 5,400 k for decompression, even for very long files.
You can tell BZIP to use less memory via command-line flags, giving
minimum uses of 1200 k for compression and 600 k for decompression.
This makes it usable on 8 meg and even 4 meg machines; compression is
still better than Gzip.

For some kinds of highly-redundant files, Bzip has been observed to do
strikingly (3 times) better than Gzip.
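
The "bits per byte" figure above is eight times the compressed size
divided by the original size, so 2.340 bits per byte corresponds to
output of roughly 29% of the input size (2.340 / 8). A minimal sketch
of that arithmetic follows. It is an illustration only: the function
names are hypothetical, and the measurements use Python's standard
zlib (DEFLATE, the method Gzip uses) and bz2 (bzip2, a later successor
and not the BZIP 0.21 described here), so the numbers it prints are
not comparable to the Calgary Corpus figure quoted above.

```python
import bz2
import zlib


def bits_per_byte(original_size: int, compressed_size: int) -> float:
    """Average number of output bits spent per input byte."""
    if original_size <= 0:
        raise ValueError("original_size must be positive")
    return 8.0 * compressed_size / original_size


def measure(data: bytes) -> dict[str, float]:
    """Bits per byte for one buffer under two stdlib compressors.

    Assumption: zlib and bz2 stand in for Gzip and BZIP only loosely;
    neither is the exact program discussed in the message.
    """
    return {
        "deflate": bits_per_byte(len(data), len(zlib.compress(data, 9))),
        "bzip2": bits_per_byte(len(data), len(bz2.compress(data, 9))),
    }


if __name__ == "__main__":
    # Hypothetical, highly redundant sample input.
    sample = b"the quick brown fox jumps over the lazy dog\n" * 2000
    for name, bpb in measure(sample).items():
        print(f"{name}: {bpb:.3f} bits per byte")
```

Lower is better; an incompressible input sits at or slightly above 8
bits per byte.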
BZIP is an infinite-context statistical compressor, using preliminary
run-length coding of the input, the Burrows-Wheeler block-sorting
transformation, Fenwick's structured coding model, run-length coding
of zeroes in the MTF codes, and a DCC95-style arithmetic coder.

BZIP is distributed under the GNU General Public License, version 2,
which means you can copy, use and redistribute it freely.

It should run on any 32-bit platform with an ANSI C compiler; I myself
have made successful builds, without modifying the sources, on:
i386/i486-Linux1.2, i386/i486-Linux2.0, i386/i486-Windows95,
Sparc-SunOS4, Sparc-Solaris2, SGI-Irix, HP-HPUX and HP-NetBSD. In
practice BZIP should work without modification on any 32-bit
GNU-supported target. I have also heard that an earlier version runs
ok on Alphas; successful builds are also reported for a Mac Powerbook,
and an Acorn R260 running RISC iX.

BZIP has been heavily tested: the volume of data compressed in the
final validation tests exceeds 1700 megabytes in 41000 files, with the
longest file 425 megabytes long.

This version, 0.21, is completely compatible with the .bz files
created by version 0.15 -- 0.21 differs only in being faster, more
portable and offering the "-c" flag.

--
"Of all tyrannies a tyranny sincerely exercised for the good of its
 victims may be the most oppressive. It may be better to live under
 robber barons than under omnipotent moral busybodies. The robber
 baron's cruelty may sometimes sleep, his cupidity may at some point
 be satiated; but those who torment us for our own good will torment
 us without end, for they do so with the approval of their own
 conscience." - C.S. Lewis, _God in the Dock_
+---------------------+--------------------+----------------------------------+
|Julian Assange RSO   | PO Box 2031 BARKER | Secret Analytic Guy Union        |
|proff@suburbia.net   | VIC 3122 AUSTRALIA | finger for PGP key hash ID =     |
|proff@gnu.ai.mit.edu | FAX +61-3-98199066 | 0619737CCC143F6DEA73E27378933690 |
+---------------------+--------------------+----------------------------------+
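
Two of the stages named in the announcement, the Burrows-Wheeler
block-sorting transformation and the move-to-front (MTF) coding whose
zeroes are then run-length coded, can be sketched as follows. This is
a naive teaching version, not the BZIP source: it sorts whole
rotations (far too slow for real blocks), it leaves out the initial
run-length coding, Fenwick's structured model and the arithmetic
coder, and every function name is hypothetical.

```python
def bwt(data: bytes) -> tuple[bytes, int]:
    """Naive Burrows-Wheeler transform.

    Sorts all rotations of `data` and returns the last column of the
    sorted matrix plus the row index where the original string sits.
    """
    n = len(data)
    if n == 0:
        return b"", 0
    rotations = sorted(range(n), key=lambda i: data[i:] + data[:i])
    last = bytes(data[(i - 1) % n] for i in rotations)
    return last, rotations.index(0)


def inverse_bwt(last: bytes, primary: int) -> bytes:
    """Undo bwt() given the last column and the original row index."""
    n = len(last)
    if n == 0:
        return b""
    # A stable sort of positions by byte value maps each row of the
    # sorted matrix to the row holding the same rotation shifted left.
    order = sorted(range(n), key=lambda i: last[i])
    out = bytearray()
    row = primary
    for _ in range(n):
        row = order[row]
        out.append(last[row])
    return bytes(out)


def mtf_encode(data: bytes) -> list[int]:
    """Move-to-front: emit each byte's current position in a list of
    recently seen bytes, then move that byte to the front."""
    table = list(range(256))
    codes = []
    for b in data:
        idx = table.index(b)
        codes.append(idx)
        table.insert(0, table.pop(idx))
    return codes


def mtf_decode(codes: list[int]) -> bytes:
    """Inverse of mtf_encode()."""
    table = list(range(256))
    out = bytearray()
    for idx in codes:
        b = table.pop(idx)
        out.append(b)
        table.insert(0, b)
    return bytes(out)


if __name__ == "__main__":
    last, primary = bwt(b"banana")
    codes = mtf_encode(last)
    print(last, primary, codes)
    assert inverse_bwt(mtf_decode(codes), primary) == b"banana"
```

Block sorting tends to group equal bytes together, so the MTF output
contains many zeroes; that is why a later stage that run-length codes
zeroes pays off.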