Date: Wed, 18 Oct 2000 01:57:26 +0200 (CEST)
From: Cyrille Lefevre <clefevre@citeweb.net>
To: FreeBSD-gnats-submit@freebsd.org
Cc: obrien@freebsd.org
Subject: ports/22067: Updated port: archivers/bzip2 - 1.0.1
Message-ID: <200010172357.e9HNvQr59131@gits.dyndns.org>
>Number: 22067
>Category: ports
>Synopsis: Updated port: archivers/bzip2 - 1.0.1
>Confidential: no
>Severity: non-critical
>Priority: low
>Responsible: freebsd-ports
>State: open
>Quarter:
>Keywords:
>Date-Required:
>Class: change-request
>Submitter-Id: current-users
>Arrival-Date: Tue Oct 17 17:00:01 PDT 2000
>Closed-Date:
>Last-Modified:
>Originator: Cyrille Lefevre
>Release: FreeBSD 4.1-STABLE i386
>Organization: ACME
>Environment:
FreeBSD gits 4.1-STABLE FreeBSD 4.1-STABLE #3: Sat Sep 23 10:20:30 CEST 2000 root@gits:/disk2/4.0-stable/src/sys/compile/CUSTOM i386
>Description:
The texinfo file is updated so that the info documentation can be generated. I'll send the texinfo patch to the maintainer for further releases.

Makefile: build dependencies on texi2html and teTeX added. Symlinks changed from absolute to relative. Installation of info files added.
pkg-plist: symlinked files deleted. manual.texi deleted. info files added.
files/patch-aa: (aka Makefile) updated to generate info, html and ps files.
files/patch-manual.texi: new file.
>How-To-Repeat: n/a >Fix: diff -BurN -x CVS bzip2.old/Makefile bzip2/Makefile --- bzip2.old/Makefile Fri Jun 30 18:44:43 2000 +++ bzip2/Makefile Wed Oct 18 01:34:07 2000 @@ -3,7 +3,7 @@ # Date created: 19 Nov 1997 # Whom: Thomas Gellekum <tg@FreeBSD.org> # -# $FreeBSD: ports/archivers/bzip2/Makefile,v 1.31 2000/06/30 16:44:43 obrien Exp $ +# $FreeBSD$ # PORTNAME= bzip2 @@ -15,29 +15,43 @@ MAINTAINER= obrien@FreeBSD.org +.if !defined(NOPORTDOCS) +BUILD_DEPENDS= texi2html:${PORTSDIR}/textproc/texi2html \ + tex:${PORTSDIR}/print/teTeX +.endif + FETCH_BEFORE_ARGS= -b INSTALLS_SHLIB= yes +.if !defined(NOPORTDOCS) +ALL_TARGET= all all-doc +.endif + MAN1= bzip2.1 MLINKS= bzip2.1 bunzip2.1 bzip2.1 bzcat.1 bzip2.1 bz2cat.1 \ bzip2.1 bzip2recover.1 do-install: ${INSTALL_PROGRAM} ${WRKSRC}/bzip2 ${PREFIX}/bin - ${LN} -sf ${PREFIX}/bin/bzip2 ${PREFIX}/bin/bunzip2 - ${LN} -sf ${PREFIX}/bin/bzip2 ${PREFIX}/bin/bzcat - ${LN} -sf ${PREFIX}/bin/bzip2 ${PREFIX}/bin/bz2cat + ${LN} -sf bzip2 ${PREFIX}/bin/bunzip2 + ${LN} -sf bzip2 ${PREFIX}/bin/bzcat + ${LN} -sf bzip2 ${PREFIX}/bin/bz2cat ${INSTALL_PROGRAM} ${WRKSRC}/bzip2recover ${PREFIX}/bin ${INSTALL_DATA} ${WRKSRC}/bzlib.h ${PREFIX}/include ${INSTALL_DATA} ${WRKSRC}/libbz2.a ${PREFIX}/lib ${INSTALL_DATA} ${WRKSRC}/libbz2.so.1 ${PREFIX}/lib - ${LN} -sf ${PREFIX}/lib/libbz2.so.1 ${PREFIX}/lib/libbz2.so + ${LN} -sf libbz2.so.1 ${PREFIX}/lib/libbz2.so ${INSTALL_MAN} ${WRKSRC}/bzip2.1 ${PREFIX}/man/man1 + ${INSTALL_DATA} ${WRKSRC}/bzip2.info* ${PREFIX}/info .if !defined(NOPORTDOCS) ${MKDIR} ${PREFIX}/share/doc/bzip2 - ${INSTALL_DATA} ${WRKSRC}/manual* ${PREFIX}/share/doc/bzip2 + ${INSTALL_DATA} ${WRKSRC}/manual*.html ${PREFIX}/share/doc/bzip2 + ${INSTALL_DATA} ${WRKSRC}/manual.ps ${PREFIX}/share/doc/bzip2 ${PREFIX}/bin/bzip2 -f ${PREFIX}/share/doc/bzip2/manual.ps .endif + +post-install: + install-info ${PREFIX}/info/bzip2.info ${PREFIX}/info/dir .include <bsd.port.mk> diff -BurN -x CVS bzip2.old/files/patch-aa bzip2/files/patch-aa --- 
bzip2.old/files/patch-aa Thu Jun 8 16:47:17 2000 +++ bzip2/files/patch-aa Wed Oct 18 00:58:59 2000 @@ -1,5 +1,5 @@ ---- Makefile.orig Wed May 17 00:31:04 2000 -+++ Makefile Thu Jun 8 17:41:26 2000 +--- Makefile.orig Fri Jun 23 22:34:47 2000 ++++ Makefile Wed Oct 18 00:54:45 2000 @@ -1,8 +1,10 @@ SHELL=/bin/sh @@ -13,7 +13,7 @@ OBJS= blocksort.o \ huffman.o \ -@@ -12,10 +14,18 @@ +@@ -12,10 +14,19 @@ decompress.o \ bzlib.o @@ -28,14 +28,15 @@ -bzip2: libbz2.a bzip2.o - $(CC) $(CFLAGS) -o bzip2 bzip2.o -L. -lbz2 -+all: libbz2.so.1 libbz2.a bzip2 bzip2recover test ++all: libbz2.so.1 libbz2.a bzip2 bzip2recover test bzip2.info ++all-doc: manual_toc.html manual.ps + +bzip2: libbz2.so.1 libbz2.a bzip2.o + $(CC) $(CFLAGS) -o bzip2 bzip2.o libbz2.a bzip2recover: bzip2recover.o $(CC) $(CFLAGS) -o bzip2recover bzip2recover.o -@@ -29,6 +39,10 @@ +@@ -29,6 +40,26 @@ ranlib libbz2.a ; \ fi @@ -43,10 +44,26 @@ + $(CC) -shared -Wl,-soname -Wl,libbz2.so.1 -o libbz2.so.1 $(SO_OBJS) + ln -sf libbz2.so.1 libbz2.so + ++bzip2.info: manual.texi ++ -@makeinfo --force manual.texi ++ ++manual_toc.html: manual.texi ++ -@texi2html -split_chapter manual.texi ++ ++manual.ps: manual.texi ++ -@tex \\nonstopmode \\input manual.texi ++ -@texindex manual.cp manual.fn manual.ky manual.tp manual.vr ++ -@tex \\nonstopmode \\input manual.texi ++ -@rm -f manual.aux manual.fn manual.kys manual.toc manual.vr \ ++ manual.cp manual.fns manual.log manual.tp manual.vrs \ ++ manual.cps manual.ky manual.pg manual.tps ++ -@dvips -o manual.ps manual.dvi ++ -@rm -f manual.dvi ++ test: bzip2 @cat words1 ./bzip2 -1 < sample1.ref > sample1.rb2 -@@ -69,12 +83,27 @@ +@@ -69,12 +100,27 @@ chmod a+r $(PREFIX)/lib/libbz2.a clean: diff -BurN -x CVS bzip2.old/files/patch-manual.texi bzip2/files/patch-manual.texi --- bzip2.old/files/patch-manual.texi Thu Jan 1 01:00:00 1970 +++ bzip2/files/patch-manual.texi Wed Oct 18 00:59:07 2000 @@ -0,0 +1,1811 @@ +--- manual.texi.orig Fri Mar 24 00:43:33 2000 ++++ manual.texi Wed Oct 18 
00:00:37 2000 +@@ -3,9 +3,9 @@ + + @ignore + This file documents bzip2 version 1.0, and associated library +-libbzip2, written by Julian Seward (jseward@acm.org). ++libbzip2, written by Julian R. Seward @email{jseward@acm.org}. + +-Copyright (C) 1996-2000 Julian R Seward ++Copyright (C) 1996-2000 Julian R. Seward + + Permission is granted to make and distribute verbatim copies of + this manual provided the copyright notice and this permission notice +@@ -17,6 +17,7 @@ + + @ifinfo + @format ++INFO-DIR-SECTION Compressing tools + START-INFO-DIR-ENTRY + * Bzip2: (bzip2). A program and library for data compression. + END-INFO-DIR-ENTRY +@@ -30,7 +31,7 @@ + @titlepage + @title bzip2 and libbzip2 + @subtitle a program and library for data compression +-@subtitle copyright (C) 1996-2000 Julian Seward ++@subtitle copyright (C) 1996-2000 Julian R. Seward + @subtitle version 1.0 of 21 March 2000 + @author Julian Seward + +@@ -40,15 +41,115 @@ + @parskip 2mm + + @end iftex +-@node Top, Overview, (dir), (dir) ++@ifnottex ++@node Top, Copyright, (dir), (dir) ++@top Bzip2 ++ ++@code{bzip2} compresses files using the Burrows-Wheeler block ++sorting text compression algorithm, and Huffman coding. Compression ++is generally considerably better than that achieved by more ++conventional LZ77/LZ78-based compressors, and approaches the ++performance of the PPM family of statistical compressors. 
++ ++@end ifnottex ++ ++@menu ++ ++* Copyright:: ++* Overview:: ++* How To Use Bzip2:: ++* Programming With Libbzip2:: ++* Miscellaneous:: ++* Concept Index:: ++* Option Index:: ++* Variable Index:: ++* Function Index:: ++ ++ ++ --- The Detailed Node Listing --- ++ ++How To Use Bzip2 ++ ++* Name:: ++* Synopsis:: ++* Description:: ++* Options:: ++* Memory Management:: ++* Recovering Data From Damaged Files:: ++* Performance Notes:: ++* Caveats:: ++* Author:: ++ ++Programming With Libbzip2 ++ ++* Top-Level Structure:: ++* Error Handling:: ++* Low-Level Interface:: ++* High-Level Interface:: ++* Utility Functions:: ++* Zlib Compatibility Functions:: ++* Using The Library In A Stdio-Free Environment:: ++* Making A Windows DLL:: ++ ++Top-Level Structure ++ ++* Low-Level Summary:: ++* High-Level Summary:: ++* Utility Functions Summary:: ++ ++Low-Level Interface ++ ++* BZ2_bzCompressInit:: ++* BZ2_bzCompress:: ++* BZ2_bzCompressEnd:: ++* BZ2_bzDecompressInit:: ++* BZ2_bzDecompress:: ++* BZ2_bzDecompressEnd:: ++ ++High-Level Interface ++ ++* BZ2_bzReadOpen:: ++* BZ2_bzRead:: ++* BZ2_bzReadGetUnused:: ++* BZ2_bzReadClose:: ++* BZ2_bzWriteOpen:: ++* BZ2_bzWrite:: ++* BZ2_bzWriteClose:: ++* Handling Embedded Compressed Data Streams:: ++* Standard File-Reading/Writing Code:: ++ ++Utility Functions ++ ++* BZ2_bzBuffToBuffCompress:: ++* BZ2_bzBuffToBuffDecompress:: ++ ++Using The Library In A Stdio-Free Environment ++ ++* Getting Rid Of Stdio:: ++* Critical Error Handling:: ++ ++Miscellaneous ++ ++* Limitations Of The Compressed File Format:: ++* Portability Issues:: ++* Reporting Bugs:: ++* Did You Get The Right Package:: ++* Testing:: ++* Further Reading:: ++ ++@end menu ++ ++@node Copyright, Overview, Top, Top ++@unnumbered Copyright + + This program, @code{bzip2}, + and associated library @code{libbzip2}, are +-Copyright (C) 1996-2000 Julian R Seward. All rights reserved. ++Copyright (C) 1996-2000 Julian R. Seward. All rights reserved. 
+ + Redistribution and use in source and binary forms, with or without + modification, are permitted provided that the following conditions + are met: ++ + @itemize @bullet + @item + Redistributions of source code must retain the above copyright +@@ -66,6 +167,7 @@ + products derived from this software without specific prior written + permission. + @end itemize ++ + THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS + OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED + WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +@@ -78,15 +180,15 @@ + NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS + SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +-Julian Seward, Cambridge, UK. ++Julian R. Seward, Cambridge, UK. + +-@code{jseward@@acm.org} ++@email{jseward@@acm.org} + +-@code{http://sourceware.cygnus.com/bzip2} ++@uref{http://sourceware.cygnus.com/bzip2} + +-@code{http://www.cacheprof.org} ++@uref{http://www.cacheprof.org} + +-@code{http://www.muraroa.demon.co.uk} ++@uref{http://www.muraroa.demon.co.uk} + + @code{bzip2}/@code{libbzip2} version 1.0 of 21 March 2000. + +@@ -101,9 +203,14 @@ + + + +-@node Overview, Implementation, Top, Top +-@chapter Introduction ++@node Overview, How To Use Bzip2, Copyright, Top ++@chapter Overview + ++@cindex Burrows-Wheeler ++@cindex Huffman ++@cindex LZ77 ++@cindex LZ78 ++@cindex PPM + @code{bzip2} compresses files using the Burrows-Wheeler + block-sorting text compression algorithm, and Huffman coding. + Compression is generally considerably better than that +@@ -119,19 +226,34 @@ + + Chapter 2 describes how to use @code{bzip2}; this is the only part + you need to read if you just want to know how to operate the program. ++ + Chapter 3 describes the programming interfaces in detail, and ++ + Chapter 4 records some miscellaneous notes which I thought + ought to be recorded somewhere. 
+ + ++@node How To Use Bzip2, Name, Overview, Top + @chapter How to use @code{bzip2} + + This chapter contains a copy of the @code{bzip2} man page, + and nothing else. + +-@quotation ++@menu ++* Name:: ++* Synopsis:: ++* Description:: ++* Options:: ++* Memory Management:: ++* Recovering Data From Damaged Files:: ++* Performance Notes:: ++* Caveats:: ++* Author:: ++@end menu + +-@unnumberedsubsubsec NAME ++@node Name, Synopsis, How To Use Bzip2, How To Use Bzip2 ++@unnumberedsec NAME ++@quotation + @itemize + @item @code{bzip2}, @code{bunzip2} + - a block-sorting file compressor, v1.0 +@@ -140,17 +262,24 @@ + @item @code{bzip2recover} + - recovers data from damaged bzip2 files + @end itemize ++@end quotation + +-@unnumberedsubsubsec SYNOPSIS ++@node Synopsis, Description, Name, How To Use Bzip2 ++@unnumberedsec SYNOPSIS ++@quotation + @itemize + @item @code{bzip2} [ -cdfkqstvzVL123456789 ] [ filenames ... ] + @item @code{bunzip2} [ -fkvsVL ] [ filenames ... ] + @item @code{bzcat} [ -s ] [ filenames ... ] + @item @code{bzip2recover} filename + @end itemize ++@end quotation + +-@unnumberedsubsubsec DESCRIPTION ++@node Description, Options, Synopsis, How To Use Bzip2 ++@unnumberedsec DESCRIPTION ++@quotation + ++@cindex bzip2 + @code{bzip2} compresses files using the Burrows-Wheeler block sorting + text compression algorithm, and Huffman coding. Compression is + generally considerably better than that achieved by more conventional +@@ -178,11 +307,13 @@ + write compressed output to a terminal, as this would be entirely + incomprehensible and therefore pointless. + ++@cindex bunzip2 + @code{bunzip2} (or @code{bzip2 -d}) decompresses all + specified files. Files which were not created by @code{bzip2} + will be detected and ignored, and a warning issued. 
+ @code{bzip2} attempts to guess the filename for the decompressed file + from that of the compressed file as follows: ++ + @itemize + @item @code{filename.bz2 } becomes @code{filename} + @item @code{filename.bz } becomes @code{filename} +@@ -190,6 +321,7 @@ + @item @code{filename.tbz } becomes @code{filename.tar} + @item @code{anyothername } becomes @code{anyothername.out} + @end itemize ++ + If the file does not end in one of the recognised endings, + @code{.bz2}, @code{.bz}, + @code{.tbz2} or @code{.tbz}, @code{bzip2} complains that it cannot +@@ -213,6 +345,7 @@ + later. Earlier versions of @code{bzip2} will stop after decompressing + the first file in the stream. + ++@cindex bzcat + @code{bzcat} (or @code{bzip2 -dc}) decompresses all specified files to + the standard output. + +@@ -243,31 +376,54 @@ + not found, invalid flags, I/O errors, &c), 2 to indicate a corrupt + compressed file, 3 for an internal consistency error (eg, bug) which + caused @code{bzip2} to panic. ++@end quotation + +- +-@unnumberedsubsubsec OPTIONS ++@node Options, Memory Management, Description, How To Use Bzip2 ++@unnumberedsec OPTIONS ++@quotation + @table @code +-@item -c --stdout ++@item -c ++@itemx --stdout ++@kindex -c ++@kindex --stdout + Compress or decompress to standard output. +-@item -d --decompress ++@item -d ++@itemx --decompress ++@kindex -d ++@kindex --decompress + Force decompression. @code{bzip2}, @code{bunzip2} and @code{bzcat} are + really the same program, and the decision about what actions to take is + done on the basis of which name is used. This flag overrides that + mechanism, and forces bzip2 to decompress. +-@item -z --compress ++@item -z ++@itemx --compress ++@kindex -z ++@kindex --compress + The complement to @code{-d}: forces compression, regardless of the + invokation name. +-@item -t --test ++@item -t ++@itemx --test ++@kindex -t ++@kindex --test + Check integrity of the specified file(s), but don't decompress them. 
+ This really performs a trial decompression and throws away the result. +-@item -f --force ++@item -f ++@itemx --force ++@kindex -f ++@kindex --force + Force overwrite of output files. Normally, @code{bzip2} will not overwrite + existing output files. Also forces @code{bzip2} to break hard links + to files, which it otherwise wouldn't do. +-@item -k --keep ++@item -k ++@itemx --keep ++@kindex -k ++@kindex --keep + Keep (don't delete) input files during compression + or decompression. +-@item -s --small ++@item -s ++@itemx --small ++@kindex -s ++@kindex --small + Reduce memory usage, for compression, decompression and testing. Files + are decompressed and tested using a modified algorithm which only + requires 2.5 bytes per block byte. This means any file can be +@@ -276,33 +432,60 @@ + During compression, @code{-s} selects a block size of 200k, which limits + memory use to around the same figure, at the expense of your compression + ratio. In short, if your machine is low on memory (8 megabytes or +-less), use -s for everything. See MEMORY MANAGEMENT below. +-@item -q --quiet ++less), use @code{-s} for everything. @xref{Memory Management}, below. ++@item -q ++@itemx --quiet ++@kindex -q ++@kindex --quiet + Suppress non-essential warning messages. Messages pertaining to + I/O errors and other critical events will not be suppressed. +-@item -v --verbose ++@item -v ++@itemx --verbose ++@kindex -v ++@kindex --verbose + Verbose mode -- show the compression ratio for each file processed. + Further @code{-v}'s increase the verbosity level, spewing out lots of + information which is primarily of interest for diagnostic purposes. +-@item -L --license -V --version ++@item -L ++@itemx --license ++@itemx -V ++@itemx --version ++@kindex -L ++@kindex --license ++@kindex -V ++@kindex --version + Display the software version, license terms and conditions. 
+ @item -1 to -9 ++@kindex -1 ++@kindex -2 ++@kindex -3 ++@kindex -4 ++@kindex -5 ++@kindex -6 ++@kindex -7 ++@kindex -8 ++@kindex -9 + Set the block size to 100 k, 200 k .. 900 k when compressing. Has no +-effect when decompressing. See MEMORY MANAGEMENT below. ++effect when decompressing. @xref{Memory Management}, below. + @item -- ++@kindex -- + Treats all subsequent arguments as file names, even if they start + with a dash. This is so you can handle files with names beginning + with a dash, for example: @code{bzip2 -- -myfilename}. + @item --repetitive-fast +-@item --repetitive-best ++@itemx --repetitive-best ++@kindex --repetitive-fast ++@kindex --repetitive-best + These flags are redundant in versions 0.9.5 and above. They provided + some coarse control over the behaviour of the sorting algorithm in + earlier versions, which was sometimes useful. 0.9.5 and above have an + improved algorithm which renders these flags irrelevant. + @end table ++@end quotation + +- +-@unnumberedsubsubsec MEMORY MANAGEMENT ++@node Memory Management, Recovering Data From Damaged Files, Options, How To Use Bzip2 ++@unnumberedsec MEMORY MANAGEMENT ++@quotation + + @code{bzip2} compresses large files in blocks. The block size affects + both the compression ratio achieved, and the amount of memory needed for +@@ -317,12 +500,14 @@ + + Compression and decompression requirements, in bytes, can be estimated + as: ++ + @example + Compression: 400k + ( 8 x block size ) + + Decompression: 100k + ( 4 x block size ), or + 100k + ( 2.5 x block size ) + @end example ++ + Larger block sizes give rapidly diminishing marginal returns. Most of + the compression comes from the first two or three hundred k of block + size, a fact worth bearing in mind when using @code{bzip2} on small machines. +@@ -355,6 +540,7 @@ + column gives some feel for how compression varies with block size. 
+ These figures tend to understate the advantage of larger block sizes for + larger files, since the Corpus is dominated by smaller files. ++ + @example + Compress Decompress Decompress Corpus + Flag usage usage -s usage Size +@@ -369,8 +555,11 @@ + -8 6800k 3300k 2100k 828642 + -9 7600k 3700k 2350k 828642 + @end example ++@end quotation + +-@unnumberedsubsubsec RECOVERING DATA FROM DAMAGED FILES ++@node Recovering Data From Damaged Files, Performance Notes, Memory Management, How To Use Bzip2 ++@unnumberedsec RECOVERING DATA FROM DAMAGED FILES ++@quotation + + @code{bzip2} compresses files in blocks, usually 900kbytes long. Each + block is handled independently. If a media or transmission error causes +@@ -382,6 +571,7 @@ + reasonable certainty. Each block also carries its own 32-bit CRC, so + damaged blocks can be distinguished from undamaged ones. + ++@cindex bzip2recover + @code{bzip2recover} is a simple program whose purpose is to search for + blocks in @code{.bz2} files, and write each block out into its own + @code{.bz2} file. You can then use @code{bzip2 -t} to test the +@@ -391,22 +581,24 @@ + @code{bzip2recover} + takes a single argument, the name of the damaged file, + and writes a number of files @code{rec0001file.bz2}, +- @code{rec0002file.bz2}, etc, containing the extracted blocks. +- The output filenames are designed so that the use of +- wildcards in subsequent processing -- for example, ++@code{rec0002file.bz2}, etc, containing the extracted blocks. ++The output filenames are designed so that the use of ++wildcards in subsequent processing -- for example, + @code{bzip2 -dc rec*file.bz2 > recovered_data} -- lists the files in +- the correct order. ++the correct order. + + @code{bzip2recover} should be of most use dealing with large @code{.bz2} +- files, as these will contain many blocks. It is clearly +- futile to use it on damaged single-block files, since a +- damaged block cannot be recovered. 
If you wish to minimise ++files, as these will contain many blocks. It is clearly ++futile to use it on damaged single-block files, since a ++damaged block cannot be recovered. If you wish to minimise + any potential data loss through media or transmission errors, + you might consider compressing with a smaller + block size. ++@end quotation + +- +-@unnumberedsubsubsec PERFORMANCE NOTES ++@node Performance Notes, Caveats, Recovering Data From Damaged Files, How To Use Bzip2 ++@unnumberedsec PERFORMANCE NOTES ++@quotation + + The sorting phase of compression gathers together similar strings in the + file. Because of this, files containing very long runs of repeated +@@ -427,9 +619,11 @@ + been observed to give disproportionately large performance improvements. + I imagine @code{bzip2} will perform best on machines with very large + caches. ++@end quotation + +- +-@unnumberedsubsubsec CAVEATS ++@node Caveats, Author, Performance Notes, How To Use Bzip2 ++@unnumberedsec CAVEATS ++@quotation + + I/O error messages are not as helpful as they could be. @code{bzip2} + tries hard to detect I/O errors and exit cleanly, but the details of +@@ -446,13 +640,17 @@ + @code{bzip2recover} uses 32-bit integers to represent bit positions in + compressed files, so it cannot handle compressed files more than 512 + megabytes long. This could easily be fixed. ++@end quotation + +- +-@unnumberedsubsubsec AUTHOR +-Julian Seward, @code{jseward@@acm.org}. ++@node Author, Programming With Libbzip2, Caveats, How To Use Bzip2 ++@unnumberedsec AUTHOR ++@quotation ++Julian R. Seward, @email{jseward@@acm.org}. 
+ + The ideas embodied in @code{bzip2} are due to (at least) the following +-people: Michael Burrows and David Wheeler (for the block sorting ++people: ++ ++Michael Burrows and David Wheeler (for the block sorting + transformation), David Wheeler (again, for the Huffman coder), Peter + Fenwick (for the structured coding model in the original @code{bzip}, + and many refinements), and Alistair Moffat, Radford Neal and Ian Witten +@@ -470,16 +668,32 @@ + + + ++@node Programming With Libbzip2, Top-Level Structure, Author, Top + @chapter Programming with @code{libbzip2} + + This chapter describes the programming interface to @code{libbzip2}. + + For general background information, particularly about memory +-use and performance aspects, you'd be well advised to read Chapter 2 ++use and performance aspects, you'd be well advised to read ++@xref{How To Use Bzip2}, + as well. + ++@menu ++* Top-Level Structure:: ++* Error Handling:: ++* Low-Level Interface:: ++* High-Level Interface:: ++* Utility Functions:: ++* Zlib Compatibility Functions:: ++* Using The Library In A Stdio-Free Environment:: ++* Making A Windows DLL:: ++@end menu ++ ++ ++@node Top-Level Structure, Low-Level Summary, Programming With Libbzip2, Programming With Libbzip2 + @section Top-level structure + ++@cindex libbzip2 + @code{libbzip2} is a flexible library for compressing and decompressing + data in the @code{bzip2} data format. Although packaged as a single + entity, it helps to regard the library as three separate parts: the low +@@ -490,10 +704,18 @@ + that of Jean-loup Gailly's and Mark Adler's excellent @code{zlib} + library. + ++@cindex BZ2_ + All externally visible symbols have names beginning @code{BZ2_}. + This is new in version 1.0. The intention is to minimise pollution + of the namespaces of library clients. 
+ ++@menu ++* Low-Level Summary:: ++* High-Level Summary:: ++* Utility Functions Summary:: ++@end menu ++ ++@node Low-Level Summary, High-Level Summary, Top-Level Structure, Top-Level Structure + @subsection Low-level summary + + This interface provides services for compressing and decompressing +@@ -506,13 +728,15 @@ + is therefore thread-safe. + + Six routines make up the low level interface: +-@code{BZ2_bzCompressInit}, @code{BZ2_bzCompress}, and @* @code{BZ2_bzCompressEnd} ++ ++@code{BZ2_bzCompressInit}, @code{BZ2_bzCompress}, and @code{BZ2_bzCompressEnd} + for compression, +-and a corresponding trio @code{BZ2_bzDecompressInit}, @* @code{BZ2_bzDecompress} ++and a corresponding trio @code{BZ2_bzDecompressInit}, @code{BZ2_bzDecompress} + and @code{BZ2_bzDecompressEnd} for decompression. ++ + The @code{*Init} functions allocate + memory for compression/decompression and do other +-initialisations, whilst the @code{*End} functions close down operations ++initialisations, while the @code{*End} functions close down operations + and release memory. + + The real work is done by @code{BZ2_bzCompress} and @code{BZ2_bzDecompress}. +@@ -525,6 +749,7 @@ + + + ++@node High-Level Summary, Utility Functions Summary, Low-Level Summary, Top-Level Structure + @subsection High-level summary + + This interface provides some handy wrappers around the low-level +@@ -535,7 +760,7 @@ + multiple @code{bzip2} data streams concatenated end-to-end. + + For reading files, @code{BZ2_bzReadOpen}, @code{BZ2_bzRead}, +-@code{BZ2_bzReadClose} and @* @code{BZ2_bzReadGetUnused} are supplied. For ++@code{BZ2_bzReadClose} and @code{BZ2_bzReadGetUnused} are supplied. For + writing files, @code{BZ2_bzWriteOpen}, @code{BZ2_bzWrite} and + @code{BZ2_bzWriteFinish} are available. 
+ +@@ -555,6 +780,7 @@ + + + ++@node Utility Functions Summary, Error Handling, High-Level Summary, Top-Level Structure + @subsection Utility functions summary + For very simple needs, @code{BZ2_bzBuffToBuffCompress} and + @code{BZ2_bzBuffToBuffDecompress} are provided. These compress +@@ -564,8 +790,8 @@ + requirements before investing effort in understanding the more + general but more complex low-level interface. + +-Yoshioka Tsuneo (@code{QWF00133@@niftyserve.or.jp} / +-@code{tsuneo-y@@is.aist-nara.ac.jp}) has contributed some functions to ++Yoshioka Tsuneo (@email{QWF00133@@niftyserve.or.jp} / ++@email{tsuneo-y@@is.aist-nara.ac.jp}) has contributed some functions to + give better @code{zlib} compatibility. These functions are + @code{BZ2_bzopen}, @code{BZ2_bzread}, @code{BZ2_bzwrite}, @code{BZ2_bzflush}, + @code{BZ2_bzclose}, +@@ -580,6 +806,7 @@ + built as a Windows DLL. + + ++@node Error Handling, Low-Level Interface, Utility Functions Summary, Programming With Libbzip2 + @section Error handling + + The library is designed to recover cleanly in all situations, including +@@ -599,6 +826,8 @@ + out of range reads or writes. So it's certainly much improved, + although I wouldn't claim it to be totally bombproof. + ++@cindex bzlib.h ++@cindex bzlib_private.h + The file @code{bzlib.h} contains all definitions needed to use + the library. In particular, you should definitely not include + @code{bzlib_private.h}. +@@ -609,22 +838,30 @@ + later. Rather, it is intended to convey the rough meaning of each + return value. The first five actions are normal and not intended to + denote an error situation. ++ + @table @code + @item BZ_OK ++@vindex BZ_OK + The requested action was completed successfully. + @item BZ_RUN_OK + @itemx BZ_FLUSH_OK + @itemx BZ_FINISH_OK ++@vindex BZ_RUN_OK ++@vindex BZ_FLUSH_OK ++@vindex BZ_FINISH_OK + In @code{BZ2_bzCompress}, the requested flush/finish/nothing-special action + was completed successfully. 
+ @item BZ_STREAM_END ++@vindex BZ_STREAM_END + Compression of data was completed, or the logical stream end was + detected during decompression. + @end table + + The following return values indicate an error of some kind. ++ + @table @code + @item BZ_CONFIG_ERROR ++@vindex BZ_CONFIG_ERROR + Indicates that the library has been improperly compiled on your + platform -- a major configuration error. Specifically, it means + that @code{sizeof(char)}, @code{sizeof(short)} and @code{sizeof(int)} +@@ -635,6 +872,7 @@ + still 4, so @code{libbzip2}, which doesn't use the @code{long} type, + is OK. + @item BZ_SEQUENCE_ERROR ++@vindex BZ_SEQUENCE_ERROR + When using the library, it is important to call the functions in the + correct sequence and with data structures (buffers etc) in the correct + states. @code{libbzip2} checks as much as it can to ensure this is +@@ -643,12 +881,14 @@ + should never receive this value; such an event denotes buggy code + which you should investigate. + @item BZ_PARAM_ERROR ++@vindex BZ_PARAM_ERROR + Returned when a parameter to a function call is out of range + or otherwise manifestly incorrect. As with @code{BZ_SEQUENCE_ERROR}, + this denotes a bug in the client code. The distinction between + @code{BZ_PARAM_ERROR} and @code{BZ_SEQUENCE_ERROR} is a bit hazy, but still worth + making. + @item BZ_MEM_ERROR ++@vindex BZ_MEM_ERROR + Returned when a request to allocate memory failed. Note that the + quantity of memory needed to decompress a stream cannot be determined + until the stream's header has been read. So @code{BZ2_bzDecompress} and +@@ -657,15 +897,18 @@ + compression; once @code{BZ2_bzCompressInit} or @code{BZ2_bzWriteOpen} have + successfully completed, @code{BZ_MEM_ERROR} cannot occur. + @item BZ_DATA_ERROR ++@vindex BZ_DATA_ERROR + Returned when a data integrity error is detected during decompression. + Most importantly, this means when stored and computed CRCs for the + data do not match. 
This value is also returned upon detection of any + other anomaly in the compressed data. + @item BZ_DATA_ERROR_MAGIC ++@vindex BZ_DATA_ERROR_MAGIC + As a special case of @code{BZ_DATA_ERROR}, it is sometimes useful to + know when the compressed stream does not start with the correct + magic bytes (@code{'B' 'Z' 'h'}). + @item BZ_IO_ERROR ++@vindex BZ_IO_ERROR + Returned by @code{BZ2_bzRead} and @code{BZ2_bzWrite} when there is an error + reading or writing in the compressed file, and by @code{BZ2_bzReadOpen} + and @code{BZ2_bzWriteOpen} for attempts to use a file for which the +@@ -674,9 +917,11 @@ + @code{errno} and/or @code{perror} to acquire operating-system + specific information about the problem. + @item BZ_UNEXPECTED_EOF ++@vindex BZ_UNEXPECTED_EOF + Returned by @code{BZ2_bzRead} when the compressed file finishes + before the logical end of stream is detected. + @item BZ_OUTBUFF_FULL ++@vindex BZ_OUTBUFF_FULL + Returned by @code{BZ2_bzBuffToBuffCompress} and + @code{BZ2_bzBuffToBuffDecompress} to indicate that the output data + will not fit into the output buffer provided. +@@ -684,9 +929,22 @@ + + + ++@node Low-Level Interface, BZ2_bzCompressInit, Error Handling, Programming With Libbzip2 + @section Low-level interface + ++@menu ++* BZ2_bzCompressInit:: ++* BZ2_bzCompress:: ++* BZ2_bzCompressEnd:: ++* BZ2_bzDecompressInit:: ++* BZ2_bzDecompress:: ++* BZ2_bzDecompressEnd:: ++@end menu ++ ++@node BZ2_bzCompressInit, BZ2_bzCompress, Low-Level Interface, Low-Level Interface + @subsection @code{BZ2_bzCompressInit} ++@tindex bz_stream ++@findex BZ2_bzCompressInit + @example + typedef + struct @{ +@@ -794,6 +1052,7 @@ + mechanism would render the parameter obsolete. 
+ + Possible return values: ++ + @display + @code{BZ_CONFIG_ERROR} + if the library has been mis-compiled +@@ -807,17 +1066,22 @@ + @code{BZ_OK} + otherwise + @end display ++ + Allowable next actions: ++ + @display + @code{BZ2_bzCompress} + if @code{BZ_OK} is returned + no specific action needed in case of error + @end display + ++@node BZ2_bzCompress, BZ2_bzCompressEnd, BZ2_bzCompressInit, Low-Level Interface + @subsection @code{BZ2_bzCompress} ++@findex BZ2_bzCompress + @example + int BZ2_bzCompress ( bz_stream *strm, int action ); + @end example ++ + Provides more input and/or output buffer space for the library. The + caller maintains input and output buffers, and calls @code{BZ2_bzCompress} to + transfer data between them. +@@ -900,6 +1164,7 @@ + values are. Note that you can't explicitly ask what state the + stream is in, but nor do you need to -- it can be inferred from the + values returned by @code{BZ2_bzCompress}. ++ + @display + IDLE/@code{any} + Illegal. IDLE state only exists after @code{BZ2_bzCompressEnd} or +@@ -952,6 +1217,7 @@ + + That still looks complicated? Well, fair enough. The usual sequence + of calls for compressing a load of data is: ++ + @itemize @bullet + @item Get started with @code{BZ2_bzCompressInit}. + @item Shovel data in and shlurp out its compressed form using zero or more +@@ -961,6 +1227,7 @@ + copying out the compressed output, until @code{BZ_STREAM_END} is returned. + @item Close up and go home. Call @code{BZ2_bzCompressEnd}. + @end itemize ++ + If the data you want to compress fits into your input buffer all + at once, you can skip the calls of @code{BZ2_bzCompress ( ..., BZ_RUN )} and + just do the @code{BZ2_bzCompress ( ..., BZ_FINISH )} calls. +@@ -972,28 +1239,36 @@ + your programming. 
+ + Trivial other possible return values: ++ + @display + @code{BZ_PARAM_ERROR} + if @code{strm} is @code{NULL}, or @code{strm->s} is @code{NULL} + @end display + ++@node BZ2_bzCompressEnd, BZ2_bzDecompressInit, BZ2_bzCompress, Low-Level Interface + @subsection @code{BZ2_bzCompressEnd} ++@findex BZ2_bzCompressEnd + @example + int BZ2_bzCompressEnd ( bz_stream *strm ); + @end example ++ + Releases all memory associated with a compression stream. + + Possible return values: ++ + @display + @code{BZ_PARAM_ERROR} if @code{strm} is @code{NULL} or @code{strm->s} is @code{NULL} + @code{BZ_OK} otherwise + @end display + + ++@node BZ2_bzDecompressInit, BZ2_bzDecompress, BZ2_bzCompressEnd, Low-Level Interface + @subsection @code{BZ2_bzDecompressInit} ++@findex BZ2_bzDecompressInit + @example + int BZ2_bzDecompressInit ( bz_stream *strm, int verbosity, int small ); + @end example ++ + Prepares for decompression. As with @code{BZ2_bzCompressInit}, a + @code{bz_stream} record should be allocated and initialised before the + call. Fields @code{bzalloc}, @code{bzfree} and @code{opaque} should be +@@ -1002,13 +1277,13 @@ + state will have been initialised, and @code{total_in} and + @code{total_out} will be zero. + +-For the meaning of parameter @code{verbosity}, see @code{BZ2_bzCompressInit}. ++For the meaning of parameter @code{verbosity}, see @ref{BZ2_bzCompressInit}. + + If @code{small} is nonzero, the library will use an alternative + decompression algorithm which uses less memory but at the cost of + decompressing more slowly (roughly speaking, half the speed, but the +-maximum memory requirement drops to around 2300k). See Chapter 2 for +-more information on memory management. ++maximum memory requirement drops to around 2300k). @xref{Memory Management}, ++for more information.
+ + Note that the amount of memory needed to decompress + a stream cannot be determined until the stream's header has been read, +@@ -1016,6 +1291,7 @@ + @code{BZ2_bzDecompress} could fail with @code{BZ_MEM_ERROR}. + + Possible return values: ++ + @display + @code{BZ_CONFIG_ERROR} + if the library has been mis-compiled +@@ -1027,6 +1303,7 @@ + @end display + + Allowable next actions: ++ + @display + @code{BZ2_bzDecompress} + if @code{BZ_OK} was returned +@@ -1035,10 +1312,13 @@ + + + ++@node BZ2_bzDecompress, BZ2_bzDecompressEnd, BZ2_bzDecompressInit, Low-Level Interface + @subsection @code{BZ2_bzDecompress} ++@findex BZ2_bzDecompress + @example + int BZ2_bzDecompress ( bz_stream *strm ); + @end example ++ + Provides more input and/out output buffer space for the library. The + caller maintains input and output buffers, and uses @code{BZ2_bzDecompress} + to transfer data between them. +@@ -1079,6 +1359,7 @@ + to clean up and release memory. + + Possible return values: ++ + @display + @code{BZ_PARAM_ERROR} + if @code{strm} is @code{NULL} or @code{strm->s} is @code{NULL} +@@ -1095,7 +1376,9 @@ + @code{BZ_OK} + otherwise + @end display ++ + Allowable next actions: ++ + @display + @code{BZ2_bzDecompress} + if @code{BZ_OK} was returned +@@ -1104,13 +1387,17 @@ + @end display + + ++@node BZ2_bzDecompressEnd, High-Level Interface, BZ2_bzDecompress, Low-Level Interface + @subsection @code{BZ2_bzDecompressEnd} ++@findex BZ2_bzDecompressEnd + @example + int BZ2_bzDecompressEnd ( bz_stream *strm ); + @end example ++ + Releases all memory associated with a decompression stream. + + Possible return values: ++ + @display + @code{BZ_PARAM_ERROR} + if @code{strm} is @code{NULL} or @code{strm->s} is @code{NULL} +@@ -1119,11 +1406,13 @@ + @end display + + Allowable next actions: ++ + @display + None. 
+ @end display + + ++@node High-Level Interface, BZ2_bzReadOpen, BZ2_bzDecompressEnd, Programming With Libbzip2 + @section High-level interface + + This interface provides functions for reading and writing +@@ -1166,9 +1455,23 @@ + functions (could easily be added, though). + @end itemize + ++@menu ++* BZ2_bzReadOpen:: ++* BZ2_bzRead:: ++* BZ2_bzReadGetUnused:: ++* BZ2_bzReadClose:: ++* BZ2_bzWriteOpen:: ++* BZ2_bzWrite:: ++* BZ2_bzWriteClose:: ++* Handling Embedded Compressed Data Streams:: ++* Standard File-Reading/Writing Code:: ++@end menu + + ++@node BZ2_bzReadOpen, BZ2_bzRead, High-Level Interface, High-Level Interface + @subsection @code{BZ2_bzReadOpen} ++@tindex BZFILE ++@findex BZ2_bzReadOpen + @example + typedef void BZFILE; + +@@ -1176,6 +1479,7 @@ + int small, int verbosity, + void *unused, int nUnused ); + @end example ++ + Prepare to read compressed data from file handle @code{f}. @code{f} + should refer to a file which has been opened for reading, and for which + the error indicator (@code{ferror(f)})is not set. If @code{small} is 1, +@@ -1190,7 +1494,7 @@ + respectively. + + For the meaning of parameters @code{small} and @code{verbosity}, +-see @code{BZ2_bzDecompressInit}. ++see @ref{BZ2_bzDecompressInit}. + + The amount of memory needed to decompress a file cannot be determined + until the file's header has been read. So it is possible that +@@ -1198,6 +1502,7 @@ + @code{BZ2_bzRead} will return @code{BZ_MEM_ERROR}.
+ + Possible assignments to @code{bzerror}: ++ + @display + @code{BZ_CONFIG_ERROR} + if the library has been mis-compiled +@@ -1215,6 +1520,7 @@ + @end display + + Possible return values: ++ + @display + Pointer to an abstract @code{BZFILE} + if @code{bzerror} is @code{BZ_OK} +@@ -1223,6 +1529,7 @@ + @end display + + Allowable next actions: ++ + @display + @code{BZ2_bzRead} + if @code{bzerror} is @code{BZ_OK} +@@ -1231,10 +1538,13 @@ + @end display + + ++@node BZ2_bzRead, BZ2_bzReadGetUnused, BZ2_bzReadOpen, High-Level Interface + @subsection @code{BZ2_bzRead} ++@findex BZ2_bzRead + @example + int BZ2_bzRead ( int *bzerror, BZFILE *b, void *buf, int len ); + @end example ++ + Reads up to @code{len} (uncompressed) bytes from the compressed file + @code{b} into + the buffer @code{buf}. If the read was successful, +@@ -1262,6 +1572,7 @@ + appeared, call @code{BZ2_bzReadGetUnused} immediately before @code{BZ2_bzReadClose}. + + Possible assignments to @code{bzerror}: ++ + @display + @code{BZ_PARAM_ERROR} + if @code{b} is @code{NULL} or @code{buf} is @code{NULL} or @code{len < 0} +@@ -1285,6 +1596,7 @@ + @end display + + Possible return values: ++ + @display + number of bytes read + if @code{bzerror} is @code{BZ_OK} or @code{BZ_STREAM_END} +@@ -1293,6 +1605,7 @@ + @end display + + Allowable next actions: ++ + @display + collect data from @code{buf}, then @code{BZ2_bzRead} or @code{BZ2_bzReadClose} + if @code{bzerror} is @code{BZ_OK} +@@ -1304,11 +1617,14 @@ + + + ++@node BZ2_bzReadGetUnused, BZ2_bzReadClose, BZ2_bzRead, High-Level Interface + @subsection @code{BZ2_bzReadGetUnused} ++@findex BZ2_bzReadGetUnused + @example + void BZ2_bzReadGetUnused ( int* bzerror, BZFILE *b, + void** unused, int* nUnused ); + @end example ++ + Returns data which was read from the compressed file but was not needed + to get to the logical end-of-stream. @code{*unused} is set to the address + of the data, and @code{*nUnused} to the number of bytes. 
@code{*nUnused} will +@@ -1318,6 +1634,7 @@ + @code{BZ_STREAM_END} but before @code{BZ2_bzReadClose}. + + Possible assignments to @code{bzerror}: ++ + @display + @code{BZ_PARAM_ERROR} + if @code{b} is @code{NULL} +@@ -1330,15 +1647,19 @@ + @end display + + Allowable next actions: ++ + @display + @code{BZ2_bzReadClose} + @end display + + ++@node BZ2_bzReadClose, BZ2_bzWriteOpen, BZ2_bzReadGetUnused, High-Level Interface + @subsection @code{BZ2_bzReadClose} ++@findex BZ2_bzReadClose + @example + void BZ2_bzReadClose ( int *bzerror, BZFILE *b ); + @end example ++ + Releases all memory pertaining to the compressed file @code{b}. + @code{BZ2_bzReadClose} does not call @code{fclose} on the underlying file + handle, so you should do that yourself if appropriate. +@@ -1346,6 +1667,7 @@ + situations. + + Possible assignments to @code{bzerror}: ++ + @display + @code{BZ_SEQUENCE_ERROR} + if @code{b} was opened with @code{BZ2_bzOpenWrite} +@@ -1354,32 +1676,37 @@ + @end display + + Allowable next actions: ++ + @display + none + @end display + + + ++@node BZ2_bzWriteOpen, BZ2_bzWrite, BZ2_bzReadClose, High-Level Interface + @subsection @code{BZ2_bzWriteOpen} ++@findex BZ2_bzWriteOpen + @example + BZFILE *BZ2_bzWriteOpen ( int *bzerror, FILE *f, + int blockSize100k, int verbosity, + int workFactor ); + @end example ++ + Prepare to write compressed data to file handle @code{f}. + @code{f} should refer to + a file which has been opened for writing, and for which the error +-indicator (@code{ferror(f)})is not set. ++indicator (@code{ferror(f)}) is not set. + + For the meaning of parameters @code{blockSize100k}, +-@code{verbosity} and @code{workFactor}, see +-@* @code{BZ2_bzCompressInit}. ++@code{verbosity} and @code{workFactor}, ++see @ref{BZ2_bzCompressInit}. + + All required memory is allocated at this stage, so if the call + completes successfully, @code{BZ_MEM_ERROR} cannot be signalled by a + subsequent call to @code{BZ2_bzWrite}.
+ + Possible assignments to @code{bzerror}: ++ + @display + @code{BZ_CONFIG_ERROR} + if the library has been mis-compiled +@@ -1395,6 +1722,7 @@ + @end display + + Possible return values: ++ + @display + Pointer to an abstract @code{BZFILE} + if @code{bzerror} is @code{BZ_OK} +@@ -1403,6 +1731,7 @@ + @end display + + Allowable next actions: ++ + @display + @code{BZ2_bzWrite} + if @code{bzerror} is @code{BZ_OK} +@@ -1413,14 +1742,18 @@ + + + ++@node BZ2_bzWrite, BZ2_bzWriteClose, BZ2_bzWriteOpen, High-Level Interface + @subsection @code{BZ2_bzWrite} ++@findex BZ2_bzWrite + @example + void BZ2_bzWrite ( int *bzerror, BZFILE *b, void *buf, int len ); + @end example ++ + Absorbs @code{len} bytes from the buffer @code{buf}, eventually to be + compressed and written to the file. + + Possible assignments to @code{bzerror}: ++ + @display + @code{BZ_PARAM_ERROR} + if @code{b} is @code{NULL} or @code{buf} is @code{NULL} or @code{len < 0} +@@ -1435,7 +1768,10 @@ + + + ++@node BZ2_bzWriteClose, Handling Embedded Compressed Data Streams, BZ2_bzWrite, High-Level Interface + @subsection @code{BZ2_bzWriteClose} ++@findex BZ2_bzWriteClose ++@findex BZ2_bzWriteClose64 + @example + void BZ2_bzWriteClose ( int *bzerror, BZFILE* f, + int abandon, +@@ -1475,6 +1811,7 @@ + + + Possible assignments to @code{bzerror}: ++ + @display + @code{BZ_SEQUENCE_ERROR} + if @code{b} was opened with @code{BZ2_bzReadOpen} +@@ -1484,11 +1821,13 @@ + otherwise + @end display + ++@node Handling Embedded Compressed Data Streams, Standard File-Reading/Writing Code, BZ2_bzWriteClose, High-Level Interface + @subsection Handling embedded compressed data streams + + The high-level library facilitates use of + @code{bzip2} data streams which form some part of a surrounding, larger + data stream. ++ + @itemize @bullet + @item For writing, the library takes an open file handle, writes + compressed data to it, @code{fflush}es it but does not @code{fclose} it. 
+@@ -1522,8 +1861,10 @@ + If you require extra flexibility, you'll have to bite the bullet and get + to grips with the low-level interface. + ++@node Standard File-Reading/Writing Code, Utility Functions, Handling Embedded Compressed Data Streams, High-Level Interface + @subsection Standard file-reading/writing code + Here's how you'd write data to a compressed file: ++ + @example @code + FILE* f; + BZFILE* b; +@@ -1556,7 +1897,9 @@ + /* handle error */ + @} + @end example ++ + And to read from a compressed file: ++ + @example + FILE* f; + BZFILE* b; +@@ -1592,8 +1935,17 @@ + + + ++@node Utility Functions, BZ2_bzBuffToBuffCompress, Standard File-Reading/Writing Code, Programming With Libbzip2 + @section Utility functions ++ ++@menu ++* BZ2_bzBuffToBuffCompress:: ++* BZ2_bzBuffToBuffDecompress:: ++@end menu ++ ++@node BZ2_bzBuffToBuffCompress, BZ2_bzBuffToBuffDecompress, Utility Functions, Utility Functions + @subsection @code{BZ2_bzBuffToBuffCompress} ++@findex BZ2_bzBuffToBuffCompress + @example + int BZ2_bzBuffToBuffCompress( char* dest, + unsigned int* destLen, +@@ -1603,6 +1955,7 @@ + int verbosity, + int workFactor ); + @end example ++ + Attempts to compress the data in @code{source[0 .. sourceLen-1]} + into the destination buffer, @code{dest[0 .. *destLen-1]}. + If the destination buffer is big enough, @code{*destLen} is +@@ -1617,7 +1970,7 @@ + mechanism, use the low-level interface. + + For the meaning of parameters @code{blockSize100k}, @code{verbosity} +-and @code{workFactor}, @* see @code{BZ2_bzCompressInit}. ++and @code{workFactor}, see @ref{BZ2_bzCompressInit}. + + To guarantee that the compressed data will fit in its buffer, allocate + an output buffer of size 1% larger than the uncompressed data, plus +@@ -1627,6 +1980,7 @@ + beyond @code{dest[*destLen]}, even in case of buffer overflow.
+ + Possible return values: ++ + @display + @code{BZ_CONFIG_ERROR} + if the library has been mis-compiled +@@ -1645,7 +1999,9 @@ + + + ++@node BZ2_bzBuffToBuffDecompress, Zlib Compatibility Functions, BZ2_bzBuffToBuffCompress, Utility Functions + @subsection @code{BZ2_bzBuffToBuffDecompress} ++@findex BZ2_bzBuffToBuffDecompress + @example + int BZ2_bzBuffToBuffDecompress ( char* dest, + unsigned int* destLen, +@@ -1654,6 +2010,7 @@ + int small, + int verbosity ); + @end example ++ + Attempts to decompress the data in @code{source[0 .. sourceLen-1]} + into the destination buffer, @code{dest[0 .. *destLen-1]}. + If the destination buffer is big enough, @code{*destLen} is +@@ -1662,11 +2019,11 @@ + is unchanged, and @code{BZ_OUTBUFF_FULL} is returned. + + @code{source} is assumed to hold a complete @code{bzip2} format +-data stream. @* @code{BZ2_bzBuffToBuffDecompress} tries to decompress ++data stream. @code{BZ2_bzBuffToBuffDecompress} tries to decompress + the entirety of the stream into the output buffer. + + For the meaning of parameters @code{small} and @code{verbosity}, +-see @code{BZ2_bzDecompressInit}. ++see @ref{BZ2_bzDecompressInit}. + + Because the compression ratio of the compressed data cannot be known in + advance, there is no easy way to guarantee that the output buffer will +@@ -1678,6 +2035,7 @@ + beyond @code{dest[*destLen]}, even in case of buffer overflow. + + Possible return values: ++ + @display + @code{BZ_CONFIG_ERROR} + if the library has been mis-compiled +@@ -1701,6 +2059,7 @@ + + + ++@node Zlib Compatibility Functions, Using The Library In A Stdio-Free Environment, BZ2_bzBuffToBuffDecompress, Programming With Libbzip2 + @section @code{zlib} compatibility functions + Yoshioka Tsuneo has contributed some functions to + give better @code{zlib} compatibility. These functions are +@@ -1710,81 +2069,112 @@ + These functions are not (yet) officially part of + the library. If they break, you get to keep all the pieces.
+ Nevertheless, I think they work ok. ++ ++@findex BZ2_bzlibVersion + @example + typedef void BZFILE; + + const char * BZ2_bzlibVersion ( void ); + @end example ++ + Returns a string indicating the library version. ++ ++@findex BZ2_bzopen ++@findex BZ2_bzdopen + @example + BZFILE * BZ2_bzopen ( const char *path, const char *mode ); + BZFILE * BZ2_bzdopen ( int fd, const char *mode ); + @end example ++ + Opens a @code{.bz2} file for reading or writing, using either its name + or a pre-existing file descriptor. + Analogous to @code{fopen} and @code{fdopen}. ++ ++@findex BZ2_bzread ++@findex BZ2_bzwrite + @example + int BZ2_bzread ( BZFILE* b, void* buf, int len ); + int BZ2_bzwrite ( BZFILE* b, void* buf, int len ); + @end example ++ + Reads/writes data from/to a previously opened @code{BZFILE}. + Analogous to @code{fread} and @code{fwrite}. ++ ++@findex BZ2_bzflush ++@findex BZ2_bzclose + @example + int BZ2_bzflush ( BZFILE* b ); + void BZ2_bzclose ( BZFILE* b ); + @end example ++ + Flushes/closes a @code{BZFILE}. @code{BZ2_bzflush} doesn't actually do + anything. Analogous to @code{fflush} and @code{fclose}. + ++@findex BZ2_bzerror + @example + const char * BZ2_bzerror ( BZFILE *b, int *errnum ) + @end example ++ + Returns a string describing the more recent error status of + @code{b}, and also sets @code{*errnum} to its numerical value. + + ++@node Using The Library In A Stdio-Free Environment, Getting Rid Of Stdio, Zlib Compatibility Functions, Programming With Libbzip2 + @section Using the library in a @code{stdio}-free environment + ++@menu ++* Getting Rid Of Stdio:: ++* Critical Error Handling:: ++@end menu ++ ++@node Getting Rid Of Stdio, Critical Error Handling, Using The Library In A Stdio-Free Environment, Using The Library In A Stdio-Free Environment + @subsection Getting rid of @code{stdio} + ++@vindex BZ_NO_STDIO + In a deeply embedded application, you might want to use just + the memory-to-memory functions. 
You can do this conveniently + by compiling the library with preprocessor symbol @code{BZ_NO_STDIO} + defined. Doing this gives you a library containing only the following + eight functions: + +-@code{BZ2_bzCompressInit}, @code{BZ2_bzCompress}, @code{BZ2_bzCompressEnd} @* +-@code{BZ2_bzDecompressInit}, @code{BZ2_bzDecompress}, @code{BZ2_bzDecompressEnd} @* +-@code{BZ2_bzBuffToBuffCompress}, @code{BZ2_bzBuffToBuffDecompress} ++@code{BZ2_bzCompressInit}, @code{BZ2_bzCompress}, @code{BZ2_bzCompressEnd}, ++@code{BZ2_bzDecompressInit}, @code{BZ2_bzDecompress}, @code{BZ2_bzDecompressEnd}, ++@code{BZ2_bzBuffToBuffCompress}, @code{BZ2_bzBuffToBuffDecompress}. + + When compiled like this, all functions will ignore @code{verbosity} + settings. + ++@node Critical Error Handling, Making A Windows DLL, Getting Rid Of Stdio, Using The Library In A Stdio-Free Environment + @subsection Critical error handling + @code{libbzip2} contains a number of internal assertion checks which + should, needless to say, never be activated. Nevertheless, if an + assertion should fail, behaviour depends on whether or not the library + was compiled with @code{BZ_NO_STDIO} set. + +-For a normal compile, an assertion failure yields the message ++For a normal compile, an assertion failure yields the message: ++ + @example + bzip2/libbzip2: internal error number N. + This is a bug in bzip2/libbzip2, 1.0 of 21-Mar-2000. +- Please report it to me at: jseward@@acm.org. If this happened ++ Please report it to me at: @email{jseward@@acm.org}. If this happened + when you were using some program which uses libbzip2 as a + component, you should also report this bug to the author(s) + of that program. Please make an effort to report this bug; + timely and accurate bug reports eventually lead to higher +- quality software. Thanks. Julian Seward, 21 March 2000. ++ quality software. Thanks. Julian R. Seward, 21 March 2000. + @end example ++ + where @code{N} is some error code number. 
@code{exit(3)} + is then called. + + For a @code{stdio}-free library, assertion failures result + in a call to a function declared as: ++ ++@findex bz_internal_error + @example + extern void bz_internal_error ( int errcode ); + @end example ++ + The relevant code is passed as a parameter. You should supply + such a function. + +@@ -1799,11 +2189,12 @@ + and can be recovered from. + + ++@node Making A Windows DLL, Miscellaneous, Critical Error Handling, Programming With Libbzip2 + @section Making a Windows DLL + Everything related to Windows has been contributed by Yoshioka Tsuneo +-@* (@code{QWF00133@@niftyserve.or.jp} / +-@code{tsuneo-y@@is.aist-nara.ac.jp}), so you should send your queries to +-him (but perhaps Cc: me, @code{jseward@@acm.org}). ++(@email{QWF00133@@niftyserve.or.jp} / ++@email{tsuneo-y@@is.aist-nara.ac.jp}), so you should send your queries to ++him (but perhaps Cc: me, @email{jseward@@acm.org}). + + My vague understanding of what to do is: using Visual C++ 5.0, + open the project file @code{libbz2.dsp}, and build. That's all. +@@ -1811,10 +2202,11 @@ + If you can't + open the project file for some reason, make a new one, naming these files: + @code{blocksort.c}, @code{bzlib.c}, @code{compress.c}, +-@code{crctable.c}, @code{decompress.c}, @code{huffman.c}, @* ++@code{crctable.c}, @code{decompress.c}, @code{huffman.c}, + @code{randtable.c} and @code{libbz2.def}. You will also need + to name the header files @code{bzlib.h} and @code{bzlib_private.h}. + ++@vindex _WIN32 + If you don't use VC++, you may need to define the proprocessor symbol + @code{_WIN32}. + +@@ -1833,11 +2225,22 @@ + + + +-@chapter Miscellanea ++@node Miscellaneous, Limitations Of The Compressed File Format, Making A Windows DLL, Top ++@chapter Miscellaneous + + These are just some random thoughts of mine. Your mileage may + vary. 
+ ++@menu ++* Limitations Of The Compressed File Format:: ++* Portability Issues:: ++* Reporting Bugs:: ++* Did You Get The Right Package:: ++* Testing:: ++* Further Reading:: ++@end menu ++ ++@node Limitations Of The Compressed File Format, Portability Issues, Miscellaneous, Miscellaneous + @section Limitations of the compressed file format + @code{bzip2-1.0}, @code{0.9.5} and @code{0.9.0} + use exactly the same file format as the previous +@@ -1849,6 +2252,7 @@ + work since the release of @code{bzip2-0.1} in August 1997 + has shown complexities in the file format which slow down + decompression and, in retrospect, are unnecessary. These are: ++ + @itemize @bullet + @item The run-length encoder, which is the first of the + compression transformations, is entirely irrelevant. +@@ -1892,12 +2296,14 @@ + @item An Adler-32 checksum, rather than a CRC32 checksum, + would be faster to compute. + @end itemize ++ + It would be fair to say that the @code{bzip2} format was frozen + before I properly and fully understood the performance + consequences of doing so. + + Improvements which I was able to incorporate into + 0.9.0, despite using the same file format, are: ++ + @itemize @bullet + @item Single array implementation of the inverse BWT. This + significantly speeds up decompression, presumably +@@ -1910,12 +2316,14 @@ + Duh! Well, you live and learn. + + @end itemize ++ + Further ahead, it would be nice + to be able to do random access into files. This will + require some careful design of compressed file formats. + + + ++@node Portability Issues, Reporting Bugs, Limitations Of The Compressed File Format, Miscellaneous + @section Portability issues + After some consideration, I have decided not to use + GNU @code{autoconf} to configure 0.9.5 or 1.0. +@@ -1932,6 +2340,7 @@ + under Unix straight out-of-the-box, so to speak, especially + if you have a version of GNU C available. + ++@vindex __inline__ + There are a couple of @code{__inline__} directives in the code. 
GNU C + (@code{gcc}) should be able to handle them. If you're not using + GNU C, your C compiler shouldn't see them at all. +@@ -1940,6 +2349,7 @@ + easy way to do this is to compile with the flag @code{-D__inline__=}, + which should be understood by most Unix compilers. + ++@vindex BZ_STRICT_ANSI + If you still have difficulties, try compiling with the macro + @code{BZ_STRICT_ANSI} defined. This should enable you to build the + library in a strictly ANSI compliant environment. Building the program +@@ -1953,12 +2363,15 @@ + avoids all sorts of library-version issues that others may encounter + later on. + ++@vindex BZ_UNIX ++@vindex BZ_LCCWIN32 + If you build @code{bzip2} on Win32, you must set @code{BZ_UNIX} to 0 and + @code{BZ_LCCWIN32} to 1, in the file @code{bzip2.c}, before compiling. + Otherwise the resulting binary won't work correctly. + + + ++@node Reporting Bugs, Did You Get The Right Package, Portability Issues, Miscellaneous + @section Reporting bugs + I tried pretty hard to make sure @code{bzip2} is + bug free, both by design and by testing. Hopefully +@@ -1969,6 +2382,7 @@ + will ask you to email me a bug report. Experience with + version 0.1 shows that almost all these problems can + be traced to either compiler bugs or hardware problems. ++ + @itemize @bullet + @item + Recompile the program with no optimisation, and see if it +@@ -1977,6 +2391,10 @@ + of GNU C (and other compilers) generating bad code for + @code{bzip2}, and I've run across two such examples myself. + ++@kindex -O2 ++@kindex -fomit-frame-pointer ++@kindex -fno-strength-reduce ++@kindex -funroll-loops + 2.7.X versions of GNU C are known to generate bad code from + time to time, at high optimisation levels. + If you get problems, try using the flags +@@ -2013,10 +2431,12 @@ + + Finally, if the above comments don't help, you'll have to send + me a bug report. 
Now, it's just amazing how many people will +-send me a bug report saying something like ++send me a bug report saying something like: ++ + @display + bzip2 crashed with segmentation fault on my machine + @end display ++ + and absolutely nothing else. Needless to say, a such a report + is @emph{totally, utterly, completely and comprehensively 100% useless; + a waste of your time, my time, and net bandwidth}. +@@ -2026,12 +2446,14 @@ + The rules of the game are: facts, facts, facts. Don't omit + them because "oh, they won't be relevant". At the bare + minimum: ++ + @display + Machine type. Operating system version. + Exact version of @code{bzip2} (do @code{bzip2 -V}). + Exact version of the compiler used. + Flags passed to the compiler. + @end display ++ + However, the most important single thing that will help me is + the file that you were trying to compress or decompress at the + time the problem happened. Without that, my ability to do anything +@@ -2041,6 +2463,7 @@ + you should contact me before mailing me huge files. + + ++@node Did You Get The Right Package, Testing, Reporting Bugs, Miscellaneous + @section Did you get the right package? + + @code{bzip2} is a resource hog. It soaks up large amounts of CPU cycles +@@ -2053,34 +2476,40 @@ + an intrinsic property of the Burrows-Wheeler transform (unfortunately). + Maybe this isn't what you want. + ++@cindex zlib ++@cindex gzip + If you want a compressor and/or library which is faster, uses less + memory but gets pretty good compression, and has minimal latency, + consider Jean-loup + Gailly's and Mark Adler's work, @code{zlib-1.1.2} and + @code{gzip-1.2.4}. Look for them at ++@uref{http://www.cdrom.com/pub/infozip/zlib} and ++@uref{http://www.gzip.org} respectively. + +-@code{http://www.cdrom.com/pub/infozip/zlib} and +-@code{http://www.gzip.org} respectively. 
+- ++@cindex lzo + For something faster and lighter still, you might try Markus F X J + Oberhumer's @code{LZO} real-time compression/decompression library, at +-@* @code{http://wildsau.idv.uni-linz.ac.at/mfx/lzo.html}. ++@uref{http://wildsau.idv.uni-linz.ac.at/mfx/lzo.html}. + ++@cindex e2compr ++@cindex ext2 + If you want to use the @code{bzip2} algorithms to compress small blocks + of data, 64k bytes or smaller, for example on an on-the-fly disk + compressor, you'd be well advised not to use this library. Instead, + I've made a special library tuned for that kind of use. It's part of + @code{e2compr-0.40}, an on-the-fly disk compressor for the Linux + @code{ext2} filesystem. Look at +-@code{http://www.netspace.net.au/~reiter/e2compr}. ++@uref{http://www.netspace.net.au/~reiter/e2compr}. + + + ++@node Testing, Further Reading, Did You Get The Right Package, Miscellaneous + @section Testing + + A record of the tests I've done. + + First, some data sets: ++ + @itemize @bullet + @item B: a directory containing 6001 files, one for every length in the + range 0 to 6000 bytes. The files contain random lowercase +@@ -2093,6 +2522,7 @@ + @code{egcs}, @code{gcc-2.8.1}, KDE, GTK, Octave, etc. + 2200 megabytes. + @end itemize ++ + The tests conducted are as follows. Each test means compressing + (a copy of) each file in the data set, decompressing it and + comparing it against the original. +@@ -2103,6 +2533,7 @@ + blocking and buffering mechanisms. + This required modifying the source code so as to try to + break it. ++ + @enumerate + @item Data set H, with + buffer size of 1 byte, and block size of 23 bytes. +@@ -2116,7 +2547,9 @@ + @item H with buffer size of 1 byte, but normal block + size (up to 900000 bytes). + @end enumerate ++ + Then some tests with unmodified source code. ++ + @enumerate + @item H, all settings normal. + @item As (1), with small-mode decompress. 
+@@ -2141,25 +2574,28 @@ + @item Misc tests to make sure it builds and runs ok on non-Linux/x86 + platforms. + @end enumerate ++ + These tests were conducted on a 225 MHz IDT WinChip machine, running + Linux 2.0.36. They represent nearly a week of continuous computation. + All tests completed successfully. + + ++@node Further Reading, Concept Index, Testing, Miscellaneous + @section Further reading + @code{bzip2} is not research work, in the sense that it doesn't present + any new ideas. Rather, it's an engineering exercise based on existing + ideas. + + Four documents describe essentially all the ideas behind @code{bzip2}: ++ + @example + Michael Burrows and D. J. Wheeler: + "A block-sorting lossless data compression algorithm" + 10th May 1994. + Digital SRC Research Report 124. +- ftp://ftp.digital.com/pub/DEC/SRC/research-reports/SRC-124.ps.gz ++ @uref{ftp://ftp.digital.com/pub/DEC/SRC/research-reports/SRC-124.ps.gz} + If you have trouble finding it, try searching at the +- New Zealand Digital Library, http://www.nzdl.org. ++ New Zealand Digital Library, @uref{http://www.nzdl.org}. + + Daniel S. Hirschberg and Debra A. LeLewer + "Efficient Decoding of Prefix Codes" +@@ -2171,43 +2607,70 @@ + Program bred3.c and accompanying document bred3.ps. + This contains the idea behind the multi-table Huffman + coding scheme. +- ftp://ftp.cl.cam.ac.uk/users/djw3/ ++ @uref{ftp://ftp.cl.cam.ac.uk/users/djw3} + + Jon L. Bentley and Robert Sedgewick + "Fast Algorithms for Sorting and Searching Strings" + Available from Sedgewick's web page, +- www.cs.princeton.edu/~rs ++ @uref{http://www.cs.princeton.edu/~rs} + @end example ++ + The following paper gives valuable additional insights into the + algorithm, but is not immediately the basis of any code + used in bzip2. ++ + @example + Peter Fenwick: + Block Sorting Text Compression + Proceedings of the 19th Australasian Computer Science Conference, + Melbourne, Australia. Jan 31 - Feb 2, 1996. 
+- ftp://ftp.cs.auckland.ac.nz/pub/peter-f/ACSC96paper.ps ++ @uref{ftp://ftp.cs.auckland.ac.nz/pub/peter-f/ACSC96paper.ps} + @end example ++ + Kunihiko Sadakane's sorting algorithm, mentioned above, + is available from: ++ + @example +-http://naomi.is.s.u-tokyo.ac.jp/~sada/papers/Sada98b.ps.gz ++@uref{http://naomi.is.s.u-tokyo.ac.jp/~sada/papers/Sada98b.ps.gz} + @end example ++ + The Manber-Myers suffix array construction + algorithm is described in a paper + available from: ++ + @example +-http://www.cs.arizona.edu/people/gene/PAPERS/suffix.ps ++@uref{http://www.cs.arizona.edu/people/gene/PAPERS/suffix.ps} + @end example ++ + Finally, the following paper documents some recent investigations + I made into the performance of sorting algorithms: ++ + @example +-Julian Seward: ++Julian R. Seward: + On the Performance of BWT Sorting Algorithms + Proceedings of the IEEE Data Compression Conference 2000 + Snowbird, Utah. 28-30 March 2000. + @end example + ++@node Concept Index, Option Index, Further Reading, Top ++@unnumbered Concept Index ++ ++@printindex cp ++ ++@node Option Index, Variable Index, Concept Index, Top ++@unnumbered Option Index ++ ++@printindex ky ++ ++@node Variable Index, Function Index, Option Index, Top ++@unnumbered Variable Index ++ ++@printindex vr ++ ++@node Function Index, Top, Variable Index, Top ++@unnumbered Function Index ++ ++@printindex fn + + @contents + diff -BurN -x CVS bzip2.old/pkg-plist bzip2/pkg-plist --- bzip2.old/pkg-plist Thu Jun 15 19:12:48 2000 +++ bzip2/pkg-plist Wed Oct 18 01:19:10 2000 @@ -1,17 +1,31 @@ bin/bzip2 -bin/bunzip2 -bin/bzcat -bin/bz2cat +@exec ln -sf bzip2 %B/bunzip2 +@unexec rm -f %B/bunzip2 +@exec ln -sf bzip2 %B/bzcat +@unexec rm -f %B/bzcat +@exec ln -sf bzip2 %B/bz2cat +@unexec rm -f %B/bz2cat bin/bzip2recover include/bzlib.h +@unexec install-info --quiet --delete %D/info/bzip2.info %D/info/dir +info/bzip2.info +info/bzip2.info-1 +info/bzip2.info-2 +info/bzip2.info-3 +@exec install-info %D/info/bzip2.info 
%D/info/dir lib/libbz2.a -lib/libbz2.so lib/libbz2.so.1 +@exec ln -sf libbz2.so.1 %B/libbz2.so +@unexec rm -f %B/libbz2.so share/doc/bzip2/manual.ps.bz2 -share/doc/bzip2/manual.texi share/doc/bzip2/manual_1.html share/doc/bzip2/manual_2.html share/doc/bzip2/manual_3.html share/doc/bzip2/manual_4.html +share/doc/bzip2/manual_5.html +share/doc/bzip2/manual_6.html +share/doc/bzip2/manual_7.html +share/doc/bzip2/manual_8.html +share/doc/bzip2/manual_9.html share/doc/bzip2/manual_toc.html @dirrm share/doc/bzip2 >Release-Note: >Audit-Trail: >Unformatted: