Date: Mon, 30 Oct 2000 17:01:30 -0800
From: Tim Kientzle <kientzle@acm.org>
To: "Daniel C. Sobral" <dcs@newsguy.com>, Alexander Langer <alex@big.endian.de>, libh@FreeBSD.ORG, "Jordan K. Hubbard" <jkh@FreeBSD.ORG>
Subject: Re: BOF at BSDCon: FreeBSD Installer, Packages System
Message-ID: <39FE19EA.5346F798@acm.org>
References: <39DCC860.B04F7D50@acm.org> <20001006155542.A29218@cichlids.cichlids.com> <39F3CDD7.15B889E7@acm.org> <20001023190412.B507@cichlids.cichlids.com> <39F47E98.4BB647AA@acm.org> <20001023202244.B10374@cichlids.cichlids.com> <39F48F4A.38D458C2@acm.org> <39FCF244.5A8C8E59@newsguy.com> <39FDC12E.304B0011@acm.org> <39FDE2A0.C2CEF041@acm.org>
Hmmm.. As I suspected, if you first gunzip -r /usr/share/man
then a tar.gz archive is only 9MB.

That suggests a couple of ways to save space in the
distribution archives. One, obviously, is to store the data
un-gzipped and, after unpacking, go back through and gzip
appropriate files. (This is tricky with the man tree because of
multiply-linked files. Best is to auto-build a shell script that
gzips files and creates links; then you can build a very compact
archive with just one copy of each man file.)

Another approach is to build a custom archive format that
permits you to store the actual file data un-gzipped but mark
the entry so that the de-archiver will re-gzip the data as it's
written. Sounds roundabout, I know, but if you think carefully
about how gzip works internally, you'll understand why this
generally gives better compression. It's similar to HTTP
"transfer-encoding", if you want to think of it that way.

A custom archive format is no big deal; I have a favorite one
I've used for a couple of years now that's extremely easy to
implement, extensible, etc. It discards tar's tape-centric
heritage and in the process discards most of tar's limitations.

    - Tim

Tim Kientzle wrote:
>
> Tim Kientzle wrote:
> > Though I haven't tested it, I wouldn't be surprised if
> > the ports tree was more than twice as large in ZIP format as
> > in tar.gz format.
>
> I just did a few quick tests against my FreeBSD 3.3
> system to see how much you lose by switching from
> tar.gz to ZIP. I simply archived a couple of directories
> and compared the sizes:
>
> Directory            tar.gz         ZIP
> /usr/ports        7,601,675  15,008,530
> /usr/src         50,896,742  62,536,891
> /usr/bin          3,892,391   6,192,116
> /usr/share/man   28,449,979  22,518,970  (!)
>
> I think it's pretty clear that building a single
> archive and then compressing the whole thing is
> necessary if you really want to build full-featured
> CD-ROM distributions.
>
>     - Tim Kientzle
>
> P.S. /usr/share/man is an interesting example
> which works out larger in tar.gz format because the
> individual files are already gzipped. I suspect that
> you could get an archive smaller than 22MB by un-gzipping
> all the individual files and then building a tar.gz archive.
>
> To Unsubscribe: send mail to majordomo@FreeBSD.org
> with "unsubscribe freebsd-libh" in the body of the message
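
The man-tree effect described in the message above (a tar.gz of already-gzipped files comes out larger than a tar.gz of the same files stored uncompressed) can be reproduced in miniature. The following is only a rough sketch on synthetic data: the page contents, the file names, and the helpers make_pages and tar_gz_size are made up for illustration and are not taken from the thread, and the exact sizes will not match the real /usr/share/man figures.

```python
import gzip
import io
import tarfile


def make_pages(count=200):
    """Build hypothetical man-page-like texts that share a lot of boilerplate."""
    pages = {}
    for i in range(count):
        text = (
            f".TH CMD{i} 1\n.SH NAME\ncmd{i} \\- example utility number {i}\n"
            ".SH SYNOPSIS\n.B cmd [options] file ...\n"
            ".SH DESCRIPTION\nThis utility reads each file in sequence and "
            "writes its results to the standard output.\n"
            ".SH SEE ALSO\ngzip(1), tar(1)\n"
        )
        pages[f"man1/cmd{i}.1"] = text.encode("ascii")
    return pages


def tar_gz_size(members):
    """Return the size in bytes of an in-memory tar.gz holding the members."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as archive:
        for name, data in sorted(members.items()):
            info = tarfile.TarInfo(name)
            info.size = len(data)
            archive.addfile(info, io.BytesIO(data))
    return len(buf.getvalue())


pages = make_pages()

# Case 1: every file is gzipped on its own before archiving, as in the
# installed man tree. The outer gzip pass sees mostly incompressible data.
pregzipped = {name + ".gz": gzip.compress(data) for name, data in pages.items()}
size_pregzipped = tar_gz_size(pregzipped)

# Case 2: files are stored uncompressed, so the single outer gzip stream can
# exploit the redundancy shared between neighbouring files.
size_plain = tar_gz_size(pages)

print("tar.gz of pre-gzipped files:", size_pregzipped, "bytes")
print("tar.gz of plain files:      ", size_plain, "bytes")
```

The gap comes from gzip restarting with an empty history for every individually compressed file, whereas one stream over the whole archive can refer back to text it has already seen in earlier files.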