Date: Sat, 21 Jul 2007 12:35:57 +1000
From: Norberto Meijome <freebsd@meijome.net>
To: James Long <list@museum.rain.com>
Cc: freebsd-questions@freebsd.org
Subject: Re: speed of bzip2 versus gzip
Message-ID: <20070721123557.715b38f7@localhost>
In-Reply-To: <20070721012455.GA5012@ns.umpquanet.com>
References: <20070720220337.GA87174@ns.umpquanet.com> <20070721103710.1e16a319@localhost> <2BF10D44-4FB5-4F07-B515-553BC705B900@mac.com> <20070721012455.GA5012@ns.umpquanet.com>
On Fri, 20 Jul 2007 18:24:55 -0700
James Long <list@museum.rain.com> wrote:

> On Fri, Jul 20, 2007 at 05:50:20PM -0700, Chuck Swiger wrote:
> > On Jul 20, 2007, at 5:37 PM, Norberto Meijome wrote:
> >>> Is it normal for bzip2 to be significantly slower than gzip?
> >>> If not, where can I look for things that might be causing
> >>> "bzip2 --fast" to take 50-60 times longer to compress a
> >>> (sendmail log) file than gzip?
> >>
> >> i never measured it to see if it is 50-60 times slower, but yes,
> >> gzip blows bzip2 out of the water on speed. I wanted to use bzip2
> >> to compress multi-GB weblog files, but gzip beat it by miles, and
> >> bzip2 wasn't THAT much better @ compressing it to make it worth it.
> >
> > Thanks for the feedback, Norberto.
> >
> > Of course, it all depends on what your priorities are, too-- if what
> > you want is a final tarball which is being mirrored and downloaded
> > frequently, then your goal is to obtain the absolute best compression,
> > and how much CPU --best takes isn't important.
> >
> > Comparing the default (-5 compression?) of gzip to bzip2 would probably
> > be more reasonable if you care about reasonably timely compression.
>
> If I read the man page correctly, bzip2 defaults to --best, which is why
> I compared gzip to bzip2 --fast. With the 1.5G sendmail log, bzip2 --fast
> compresses to just under 10M in about 55 minutes, give or take. bzip2
> --best compresses 1.5G to 1.8M, but takes about 2.25 hours. gzip
> compresses almost as well (within 3% or so) as --fast, but does it in 1
> minute instead of 55 on a dual P-III 1.4GHz (but of course, using only
> one CPU).

I don't have the exact numbers at hand, but yes, they were definitely in
that crazy range. BTW, I always compared default bzip2 against gzip -9,
because I wanted to make gzip work harder and squeeze out some more
compression.

I ran some short tests...
Both systems are doing little beyond this simple test.

Comparison using a 249 MB Apache web log file.

The first is my laptop running FreeBSD, single CPU. The second is a server
with the same hardware I used to compress those multi-GB log files in
2005; this one is running CentOS, 64-bit. I know, not FreeBSD, but I
wanted to see if the OS makes a difference...

Both boxes have enough RAM to hold the whole file in memory.

The numbers are quite similar, even given the difference in hardware... it
may speak very well of FreeBSD speeds ;)

Compression ratios are the same on both Linux and FreeBSD, and bzip2
compresses >THIS FILE< to about half the size gzip -9 manages (5.2M
versus 11M).

------------------
CPU: Intel(R) Pentium(R) M processor 2.00GHz (1995.02-MHz 686-class CPU)
1.5 GB RAM

$ uname -a
FreeBSD ayiin.octantis.com.au 6.2-STABLE FreeBSD 6.2-STABLE #12: Fri Jul 13 17:45:09 EST 2007 root@ayiin.octantis.com.au:/usr/obj/usr/src/sys/AYIIN i386

$ time gzip -9 20070604-desktop.log

real    0m13.373s
user    0m10.398s
sys     0m0.257s

[betom@ayiin] [Sat Jul 21 12:27:14 2007] /usr/home/betom/Desktop
$ ls -lh 20070604-desktop.log.gz
-rw-r--r-- 1 betom betom 11M Jul 21 12:17 20070604-desktop.log.gz

$ time gunzip ./20070604-desktop.log.gz

real    0m13.926s
user    0m1.455s
sys     0m0.525s

$ time bzip2 20070604-desktop.log

real    4m2.662s
user    3m21.184s
sys     0m0.321s

$ ls -lh 20070604-desktop.log.bz2
-rw-r--r-- 1 betom betom 5.2M Jul 21 12:17 20070604-desktop.log.bz2

$ time bunzip2 20070604-desktop.log.bz2

real    0m18.650s
user    0m13.922s
sys     0m0.794s

==================================================
Box 2

# uname -a
Linux cerberus.octantis.com.au. 2.6.18-8.1.4.el5.centos.plus #1 SMP Sun May 20 10:53:21 EDT 2007 x86_64 x86_64 x86_64 GNU/Linux

CPU : 2 x
model name : AMD Opteron(tm) Processor 250
stepping : 10
cpu MHz : 2400.000
4 GB RAM

[root@cerberus] [Sat 21 Jul 2007 12:22:39 PM EST] ~
# time gzip -9 20070604-desktop.log

real    0m7.818s
user    0m7.343s
sys     0m0.332s

[root@cerberus] [Sat 21 Jul 2007 12:22:56 PM EST] ~
# ls -lh 20070604-desktop.log.gz
-rw-r--r-- 1 numard numard 11M Jul 21 12:09 20070604-desktop.log.gz

# time gunzip 20070604-desktop.log.gz

real    0m2.502s
user    0m1.049s
sys     0m1.044s

# time bzip2 20070604-desktop.log

real    3m22.587s
user    3m17.566s
sys     0m1.741s

[root@cerberus] [Sat 21 Jul 2007 12:29:19 PM EST] ~
# ls -lh 20070604-desktop.log.bz2
-rw-r--r-- 1 numard numard 5.2M Jul 21 12:09 20070604-desktop.log.bz2

# time bunzip2 20070604-desktop.log.bz2

real    0m17.544s
user    0m15.261s
sys     0m1.435s

_________________________
{Beto|Norberto|Numard} Meijome

"They redundantly repeated themselves over and over again incessantly without end ad infinitum"
  ibid.

I speak for myself, not my employer. Contents may be hot. Slippery when wet. Reading disclaimers makes you go blind. Writing them is worse. You have been Warned.
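
For anyone wanting to repeat the gzip -9 versus default bzip2 comparison above without the command-line tools, here is a minimal sketch. It assumes Python's standard gzip and bz2 modules are acceptable stand-ins for the gzip and bzip2 binaries. The helper names (measure, compare, sample_log) and the synthetic log lines are hypothetical, made up for illustration; they are not part of the tests reported above, and the timings will not match the ones from the real tools.

```python
import bz2
import gzip
import time


def measure(name, compress, decompress, data):
    """Time one compress/decompress round trip and report the sizes."""
    start = time.perf_counter()
    packed = compress(data)
    compress_secs = time.perf_counter() - start

    start = time.perf_counter()
    unpacked = decompress(packed)
    decompress_secs = time.perf_counter() - start

    if unpacked != data:
        raise ValueError(f"{name}: round trip did not reproduce the input")
    return {
        "name": name,
        "packed_bytes": len(packed),
        "ratio": len(packed) / len(data),  # compressed size / original size
        "compress_secs": compress_secs,
        "decompress_secs": decompress_secs,
    }


def compare(data):
    """Run gzip level 9 and bzip2 level 9 (bzip2's default) over the same bytes."""
    return [
        measure("gzip -9", lambda d: gzip.compress(d, compresslevel=9),
                gzip.decompress, data),
        measure("bzip2 -9", lambda d: bz2.compress(d, compresslevel=9),
                bz2.decompress, data),
    ]


def sample_log(lines=20000):
    """Build a repetitive synthetic log (hypothetical stand-in for a real file)."""
    rows = (
        f'10.0.{i % 256}.{i % 251} - - [21/Jul/2007:12:{i % 60:02d}:{i % 60:02d} +1000] '
        f'"GET /page/{i % 500} HTTP/1.1" 200 {1000 + i % 9000}\n'
        for i in range(lines)
    )
    return "".join(rows).encode("ascii")


if __name__ == "__main__":
    for row in compare(sample_log()):
        print(f"{row['name']:<9} {row['packed_bytes']:>9} bytes  "
              f"ratio {row['ratio']:.3f}  "
              f"compress {row['compress_secs']:.3f}s  "
              f"decompress {row['decompress_secs']:.3f}s")
```

To try it on a real log, replace sample_log() with the bytes of the file, for example open(path, "rb").read(). That holds the whole file in memory, which matches the "enough RAM to hold the whole file" condition of the tests above but may not suit multi-GB logs.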