Date:      Mon, 20 Aug 2007 20:58:23 +0200
From:      Max Laier <max@love2party.net>
To:        freebsd-current@freebsd.org
Cc:        Peter Jeremy <peterjeremy@optushome.com.au>, Jeff Roberson <jroberson@chesapeake.net>
Subject:   Re: Why we don't use bzip2 in sysinstall/rescue?
Message-ID:  <200708202058.39565.max@love2party.net>
In-Reply-To: <20070820103424.GG1164@turion.vk2pj.dyndns.org>
References:  <200708170939.l7H9diEk054469@lurza.secnetix.de> <20070819163934.V568@10.0.0.1> <20070820103424.GG1164@turion.vk2pj.dyndns.org>

On Monday 20 August 2007, Peter Jeremy wrote:
> On 2007-Aug-19 16:46:10 -0700, Jeff Roberson <jroberson@chesapeake.net> wrote:
> >I tried this on my 1.8ghz pentium M laptop with 5.6MB of jpg data.
> >
> >I did:
> >
> >tar cvf foo.tar foo
> >cat foo.tar >> /dev/null
> >time bzip2/gzip foo.tar
> >
> >I removed and recreated the tar each time.  The cat was to make sure
> > it was in cache, although it certainly was from the creation step
> > before.
> >
> >Anyway, the results are:
> >
> >bzip2
> >2.452u 0.026s 0:07.65 32.2% 92+3227k 5+43io 0pf+0w 1849c/6w
> >
> >gzip
> >0.539u 0.020s 0:01.75 31.4% 109+3268k 2+44io 0pf+0w 493c/3w
>
> I don't believe this is a reasonable test because:
> 1) You are measuring compression time, whilst it's decompression time
>    that is relevant to installation.
> 2) jpeg images should not be compressible and are not representative
>    of the type of data in a FreeBSD release.
>
> I've tried what I believe is a more reasonable benchmark on an
> Athlon XP-1800, running a recent 7-CURRENT using all the installation
> images in 6.2-RELEASE-i386-disk1.iso.
>
> I concatenated all the 6.2-RELEASE/*/*.?? parts into */*.tgz files as
> well as copying ports.tgz (a total of 31 files).  I also decompressed
> each file and recompressed it into a bzip2 file.  The total sizes
> were:
> */*.tbz: 237717490
> */*.tgz: 281754511
>
> Like you, I used "cat */*.t{g,b}z >/dev/null" to cache the files
> and used systat to verify that they were cached.
>
> Timing the gzcat and bzcat runs gives:
> gzcat -v */*.tgz > /dev/null  12.01s user 0.88s system 98% cpu 13.115 total
> gzcat -v */*.tgz > /dev/null  11.95s user 0.95s system 98% cpu 13.124 total
> gzcat -v */*.tgz > /dev/null  11.96s user 0.91s system 98% cpu 13.092 total
> bzcat -v */*.tbz > /dev/null  153.29s user 3.43s system 98% cpu 2:39.03 total
> bzcat -v */*.tbz > /dev/null  153.32s user 3.26s system 98% cpu 2:39.14 total
> bzcat -v */*.tbz > /dev/null  153.16s user 3.48s system 98% cpu 2:39.02 total
>
> This is nearly 13:1 slower for bzcat, with a size reduction of about
> 15%.
>
> As for the CPU vs I/O tradeoff, I believe that gzcat will be I/O bound
> whilst bzcat will be CPU bound in most situations, though I haven't
> actually verified this.
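
The two ratios quoted above can be rechecked from the figures given. This is only a sketch of the arithmetic, using the archive sizes and the first gzcat and bzcat timing lines from the quoted benchmark; the variable names are made up for illustration.

```python
# Figures copied from the quoted benchmark (Athlon XP-1800, 6.2-RELEASE images).
tbz_bytes = 237717490   # total size of the */*.tbz files
tgz_bytes = 281754511   # total size of the */*.tgz files

# First run of each tool: user CPU seconds and wall-clock seconds.
gzcat_user, gzcat_wall = 12.01, 13.115
bzcat_user, bzcat_wall = 153.29, 2 * 60 + 39.03  # "2:39.03" in seconds

# Fraction of space saved by bzip2 relative to gzip.
size_reduction = 1 - tbz_bytes / tgz_bytes

# How many times slower bzcat is than gzcat.
user_ratio = bzcat_user / gzcat_user
wall_ratio = bzcat_wall / gzcat_wall

print(f"size reduction:   {size_reduction:.1%}")
print(f"user CPU ratio:   {user_ratio:.1f}:1")
print(f"wall-clock ratio: {wall_ratio:.1f}:1")
```

The "nearly 13:1" figure matches the user CPU ratio; by wall-clock time the ratio is a little over 12:1.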

With an amd64 world (236M) tar'ed together with -czf / -cyf respectively, I
get the following input bandwidth numbers (gathered via
dd if=amd64.t{g,b}z of=/dev/stdout | {g,b}zcat > /dev/null):

hw.model: Intel(R) Pentium(R) 4 CPU 2.00GHz:
x bz   + gz
   N           Min           Max        Median           Avg        Stddev
x 13       1411432       1554463       1544446     1521445.7     50394.705
+ 13      13391136      13847045      13804007      13726781     138363.27
Difference at 95.0% confidence
        1.22053e+07 +/- 84296.2
        802.22% +/- 5.54053%
        (Student's t, pooled s = 104125)

hw.model: AMD Opteron(tm) Processor 275:
x fast.bz  + fast.gz
   N           Min           Max        Median           Avg        Stddev
x 10       3429556       3889574       3449869     3525725.6     169675.46
+ 10      41967910      46046387      45944662      45490300     1257435.8
Difference at 95.0% confidence
        4.19646e+07 +/- 843005
        1190.24% +/- 23.9101%
        (Student's t, pooled s = 897200)
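
As a cross-check, the "Difference at 95.0% confidence" lines can be reproduced from the summary rows alone with a pooled two-sample Student's t interval. This is a sketch, not ministat's actual code: the function name is hypothetical, and the critical values 2.064 (24 degrees of freedom) and 2.101 (18 degrees of freedom) are taken from a standard t table.

```python
import math

def pooled_difference(n1, mean1, sd1, n2, mean2, sd2, t_crit):
    """Return (difference, half_width, pooled_sd) for a pooled two-sample t interval."""
    dof = n1 + n2 - 2
    # Pooled standard deviation across both samples.
    pooled_sd = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / dof)
    # Half-width of the confidence interval for the difference of means.
    half_width = t_crit * pooled_sd * math.sqrt(1 / n1 + 1 / n2)
    return mean2 - mean1, half_width, pooled_sd

# Two-sided 95% critical values from a standard Student's t table (assumed inputs).
T_95 = {24: 2.064, 18: 2.101}

# N, Avg and Stddev copied from the tables above (bz first, gz second).
p4 = pooled_difference(13, 1521445.7, 50394.705, 13, 13726781, 138363.27, T_95[24])
opteron = pooled_difference(10, 3525725.6, 169675.46, 10, 45490300, 1257435.8, T_95[18])

for name, (diff, half, sd), bz_avg in (
    ("Pentium 4", p4, 1521445.7),
    ("Opteron 275", opteron, 3525725.6),
):
    print(f"{name}: {diff:.6g} +/- {half:.6g} "
          f"({diff / bz_avg:.2%} of the bz average), pooled s = {sd:.0f}")
```

The percentages are the difference expressed relative to the bz average, so 802% means gzcat consumes its input roughly nine times as fast as bzcat on the Pentium 4.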

So it seems that bzip2 will indeed be CPU bound, at least when
installing from CD.  A netinst over the Internet is a different story,
though.
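
To put the CD case in numbers, here is a rough sketch that converts the average input bandwidths from the tables above into an equivalent CD drive speed. It assumes the figures are bytes per second as reported by dd, and it uses the nominal 1x CD-ROM data rate of 150 KiB/s; real drives rarely sustain their rated speed across the whole disc, and the helper names are made up.

```python
CD_1X_BYTES_PER_SEC = 150 * 1024  # nominal 1x CD-ROM data rate (153600 bytes/s)

# Average compressed-input bandwidth in bytes/s, copied from the tables above.
avg_input_rate = {
    ("Pentium 4 2.00GHz", "bzip2"): 1521445.7,
    ("Pentium 4 2.00GHz", "gzip"): 13726781,
    ("Opteron 275", "bzip2"): 3525725.6,
    ("Opteron 275", "gzip"): 45490300,
}

def break_even_cd_speed(rate):
    """CD speed multiple at which the drive delivers data as fast as the CPU consumes it."""
    return rate / CD_1X_BYTES_PER_SEC

def bottleneck(rate, drive_speed_x):
    """Return "CPU" if decompression is slower than the drive, else "I/O"."""
    return "CPU" if rate < drive_speed_x * CD_1X_BYTES_PER_SEC else "I/O"

for (cpu, tool), rate in avg_input_rate.items():
    # 24x is a hypothetical sustained drive speed chosen for illustration.
    print(f"{cpu:18s} {tool:5s} break-even at {break_even_cd_speed(rate):6.1f}x, "
          f"{bottleneck(rate, 24)} bound on a 24x drive")
```

Under these assumptions the Pentium 4 decompresses bzip2 input at roughly the rate of a 10x drive, so any faster drive leaves the CPU as the bottleneck, while gzip on either machine outruns any CD drive.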

-- 
/"\  Best regards,                      | mlaier@freebsd.org
\ /  Max Laier                          | ICQ #67774661
 X   http://pf4freebsd.love2party.net/  | mlaier@EFnet
/ \  ASCII Ribbon Campaign              | Against HTML Mail and News
