From: Tim Kientzle
Date: Thu, 12 Oct 2006 00:26:02 -0700
To: freebsd-hackers@freebsd.org
Subject: Re: "tar -c|gzip" faster than "tar -cz"?!?
Message-ID: <452DEE0A.4060500@freebsd.org>
In-Reply-To: <200610101727.k9AHRrYo039774@lurza.secnetix.de>

Oliver Fromme wrote:
>
> While doing some performance tuning of a backup script
> I noticed that the -z option of our (bsd)tar behaves in
> a very suboptimal way.  It's not only a lot slower than
> using gzip separately, it also compresses worse.

It seems that you and others have seen very different
performance.  I'd be very interested in knowing why.
I suspect it may have to do with average file size.
How big are the files you're archiving?  Does the
relative performance differ with larger or smaller files?

Right now, libarchive calls the libz compression function
for each small piece of data.  I think that it might be
possible to make it faster by combining blocks of data
to make fewer calls to the compression routines in libz.
(This is why I think the size of the files might matter;
small files result in more calls to libz with small
blocks of data.)

I am very surprised that you see different sizes of output.
There are small differences between the compression code
in libz and gzip, but I've only ever seen very trivial
size differences because of that.

Tim Kientzle
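
The idea of combining blocks to make fewer calls into libz can be
sketched as follows.  This is only an illustration using Python's
standard zlib module, not libarchive's actual code: the class name
CoalescingCompressor, the 64 KB block size, and the `calls` counter
are all assumptions made for the example, and the output is a
zlib-wrapped stream rather than a gzip file.

```python
import zlib


class CoalescingCompressor:
    """Hypothetical helper: buffer small writes, compress in larger blocks."""

    def __init__(self, block_size=64 * 1024, level=6):
        self._z = zlib.compressobj(level)
        self._pending = bytearray()
        self._block_size = block_size  # assumed value, not taken from libarchive
        self.calls = 0                 # number of calls made into zlib

    def write(self, data):
        """Queue data; compress only once a full block has accumulated."""
        self._pending += data
        out = []
        while len(self._pending) >= self._block_size:
            block = bytes(self._pending[:self._block_size])
            del self._pending[:self._block_size]
            out.append(self._z.compress(block))
            self.calls += 1
        return b"".join(out)

    def finish(self):
        """Compress whatever is left and terminate the stream."""
        out = b""
        if self._pending:
            out = self._z.compress(bytes(self._pending))
            self.calls += 1
            self._pending.clear()
        return out + self._z.flush()


# Many small pieces, as an archiver might produce for small files.
pieces = [b"small piece of file data %d\n" % i for i in range(10000)]

c = CoalescingCompressor()
compressed = b"".join(c.write(p) for p in pieces) + c.finish()

print(len(pieces), "writes turned into", c.calls, "calls to zlib")
```

Whether this makes any measurable difference for tar is exactly the
open question above; the sketch only shows that the number of calls
into the compressor can be decoupled from the number of small writes
without changing the data that comes back out.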