From: Wojciech Puchar
To: Kurt Lidl
Cc: Brandon Falk, freebsd-hackers@freebsd.org
Date: Wed, 10 Oct 2012 22:46:11 +0200 (CEST)
Subject: Re: SMP Version of tar

> Tim is correct in that gzip datastream allows for concatenation of
> compressed blocks of data, so you might break the input stream into
> a bunch of blocks [A, B, C, etc], and then can append those together
> into [A.gz, B.gz, C.gz, etc], and when uncompressed, you will get
> the original input stream.
> I think that Wojciech's point is that the compressed data stream
> for the single datastream is different than the compressed data
> stream of [A.gz, B.gz, C.gz, etc]. Both will decompress to the same
> thing, but the intermediate compressed representation will be different.

So, after your response it is clear that a .tar.gz generated in parallel
will differ from the single-stream output and will compress slightly
worse (by an amount that can be ignored), and that it WILL be compatible
with standard gzip, since gzip can decompress a file made of multiple
concatenated streams, which I wasn't aware of. That's good.

At the same time, parallel tar will go back to a single thread when
unpacking a standard .tar.gz. Not a big deal, as gzip decompression is
ultrafast and I/O is usually the limit.
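To make the point above concrete, here is a minimal sketch in Python. It is
not how tar or any existing parallel gzip tool is implemented; the block size
and the function names split_blocks and parallel_gzip are made up for
illustration. It cuts the input into blocks, compresses each block as its own
gzip member in a thread pool, concatenates the members, and checks that a
standard gzip decompressor gives back the original stream.

```python
import gzip
from concurrent.futures import ThreadPoolExecutor

BLOCK_SIZE = 64 * 1024  # hypothetical block size, chosen only for this demo


def split_blocks(data, size=BLOCK_SIZE):
    """Cut the input stream into fixed-size blocks [A, B, C, ...]."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def parallel_gzip(data, workers=4):
    """Compress each block as its own gzip member and concatenate them."""
    blocks = split_blocks(data)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in input order, so A.gz, B.gz, C.gz
        # stay in sequence even if they finish out of order.
        members = list(pool.map(gzip.compress, blocks))
    return b"".join(members)


data = b"some fairly repetitive tar-like payload\n" * 20000

single = gzip.compress(data)   # one gzip stream for the whole input
multi = parallel_gzip(data)    # [A.gz, B.gz, C.gz, ...] appended together

# Both decompress to the same thing with an ordinary gzip decompressor...
roundtrip_ok = gzip.decompress(multi) == data
# ...but the compressed bytes themselves are not the same.
same_bytes = single == multi

print("round trip ok:", roundtrip_ok)
print("identical compressed bytes:", same_bytes)
print("sizes (single, multi):", len(single), len(multi))
```

The multi-member output is normally somewhat larger, because every member
carries its own header and trailer and starts with an empty dictionary. On the
command line the same property is what makes "cat A.gz B.gz C.gz | gzip -d"
work.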