Date: Wed, 2 Aug 2006 11:54:46 -0500
From: "Rick C. Petty"
To: Peter Jeremy
Cc: freebsd-hackers@freebsd.org
Subject: Re: [PATCH] adding two new options to 'cp'
Message-ID: <20060802165446.GD13312@megan.kiwi-computer.com>
In-Reply-To: <20060802073340.GA713@turion.vk2pj.dyndns.org>

On Wed, Aug 02, 2006 at 05:33:40PM +1000, Peter Jeremy wrote:
> On Tue, 2006-Aug-01 23:35:03 -0700, Tim Kientzle wrote:
> >The "cheap" solution is to handle it purely on
> >extract: Detect blocks of zeros when restoring
> >files and seek over them.
>
> The downside is that you wind up with a sparse file whether or not you
> wanted one.

No, you wind up with a sparse file if you specify the "-S" option to tar.
If you didn't want one, don't specify the option.

> Actually, the only real solution to copying sparse files is to add
> a system call that can return a map of holes. This would neatly
> address the "needs two passes" problem with tar.

You don't need two passes-- you advance the file pointer whenever you
detect a block of zeros. Note: care must be taken to only do this for
newly-created or otherwise truncated files, otherwise the skip ahead
wouldn't work.

> As a general comment (not addressed to Tim): There _is_ a downside
> to sparsifying files. If you take a sparse file and start filling
> in the holes, the net result will be very badly fragmented and hence
> have very poor sequential I/O performance. If you're never going to
> update a file then making it sparse makes sense, if you will be
> updating it, you will get better performance by making it non-sparse.

Agreed, with a minor correction/elaboration. By "updating" you mean
specifically "updating but not appending". And another note: a good way
to defragment a file is to sequentially copy it. (The "best" way is to
copy the file to a new filesystem, that way you guarantee the blocks
allocated to the file are contiguous.)

-- 
Rick C. Petty
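
The single-pass approach described above (advance the file pointer whenever a block of zeros is detected) can be sketched roughly as follows. This is only an illustration in Python, not the code from the cp patch or from tar; the function name sparse_copy and the 64 KiB block size are assumptions made for the example. The destination is opened with truncation, which is the precondition noted above: a skipped region only reads back as zeros if nothing was there before.

```python
import os

# Hypothetical chunk size chosen for this sketch; it does not come from the thread.
BLOCK_SIZE = 64 * 1024


def sparse_copy(src_path, dst_path, block_size=BLOCK_SIZE):
    """Copy src to a freshly truncated dst in one pass, seeking over zero blocks."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        # "wb" creates or truncates dst, so skipped ranges read back as zeros.
        while True:
            chunk = src.read(block_size)
            if not chunk:
                break
            if chunk.count(0) == len(chunk):
                # All-zero block: move the file pointer forward instead of writing.
                dst.seek(len(chunk), os.SEEK_CUR)
            else:
                dst.write(chunk)
        # If the file ends in zeros, nothing was written after the last seek,
        # so set the final length explicitly.
        dst.truncate(dst.tell())
```

Whether the skipped ranges actually become holes on disk depends on the filesystem; the copy is byte-for-byte identical either way. A caller would simply run sparse_copy("disk.img", "disk-copy.img"), where both file names are placeholders.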