From: Tim Kientzle
Reply-To: kientzle@acm.org
Date: Wed, 14 Jan 2004 19:39:51 -0800
To: Vladimir Dozen
cc: freebsd-arch@freebsd.org
Subject: Re: Request for Comments: libarchive, bsdtar
Message-ID: <40060B87.20906@acm.org>
In-Reply-To: <4005FAB4.7070304@mail.ru>

Vladimir Dozen wrote:
> > so I want to make sure it's well-tested
>
> BTW, how do you perform testing of the library/tar?

I've used automated tests to verify a few of the trickier
routines, such as exercising boundary conditions in the
formatting and parsing logic.

There are a number of built-in logic tests in the code.
Most notably, each public function starts with a call to
"archive_check_magic", which verifies that the provided
archive structure is in the correct state.

I've also been collecting sample test archives to verify
correct operation. Joerg Schilling's collection of test
files has been very helpful.

I haven't yet had a chance to automate the full-program
tests, though. I have a few ideas about how to proceed and
what needs testing, but haven't yet pieced anything together.

I also plan to use dmalloc (or something similar) to test
for memory leaks and invalid heap operations. I used some
crude, home-grown routines early in development to verify
memory usage but haven't had a chance to do more systematic
testing.

I have tested enough to know that:
  * libarchive correctly archives 64k pathnames
  * performance is comparable to gtar overall
  * When reading/writing compressed archives, zlib/bzlib
    are the performance bottlenecks. bzlib, in particular,
    seems very sensitive to the size of blocks you feed it;
    some work to compress/decompress larger blocks would
    be helpful
  * When writing non-compressed archives, getpwent/getgrent
    calls in bsdtar are the most obvious performance issue
    (about 10% of the CPU time for tar -cf /dev/null).
    I'm considering a simple LRU cache of uname/gname
    lookups to address this.

> >> - I would prefer it if compression was done by opening a pipe to
> >> gzip/bzip2 instead of using libz/libbz2.
> ...
> The right way to avoid code duplication between gzip and libarchive
> is to use common libgzip. The same applies to bzip and compress.

Yes, I use zlib and bzlib to handle compression. This handles
essentially everything except for the relatively minor task of
generating/verifying the gzip header (newer versions of zlib
handle even that, but the version we currently have in the
tree does not).

I would love to see a library version of compress(1) with
an API similar to that of zlib/bzlib.

Tim
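
For illustration only, here is a rough sketch of what a simple LRU
cache of uname/gname lookups, as mentioned in the list above, might
look like. This is not bsdtar's code. Every name in it
(name_cache, name_cache_get, demo_lookup, the slot count) is
hypothetical, and demo_lookup is a stand-in for a real
getpwuid()/getgrgid()-style lookup so the sketch compiles anywhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define NAME_CACHE_SLOTS 4	/* hypothetical; tiny for illustration */
#define NAME_MAX_LEN 32		/* hypothetical name length limit */

/* Hypothetical callback: resolve id to a name; return 0 on success. */
typedef int (*name_lookup_fn)(unsigned long id, char *buf, size_t buflen);

struct name_cache_entry {
	unsigned long id;
	unsigned long last_used;	/* tick of last use; 0 = empty slot */
	char name[NAME_MAX_LEN];
};

struct name_cache {
	struct name_cache_entry slots[NAME_CACHE_SLOTS];
	unsigned long tick;		/* incremented on every request */
	unsigned long hits, misses;
	name_lookup_fn lookup;
};

static void
name_cache_init(struct name_cache *c, name_lookup_fn lookup)
{
	assert(c != NULL && lookup != NULL);
	memset(c, 0, sizeof(*c));
	c->lookup = lookup;
}

/* Return the (possibly cached) name for id, or NULL if lookup fails. */
static const char *
name_cache_get(struct name_cache *c, unsigned long id)
{
	struct name_cache_entry *victim = &c->slots[0];
	size_t i;

	c->tick++;
	for (i = 0; i < NAME_CACHE_SLOTS; i++) {
		struct name_cache_entry *e = &c->slots[i];
		if (e->last_used != 0 && e->id == id) {
			e->last_used = c->tick;	/* refresh on hit */
			c->hits++;
			return (e->name);
		}
		/* Track the least recently used (or empty) slot. */
		if (e->last_used < victim->last_used)
			victim = e;
	}
	c->misses++;
	if (c->lookup(id, victim->name, sizeof(victim->name)) != 0) {
		victim->last_used = 0;	/* buffer was clobbered; mark empty */
		return (NULL);
	}
	victim->id = id;
	victim->last_used = c->tick;
	return (victim->name);
}

/* Stand-in for a real passwd/group lookup; fabricates "user<id>". */
static int
demo_lookup(unsigned long id, char *buf, size_t buflen)
{
	int n = snprintf(buf, buflen, "user%lu", id);
	return ((n < 0 || (size_t)n >= buflen) ? -1 : 0);
}
```

A caller would set this up once with name_cache_init(&cache,
demo_lookup) and then call name_cache_get(&cache, uid) for each
entry. A linear scan over a handful of slots is assumed to be enough
here, since a typical tree has few distinct owners; a larger cache
would want a hash table instead.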