From owner-freebsd-current@FreeBSD.ORG Wed Dec 31 23:34:28 2008
Return-Path:
Delivered-To: freebsd-current@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 6E5A3106564A;
	Wed, 31 Dec 2008 23:34:28 +0000 (UTC)
	(envelope-from kientzle@freebsd.org)
Received: from kientzle.com (kientzle.com [66.166.149.50])
	by mx1.freebsd.org (Postfix) with ESMTP id 21AD58FC0C;
	Wed, 31 Dec 2008 23:34:27 +0000 (UTC)
	(envelope-from kientzle@freebsd.org)
Received: from [10.123.2.23] (p53.kientzle.com [66.166.149.53])
	by kientzle.com (8.12.9/8.12.9) with ESMTP id mBVNYQtv004430;
	Wed, 31 Dec 2008 15:34:26 -0800 (PST)
	(envelope-from kientzle@freebsd.org)
Message-ID: <495C017F.9000408@freebsd.org>
Date: Wed, 31 Dec 2008 15:34:23 -0800
From: Tim Kientzle
User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.7.12) Gecko/20060422
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: "Simon L. Nielsen"
References: <200812301616.11132.max@love2party.net>
	<200812302213.07155.max@love2party.net>
	<20081231094159.GA964@zaphod.nitro.dk>
	<200812311109.57955.max@love2party.net>
	<20081231105952.GB964@zaphod.nitro.dk>
In-Reply-To: <20081231105952.GB964@zaphod.nitro.dk>
Content-Type: multipart/mixed; boundary="------------010107030804000001000003"
Cc: Max Laier , Tim Kientzle , cperciva@freebsd.org, freebsd-current@freebsd.org
Subject: Re: tar/libarchive gzip problem [was: portsnap corrupted]
X-BeenThere: freebsd-current@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Discussions about the use of FreeBSD-current
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Wed, 31 Dec 2008 23:34:28 -0000

This is a multi-part message in MIME format.
--------------010107030804000001000003
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit

I think I know what this is.

I tried recently to add support for concatenated gzip files.
This involves looking ahead for another gzip header after the
end of the compressed data. The current version screws this up
and ends up trying to match the file's trailing CRC as a gzip
header. About 1 in 256 files will have a CRC whose first byte
matches the first header byte, which triggers the subsequent
meltdown.

This also explains why neither of us saw it in testing.
(I knew the code didn't actually work for concatenated gzip
files but didn't realize it would break decoding of some
regular non-concatenated files.)

The attached patch simply disables this additional header
check, which should fix the immediate problem. Please try it
and let me know.

Apologies,

Tim


Simon L. Nielsen wrote:
> Hey Tim,
>
> I think one of the recent changes to tar or libarchive broke gzip
> handling in some cases. See more below.
>
> [portsnap extract fails with gzip error]
>
> I'm not sure why I didn't run into it in my tests, but I think the
> problem is in tar / libarchive's handling of gzip files. Taking one
> random "broken" file [1] it fails with tar's built-in decompression,
> but works using external zcat.
>
> [1] http://portsnap1.freebsd.org/f/19faa3b8bd15bb8f4cd9f665a7623887729f3bd834d780e8b069df979f228e8d.gz
>     http://people.freebsd.org/~simon/tmp/19faa3b8bd15bb8f4cd9f665a7623887729f3bd834d780e8b069df979f228e8d.gz
>
>
> Broken system:
>
> [simon@eddie:/tmp] tar --version
> bsdtar 2.5.903a - libarchive 2.5.903a
> [simon@eddie:/tmp] uname -a
> FreeBSD eddie.nitro.dk 8.0-CURRENT FreeBSD 8.0-CURRENT #1: Tue Dec 30 22:28:33 CET 2008 simon@eddie.nitro.dk:/FreeBSD/obj/FreeBSD/system-CURRENT/sys/EDDIE i386
> [simon@eddie:/tmp] tar tvf 19faa3b8bd15bb8f4cd9f665a7623887729f3bd834d780e8b069df979f228e8d.gz > /dev/null
> tar: Error opening archive: Invalid GZip header (saw 99 at offset 1)
> [simon@eddie:/tmp] zcat 19faa3b8bd15bb8f4cd9f665a7623887729f3bd834d780e8b069df979f228e8d.gz | tar tf - > /dev/null
> [simon@eddie:/tmp] zcat 19faa3b8bd15bb8f4cd9f665a7623887729f3bd834d780e8b069df979f228e8d.gz | sha256
> 19faa3b8bd15bb8f4cd9f665a7623887729f3bd834d780e8b069df979f228e8d
>
>
> OK system:
>
> [simon@benji:/tmp] tar --version
> bsdtar 2.5.5 - libarchive 2.5.5
> [simon@benji:/tmp] uname -a
> FreeBSD benji.s 7.1-RC2 FreeBSD 7.1-RC2 #0: Tue Dec 23 15:18:30 UTC 2008 root@logan.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC i386
> [simon@benji:/tmp] tar tvf 19faa3b8bd15bb8f4cd9f665a7623887729f3bd834d780e8b069df979f228e8d.gz > /dev/null
> [simon@benji:/tmp] zcat 19faa3b8bd15bb8f4cd9f665a7623887729f3bd834d780e8b069df979f228e8d.gz | sha256
> 19faa3b8bd15bb8f4cd9f665a7623887729f3bd834d780e8b069df979f228e8d
>
>
> Another OK system:
>
> [simon@ref8-amd64:files] tar --version
> bsdtar 2.5.5 - libarchive 2.5.5
> [simon@ref8-amd64:files] uname -a
> FreeBSD ref8-amd64.freebsd.org 8.0-CURRENT FreeBSD 8.0-CURRENT #2 r184542:185402: Fri Nov 28 19:14:40 UTC 2008 peter@ref8-amd64.freebsd.org:/scratch/src/sys/amd64/compile/REF8-AMD64 amd64
> [simon@ref8-amd64:files] tar tvf 19faa3b8bd15bb8f4cd9f665a7623887729f3bd834d780e8b069df979f228e8d.gz > /dev/null
> [simon@ref8-amd64:files] zcat 19faa3b8bd15bb8f4cd9f665a7623887729f3bd834d780e8b069df979f228e8d.gz | sha256
> 19faa3b8bd15bb8f4cd9f665a7623887729f3bd834d780e8b069df979f228e8d
>

--------------010107030804000001000003
Content-Type: text/x-patch; name="gzip_decompression.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline; filename="gzip_decompression.patch"

Index: archive_read_support_compression_gzip.c
===================================================================
--- archive_read_support_compression_gzip.c	(revision 185679)
+++ archive_read_support_compression_gzip.c	(working copy)
@@ -428,8 +428,7 @@
 				    "Failed to clean up gzip decompressor");
 				return (ARCHIVE_FATAL);
 			}
-			/* Restart header parser with the next block. */
-			state->header_state = state->header_done = 0;
+			state->eof = 1;
 			/* FALL THROUGH */
 		case Z_OK: /* Decompressor made some progress. */
 			/* If we filled our buffer, update stats and return. */

--------------010107030804000001000003--
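
For readers who want to see the failure mode from the message above in isolation, here is a minimal C sketch. It is not libarchive's code: encode_gzip_trailer() and probe_gzip_header() are hypothetical helpers written only for illustration. It relies on two facts from RFC 1952: a gzip member begins with the bytes 0x1f 0x8b (followed by 0x08 for deflate), and it ends with an 8-byte trailer holding CRC32 and then ISIZE, both little-endian. If a look-ahead for a following member lands on the trailer, a CRC whose low byte happens to be 0x1f (roughly 1 in 256 for an evenly distributed CRC) passes the first-byte check and then fails on the next byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Outcome of probing a buffer for the start of a gzip member. */
enum probe_result {
	PROBE_NO_HEADER,	/* First byte is not 0x1f: nothing follows. */
	PROBE_HEADER,		/* All three fixed header bytes match. */
	PROBE_BAD_HEADER	/* First byte matched, a later one did not. */
};

/* Fixed leading bytes of a gzip member (RFC 1952): ID1, ID2, CM. */
static const uint8_t gzip_magic[3] = { 0x1f, 0x8b, 0x08 };

/*
 * Hypothetical helper: write the 8-byte gzip trailer, CRC32 first and
 * ISIZE second, each in little-endian byte order.
 */
static void
encode_gzip_trailer(uint32_t crc32, uint32_t isize, uint8_t out[8])
{
	int i;

	for (i = 0; i < 4; i++) {
		out[i] = (uint8_t)(crc32 >> (8 * i));
		out[4 + i] = (uint8_t)(isize >> (8 * i));
	}
}

/*
 * Hypothetical helper: decide whether `p` looks like the start of a
 * gzip member. On PROBE_BAD_HEADER, *bad_offset receives the offset
 * of the first byte that did not match.
 */
static enum probe_result
probe_gzip_header(const uint8_t *p, size_t len, size_t *bad_offset)
{
	size_t i;

	assert(p != NULL && bad_offset != NULL);
	if (len < sizeof(gzip_magic) || p[0] != gzip_magic[0])
		return (PROBE_NO_HEADER);
	for (i = 1; i < sizeof(gzip_magic); i++) {
		if (p[i] != gzip_magic[i]) {
			*bad_offset = i;
			return (PROBE_BAD_HEADER);
		}
	}
	return (PROBE_HEADER);
}
```

As an illustration with a made-up CRC, encoding 0xdead631f and probing the resulting trailer yields PROBE_BAD_HEADER with bad_offset set to 1, and the byte at that offset is 0x63 (decimal 99), which lines up with the "saw 99 at offset 1" error in the quoted report. A CRC whose low byte is anything other than 0x1f yields PROBE_NO_HEADER, which would explain why most files decode without trouble.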