Date: Sat, 24 Apr 2004 11:55:24 +1000
From: Tim Robbins
To: Peter Losher
cc: freebsd-amd64@freebsd.org
Subject: Re: Issues w/ gzip for > 2GB files?
Message-ID: <20040424015524.GA37337@cat.robbins.dropbear.id.au>
List-Id: Porting FreeBSD to the AMD64 platform

[Context lost -- I was not subscribed to the list at the time.]

I can reproduce this, but the error that I get is slightly different:

zcat: bigfile.gz: invalid compressed data--length error

This is a bug in gzip. By checking the code, "length error" means that
the file did not expand to its original size *modulo 2^32* (see RFC 1952
section 2.3.1 "ISIZE"). gzip does not explicitly compute the remainder
mod. 2^32; instead it computes it implicitly by storing the size in an
"unsigned long", which it presumes is 32 bits wide.
But since "long" is 64-bit on amd64, the modulo is not performed, so the
extracted file size != (original file size mod 2^32) for files larger
than 4 GB. The problem you encountered may be another instance of this
same basic incorrect assumption.

Try this patch and let me know how things go.

--- gnu/usr.bin/gzip/gzip.h.orig	Sat Apr 24 11:51:16 2004
+++ gnu/usr.bin/gzip/gzip.h	Sat Apr 24 11:40:43 2004
@@ -41,9 +41,11 @@
 
 #define local static
 
-typedef unsigned char uch;
-typedef unsigned short ush;
-typedef unsigned long ulg;
+#include 
+
+typedef uint8_t uch;
+typedef uint16_t ush;
+typedef uint32_t ulg;
 
 /* Return codes from gzip */
 #define OK 0

In any case, the bug ought to be reported to the gzip developers.

Tim