From owner-freebsd-hackers@FreeBSD.ORG Sat Sep 18 04:01:09 2010
Date: Sat, 18 Sep 2010 00:01:04 -0400 (EDT)
From: Benjamin Kaduk
To: kientzle@freebsd.org, kaiw@freebsd.org
Cc: freebsd-hackers@freebsd.org, Jilles Tjoelker
In-Reply-To: <20100829201050.GA60715@stack.nl>
References: <20100829201050.GA60715@stack.nl>
Subject: Re: ar(1) format_decimal failure is fatal?

On Sun, 29 Aug 2010, Jilles Tjoelker wrote:

> On Sat, Aug 28, 2010 at 07:08:34PM -0400, Benjamin Kaduk wrote:
>> [...]
>> building static egacy library
>> ar: fatal: Numeric user ID too large
>> *** Error code 70
>
>> This error appears to be coming from
>> lib/libarchive/archive_write_set_format_ar.c , which seems to only have
>> provisions for outputting a user ID in AR_uid_size = 6 columns.
>> [...]
>> It looks like this macro was so defined in version 1.1 of that file, with
>> commit message "'ar' format support for libarchive, contributed by Kai
>> Wang.". This doesn't make it terribly clear whether the 'ar' format
>> mandates this length, or if it is an implementation decision, so I get to
>> ask: what reasoning (if any) was behind this choice? Would anything break
>> if it was bumped up to a larger size? Are there other options for a
>> workaround in my AFS environment?
>
> I wonder if the uid/gid fields are useful at all for ar archives. Ar
> archives are usually not extracted, and when they are, the current
> user's values seem good enough. The uid/gid also prevent exactly
> reproducible builds (together with the timestamp).

GNU binutils has recently (well, March 2009) added a -D
("deterministic") argument to ar(1) which sets the timestamp, uid, and
gid to zero, and the mode to 644. If that argument is not given,
Linux's ar(1) happily uses my 8-digit uid as-is; the manual page seems
to imply that it will handle 15 or 16 digits in that field.
Solaris' ar(1) caps large uids to 600001.
On OS X, the value is wrapped modulo 2^n for some n less than 26,
showing up in the archive as 271 (33554703 = 271 + 2^25).
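
As a quick check of the figures above, here is a small sketch in C. It
takes 33554703 as the uid (as the arithmetic above suggests); the helper
names decimal_digits() and wrap_pow2() are made up for illustration and
do not come from any ar(1) implementation.

```c
#include <stdint.h>

/* Number of decimal digits needed to print a non-negative value. */
static int
decimal_digits(uint64_t v)
{
	int n = 1;

	while (v >= 10) {
		v /= 10;
		n++;
	}
	return (n);
}

/* Reduce v modulo 2^bits (bits < 64), i.e. keep only the low bits. */
static uint64_t
wrap_pow2(uint64_t v, unsigned bits)
{
	return (v & ((UINT64_C(1) << bits) - 1));
}
```

decimal_digits(33554703) is 8, which does not fit in a 6-column field.
wrap_pow2(33554703, bits) is 271 for every width from 9 to 25 bits, so
the value seen on OS X bounds the wrap width but does not pin it down.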

In no cases that I tried was a large uid a fatal error; I'm not really
convinced that it should be fatal for FreeBSD.

Poking at the source, it seems this stems from usr.bin/ar/write.c's use
of the AC() macro, defined in ar.h:

    #define AC(CALL) do {                                   \
            if ((CALL))                                     \
                    bsdar_errc(bsdar, EX_SOFTWARE, 0, "%s", \
                        archive_error_string(a));           \
    } while (0)

archive_write_header() is always called within this macro, and the
relevant implementation (archive_write_ar_header() in
libarchive/archive_write_set_format_ar.c) immediately returns
ARCHIVE_WARN if the format_decimal() call fails. Since AC() treats any
nonzero return as an error, that warning ends up being fatal.

Other places in the libarchive code actually use the distinction between
ARCHIVE_OK, ARCHIVE_WARN, and ARCHIVE_FATAL (and friends); I think that
it would be pretty easy to modify format_decimal() (and probably its
cousins) to use that convention instead of just -1 and 0. It already
does a reasonable thing in the case of overflow (write the maximum
value); it just does not distinguish between the different possible
errors.

I propose that format_{decimal,octal}() return ARCHIVE_FAILED for
negative input, and ARCHIVE_WARN for overflow.
archive_write_ar_header() can then catch ARCHIVE_WARN from the
format_foo functions and continue on, propagating the ARCHIVE_WARN
return value at the end of its execution instead of bailing immediately.
ar/write.c would also need to be changed, calling archive_write_header()
without the AC() macro and dealing with the ARCHIVE_WARN return value
case, presumably by writing archive_error_string(a) to stderr and
continuing.

Would (one of) you be willing to review a patch to that effect?

Thanks,

Ben Kaduk
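
For concreteness, a minimal sketch of the proposed return convention
follows. It is not the libarchive code and not a patch: the names
sk_format_decimal() and sk_write_ids() are hypothetical, the SK_*
constants are local stand-ins for ARCHIVE_OK, ARCHIVE_WARN, and
ARCHIVE_FAILED so that the fragment compiles without libarchive headers,
and the 6-column width is taken from the AR_uid_size value quoted above.

```c
#include <stdint.h>
#include <string.h>

/* Stand-ins for libarchive's ARCHIVE_* codes (assumption, see above). */
enum { SK_OK = 0, SK_WARN = -20, SK_FAILED = -25 };

/*
 * Hypothetical variant of format_decimal(): write v left-justified and
 * space-padded into the s-byte field at p (no NUL terminator).
 */
static int
sk_format_decimal(int64_t v, char *p, int s)
{
	char buf[20];	/* an int64_t has at most 19 decimal digits */
	int n = 0;

	if (v < 0) {
		/* Negative values are meaningless in an ar header. */
		memset(p, '0', (size_t)s);
		return (SK_FAILED);
	}
	/* Collect digits, least significant first. */
	do {
		buf[n++] = (char)('0' + (v % 10));
		v /= 10;
	} while (v > 0);

	if (n > s) {
		/* Overflow: fill with the maximum value and only warn. */
		memset(p, '9', (size_t)s);
		return (SK_WARN);
	}
	for (int i = 0; i < n; i++)
		p[i] = buf[n - 1 - i];
	memset(p + n, ' ', (size_t)(s - n));
	return (SK_OK);
}

/*
 * Hypothetical caller: format uid and gid into 6-column fields, keep
 * going after a warning, and report the worst result at the end.
 */
static int
sk_write_ids(int64_t uid, int64_t gid, char *uid_field, char *gid_field)
{
	int ret = SK_OK, r;

	r = sk_format_decimal(uid, uid_field, 6);
	if (r == SK_FAILED)
		return (r);
	if (r < ret)
		ret = r;
	r = sk_format_decimal(gid, gid_field, 6);
	if (r == SK_FAILED)
		return (r);
	if (r < ret)
		ret = r;
	return (ret);
}
```

With this shape, a caller such as ar/write.c could print the error
string and move on when it sees the warning value, and exit only on the
more serious codes.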