From owner-freebsd-current Thu Oct 8 22:25:24 1998
Return-Path:
Received: (from majordom@localhost)
          by hub.freebsd.org (8.8.8/8.8.8) id WAA24725
          for freebsd-current-outgoing; Thu, 8 Oct 1998 22:25:24 -0700 (PDT)
          (envelope-from owner-freebsd-current@FreeBSD.ORG)
Received: from merlin.camalott.com (merlin.camalott.com [208.229.74.19])
          by hub.freebsd.org (8.8.8/8.8.8) with ESMTP id WAA24715
          for ; Thu, 8 Oct 1998 22:25:21 -0700 (PDT)
          (envelope-from joelh@gnu.org)
Received: from detlev.UUCP (tex-94.camalott.com [208.229.74.94])
          by merlin.camalott.com (8.8.7/8.8.7) with ESMTP id AAA27306;
          Fri, 9 Oct 1998 00:27:00 -0500
Received: (from joelh@localhost)
          by detlev.UUCP (8.9.1/8.9.1) id AAA10983;
          Fri, 9 Oct 1998 00:23:03 -0500 (CDT)
          (envelope-from joelh)
To: Mike Smith
Subject: Re: BETA problems
References: <199810090125.SAA02069@dingo.cdrom.com>
From: Joel Ray Holveck
Cc: freebsd-current@FreeBSD.ORG
Date: 09 Oct 1998 00:22:59 -0500
In-Reply-To: Mike Smith's message of "Thu, 08 Oct 1998 18:25:24 -0700"
Message-ID: <86u31efgwc.fsf@detlev.UUCP>
Lines: 57
X-Mailer: Gnus v5.5/Emacs 20.2
Sender: owner-freebsd-current@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

Mike Smith writes:

>>>>> (One suggestion: If an install fails due to a network failure,
>>>>> restart at the block instead of at the distribution.)
>>>> Owing to the way the blocks are compressed, you can't do this.
>>>> You have to start at the beginning.
>>> I haven't studied this closely, but Gary said that it's because the
>>> entire segment is a single gzipped stream, and that doing so would
>>> cause a restart midstream.
>> I just glanced over the zlib sources, and started wondering.  Would
>> it be possible to copy the state of the decompression engine, in
>> anticipation of failure?  How expensive would it be?
> We don't have access to that state, because tar forks copy of gzip to
> do its work.  If we were doing the decompression internally, I think
> it would be reasonably straightforward.

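The in-process variant Mike mentions can be sketched roughly as below. This is only an illustration in Python, assuming the installer drives zlib itself instead of forking gzip; `decompress_blocks`, `fetch_block`, and `max_retries` are hypothetical names and not anything in sysinstall. The one library feature it relies on is `copy()` on a zlib decompression object, which snapshots the engine before each block so that a block that dies halfway can be retried from the snapshot.

```python
import zlib


def decompress_blocks(fetch_block, n_blocks, max_retries=3):
    """Decompress one gzip stream that arrives as n_blocks separate blocks.

    fetch_block(i) is a hypothetical callable that yields the chunks of
    block i and may raise OSError partway through (a network failure).
    """
    # 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
    engine = zlib.decompressobj(16 + zlib.MAX_WBITS)
    out = []
    for i in range(n_blocks):
        for _attempt in range(max_retries):
            checkpoint = engine.copy()   # engine state before this block
            produced = []
            try:
                for chunk in fetch_block(i):
                    produced.append(engine.decompress(chunk))
            except OSError:
                engine = checkpoint      # roll back, drop the partial output
                continue
            out.extend(produced)
            break
        else:
            raise OSError("block %d failed %d times" % (i, max_retries))
    out.append(engine.flush())
    return b"".join(out)
```

The partial output of a failed block is thrown away along with the engine state, so only the failed block is fetched again, not the whole distribution.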
It is possibly easier if we have a separate gzip.  (The following is
based on the assumption that all blocks, cat'd together, contain a
single .tar.gz in its entirety.)

Consider the following.  (Gross hack follows.)

We don't have to let tar do the gzip.  We can create the pipe in
sysinstall, feed our data to gzip, and pipe that straight to tar.

Start gzip with an extra fd, say, fd 3, that is half of a socketpair
created by sysinstall.  Upon receiving a SIGUSR1, gzip is to fork.
The child then creates a new pipe, passes the input side (along with
the new child's pid) up fd 3 (using the recently-discussed SCM_RIGHTS
method), and dup2's the output side to stdin.  (The same fd 3 and
stdout can be used throughout the process; no new descriptors need to
be created there.)

Now, sysinstall SIGUSR1's the gzip before each block, and remembers
the pid of the resultant child and the fd of the pipe to it.  If the
block succeeds, then the child is killed.  If not, then the original
gzip is killed, and the block is restarted with the child gzip (which
contains the state of gzip before the failed block).

Do the same to tar.  However, for tar, the new SIGUSR1'd child also
records the name (or inode) of the current file, and its current
position.  Upon receiving a SIGUSR2, it will reopen the file, seek,
and carry on.

This actually sounds like it could be implemented fairly
straightforwardly and with minimal resource consumption.  As a bonus,
it would extend tar -z to handle transient errors on other
multiple-volume .tar.gz backups, such as a network tape drive going
offline.

However, since I'm presently operating at a late hour, not to mention
the number of beers consumed, I'd like some other people's ideas.  Any
thoughts?

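The fork-on-SIGUSR1 scheme for gzip can be sketched as follows. This is a toy model in Python under several loud assumptions: the worker is a stand-in for gzip whose only "engine state" is a running byte count, `in_fd` stands in for stdin and `ctrl` for fd 3, the worker sends a ready message before it can be signalled, and `worker`, `checkpoint`, `run_blocks`, and `fail_at` are hypothetical names. The tar half of the idea (SIGUSR2, reopen and seek) is not modelled.

```python
import array
import os
import signal
import socket
import struct

PID = struct.Struct("i")  # wire format for a pid on the control socket


def worker(in_fd, out_fd, ctrl):
    """Stand-in for gzip: replies with a running byte count per block."""
    def fork_checkpoint(signum, frame):
        if os.fork():
            return                     # original: carry on with old input
        r, w = os.pipe()               # spare: make a private input pipe
        ctrl.sendmsg([PID.pack(os.getpid())],
                     [(socket.SOL_SOCKET, socket.SCM_RIGHTS,
                       array.array("i", [w]))])
        os.close(w)
        os.dup2(r, in_fd)              # new pipe replaces inherited input
        os.close(r)

    signal.signal(signal.SIGCHLD, signal.SIG_IGN)  # auto-reap killed spares
    signal.signal(signal.SIGUSR1, fork_checkpoint)
    ctrl.send(PID.pack(os.getpid()))   # handlers installed: safe to signal
    total = 0
    while True:
        data = os.read(in_fd, 4096)
        if not data:
            break
        total += len(data)
        os.write(out_fd, b"%d\n" % total)


def checkpoint(worker_pid, ctrl):
    """Ask the worker to fork a spare; return (spare_pid, fd feeding it)."""
    os.kill(worker_pid, signal.SIGUSR1)
    fds = array.array("i")
    msg, ancdata, _flags, _addr = ctrl.recvmsg(
        PID.size, socket.CMSG_LEN(fds.itemsize))
    for level, kind, data in ancdata:
        if level == socket.SOL_SOCKET and kind == socket.SCM_RIGHTS:
            fds.frombytes(data[:fds.itemsize])
    return PID.unpack(msg)[0], fds[0]


def run_blocks(blocks, fail_at=None):
    """Feed blocks to the worker; simulate one failed block at fail_at."""
    in_r, in_w = os.pipe()
    out_r, out_w = os.pipe()
    ctrl, ctrl_worker = socket.socketpair()
    pid = os.fork()
    if pid == 0:
        try:
            os.close(in_w); os.close(out_r); ctrl.close()
            worker(in_r, out_w, ctrl_worker)
        finally:
            os._exit(0)
    os.close(in_r); os.close(out_w); ctrl_worker.close()
    ctrl.recv(PID.size)                # wait for the ready message
    feed, own_child, totals = in_w, True, []
    for i, block in enumerate(blocks):
        spare_pid, spare_fd = checkpoint(pid, ctrl)
        if i == fail_at:
            os.write(feed, block[:1])  # block dies after a partial write
            os.read(out_r, 64)         # discard the now-bogus reply
            os.kill(pid, signal.SIGKILL)
            if own_child:
                os.waitpid(pid, 0)
            os.close(feed)
            pid, feed, own_child = spare_pid, spare_fd, False
            spare_pid = None           # the spare is now the worker
        os.write(feed, block)
        totals.append(int(os.read(out_r, 64)))
        if spare_pid is not None:      # block succeeded: drop the spare
            os.kill(spare_pid, signal.SIGKILL)
            os.close(spare_fd)
    os.kill(pid, signal.SIGKILL)
    if own_child:
        os.waitpid(pid, 0)
    os.close(feed); os.close(out_r); ctrl.close()
    return totals
```

With `fail_at` set, the totals come out the same as for a clean run, because the spare still holds the count from before the failed block. The sketch also shows why tar needs the same treatment: the original worker's reply to the partial block has already gone downstream and has to be discarded there.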
Happy hacking,
joelh

-- 
Joel Ray Holveck - joelh@gnu.org - http://www.wp.com/piquan
   Fourth law of programming:
   Anything that can go wrong wi
sendmail: segmentation violation - core dumped

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-current" in the body of the message