Date:      09 Oct 1998 00:22:59 -0500
From:      Joel Ray Holveck <joelh@gnu.org>
To:        Mike Smith <mike@smith.net.au>
Cc:        freebsd-current@FreeBSD.ORG
Subject:   Re: BETA problems
Message-ID:  <86u31efgwc.fsf@detlev.UUCP>
In-Reply-To: Mike Smith's message of "Thu, 08 Oct 1998 18:25:24 -0700"
References:  <199810090125.SAA02069@dingo.cdrom.com>

Mike Smith <mike@smith.net.au> writes:

>>>>> (One suggestion: If an install fails due to a network failure,
>>>>> restart at the block instead of at the distribution.)
>>>> Owing to the way the blocks are compressed, you can't do this.
>>>> You have to start at the beginning.
>>> I haven't studied this closely, but Gary said that it's because the
>>> entire segment is a single gzipped stream, and that doing so would
>>> cause a restart midstream.
>> I just glanced over the zlib sources, and started wondering.  Would
>> it be possible to copy the state of the decompression engine, in
>> anticipation of failure?  How expensive would it be?
> We don't have access to that state, because tar forks a copy of gzip to
> do its work.  If we were doing the decompression internally, I think
> it would be reasonably straightforward.

It is possibly easier if we have a separate gzip.  (The following is
based on the assumption that all blocks, cat'd together, contain a
single .tar.gz in its entirety.)

Consider the following.  (Gross hack follows.)  We don't have to let
tar do the gzip.  We can create the pipe in sysinstall, feed our data
to gzip, and pipe that straight to tar.  Start gzip with an extra fd,
say, fd 3, that is half of a socketpair created by sysinstall.  Upon
receiving a SIGUSR1, gzip is to fork.  The child then creates a new
pipe, passes the write end (with the new child's pid) up fd 3 (using
the recently-discussed SCM_RIGHTS method), and dup2's the read end
onto its stdin.  (The same fd 3 and stdout can be used throughout the
process; no new descriptors need to be created there.)
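
The descriptor hand-off in the paragraph above could look roughly like
the sketch below.  It is in Python only for brevity (a real change
would live in gzip's C source), and the helper names send_pipe_end and
recv_pipe_end are made up for illustration.  It shows one descriptor
plus a pid travelling over a Unix-domain socketpair as SCM_RIGHTS
ancillary data, nothing more; the fork and the SIGUSR1 handler are
left out.

```python
import array
import os
import socket
import struct


def send_pipe_end(sock, fd, pid):
    """Send one descriptor and a pid over a Unix-domain socket.

    Hypothetical helper: the forked gzip child would call this on its
    fd 3 with the end of the new pipe that sysinstall should write to.
    """
    payload = struct.pack("!I", pid)
    fds = array.array("i", [fd])
    sock.sendmsg([payload], [(socket.SOL_SOCKET, socket.SCM_RIGHTS, fds)])


def recv_pipe_end(sock):
    """Receive the (descriptor, pid) pair sent by send_pipe_end."""
    fds = array.array("i")
    payload, ancdata, _flags, _addr = sock.recvmsg(
        4, socket.CMSG_LEN(fds.itemsize)
    )
    for level, ctype, data in ancdata:
        if level == socket.SOL_SOCKET and ctype == socket.SCM_RIGHTS:
            # Ignore any trailing bytes that do not form a whole int.
            usable = len(data) - (len(data) % fds.itemsize)
            fds.frombytes(data[:usable])
    if len(fds) != 1:
        raise RuntimeError("expected exactly one descriptor")
    (pid,) = struct.unpack("!I", payload)
    return fds[0], pid
```

The receiver gets a new descriptor number that refers to the same pipe,
so sysinstall can keep it alongside the pid and only start writing to
it if the block fails.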

Now, sysinstall SIGUSR1's the gzip before each block, and remembers
the pid and fd of the pipe to the resultant child.  If the block
succeeds, then the child is killed.  If not, then the original gzip is
killed, and the block is restarted with the child gzip (which contains
the state of gzip before the failed block).
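
To make the checkpoint-and-roll-back loop concrete, here is a minimal
sketch of the same control flow done in-process.  Note the swap: it
does not fork gzip at all, but uses the copy() method of Python's zlib
decompression object as the "saved child".  The function name, the
fetch_block callback, and the use of ConnectionError to mean "the
network died mid-block" are all assumptions for the example.

```python
import zlib


def decompress_with_checkpoints(num_blocks, fetch_block, max_attempts=4):
    """Decompress one gzip stream delivered as several blocks.

    fetch_block(index) is a hypothetical callback that yields the raw
    bytes of one block in chunks and raises ConnectionError if the
    transfer dies partway through.
    """
    engine = zlib.decompressobj(16 + zlib.MAX_WBITS)  # expect gzip framing
    output = bytearray()
    for index in range(num_blocks):
        for _attempt in range(max_attempts):
            # Snapshot the engine before touching the block, the way the
            # forked gzip child holds the pre-block state.
            checkpoint = engine.copy()
            mark = len(output)
            try:
                for chunk in fetch_block(index):
                    output += engine.decompress(chunk)
            except ConnectionError:
                # Discard the half-fed engine and the partial output,
                # then retry this block from the snapshot.
                engine = checkpoint
                del output[mark:]
            else:
                break
        else:
            raise RuntimeError("block %d kept failing" % index)
    output += engine.flush()
    return bytes(output)
```

Here the partial output is simply truncated; in the scheme above that
job belongs to tar, which has to undo whatever it wrote during the
failed block.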

Do the same to tar.  However, for tar, the new SIGUSR1'd child also
records the name (or inode) of the current file, and its current
position.  Upon receiving a SIGUSR2, it will reopen the file, seek,
and carry on.
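
The tar half, recording the current file and offset and later
reopening and seeking, might be sketched as follows.  ResumableWriter
and its methods are hypothetical names, not anything in tar.  The
truncate on restore is my own assumption; the scheme above only says
"reopen, seek, and carry on", and replaying the block would overwrite
the stale bytes anyway.

```python
class ResumableWriter:
    """Writes extracted files and can rewind to a recorded position."""

    def __init__(self):
        self.path = None
        self.handle = None

    def open(self, path):
        """Start writing a new file, closing the previous one."""
        self.close()
        self.path = path
        self.handle = open(path, "wb")

    def write(self, data):
        self.handle.write(data)

    def checkpoint(self):
        """Return (path, offset) for the file being written, if any."""
        if self.handle is None:
            return (None, 0)
        self.handle.flush()
        return (self.path, self.handle.tell())

    def restore(self, saved):
        """Reopen the recorded file and seek back to the recorded offset."""
        path, offset = saved
        self.close()
        if path is None:
            return
        self.path = path
        self.handle = open(path, "r+b")
        self.handle.seek(offset)
        # Assumption: drop bytes written after the checkpoint.
        self.handle.truncate()

    def close(self):
        if self.handle is not None:
            self.handle.close()
        self.handle = None
        self.path = None
```

Files created after the checkpoint are left on disk in this sketch;
they would be recreated and overwritten when the block is replayed.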

This actually sounds like it could be implemented fairly
straightforwardly and with minimal resource consumption.  As a bonus,
it would extend tar -z to handle transient errors on other
multiple-volume .tar.gz backups, such as a network tape drive going
offline.  However, since I'm presently operating at a late hour, not
to mention the number of beers consumed, I'd like some other people's
ideas.  Any thoughts?

Happy hacking,
joelh

-- 
Joel Ray Holveck - joelh@gnu.org - http://www.wp.com/piquan
   Fourth law of programming:
   Anything that can go wrong wi
sendmail: segmentation violation - core dumped

