Date:      Tue, 13 Apr 2010 20:31:32 -0700
From:      Tim Kientzle <kientzle@freebsd.org>
To:        Garrett Cooper <gcooper@freebsd.org>
Cc:        Perforce Change Reviews <perforce@freebsd.org>, Florent Thoumie <flz@esat.net>
Subject:   Re: PERFORCE change 176831 for review
Message-ID:  <4BC53714.80805@freebsd.org>
In-Reply-To: <l2y364299f41004122249q2f5734a7j89be807581e42dac@mail.gmail.com>
References:  <201004121230.o3CCUsIX029146@repoman.freebsd.org>	 <4BC3EB5B.5070801@freebsd.org> <l2y364299f41004122249q2f5734a7j89be807581e42dac@mail.gmail.com>

>> ... I would also put in a simple verification
>> that the entry you found is a regular entry:
>>   archive_entry_filetype(entry) == AE_IFREG
> 
> Ok. What does a regular entry correspond to (I assume not a regular file)?

Sorry, "regular entry" == "regular file" (or at least, what
will become a regular file once you write it to disk, if you
want to be really pedantic about it ;-).  Coincidentally,
AE_IFREG == S_IFREG.  The separate AE_IFxxx constants
are there mostly as portability shims (e.g., Windows doesn't
have block devices, so doesn't define S_IFBLK,
but AE_IFBLK is always available).
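
To make the shim idea concrete, here is a rough sketch.  The SK_
names are hypothetical and are not libarchive's; the octal values
are the traditional Unix ones, and treating them as what a shim
header would use is an assumption for illustration only.

```c
#include <sys/stat.h>

/* Hypothetical shim constants, spelled as literal octal values so they
 * exist even where <sys/stat.h> lacks the matching S_IFxxx macro. */
#define SK_IFMT  0170000
#define SK_IFREG 0100000
#define SK_IFBLK 0060000

/* Nonzero if the file-type bits of mode say "regular file". */
int sk_is_regular(unsigned long mode)
{
    return (mode & SK_IFMT) == SK_IFREG;
}

/* Nonzero if the shim value agrees with the host's S_IFREG.
 * Hosts that do not define S_IFREG have nothing to disagree with. */
int sk_matches_host(void)
{
#ifdef S_IFREG
    return SK_IFREG == S_IFREG;
#else
    return 1;
#endif
}
```

With real libarchive code the same test is simply
archive_entry_filetype(entry) == AE_IFREG, as quoted above.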

>>> + * NOTE: the exit code is 0 / 1 so that this can be fed directly into
>>> exit
>>> + * when doing piped tar commands for copying hierarchies *hint*, *hint*.
>> Why do you want to copy hierarchies?
>> Seems a waste of disk bandwidth.  *hint* *hint* ;-)
> 
> There's a fair amount of tar cpf - -C <dir_a> . | tar xpf - -C <dir_b>
> in libinstall . This is being done to preserve file attributes,
> fflags, modes, ownership, permissions, etc. ... 
> Installing directly to hierarchies works in theory as well, but it's
> considerably more dangerous unless you do an information transfer
> between the pkg contents and the install base, and unfortunately that
> is tough to achieve with 100% clarity from what I've seen.

That's exactly what I was getting to:  Someday, pkg_add
should write the files directly to their final location
and avoid the copy.  (You're exactly right that you
don't want to build a "copy tree" facility.)

Clearly, direct install is not something you want to
tackle at this stage of the rewrite.  There are a lot of
gains to be had from exactly what you're doing.

But I really do believe that single-pass direct
install is feasible and is eventually where we want
to be.  The key insight for me was when Florent recently
pointed out that you could just read the +CONTENTS,
then do a verify pass, then extract everything.
(Any conflict during the extraction pass
would be a fatal error:  delete everything
extracted so far and scream loudly.)
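
A minimal sketch of that verify-then-extract flow, under some
loud assumptions:  the list of paths is taken as already parsed
out of +CONTENTS, extract_one() is a hypothetical stand-in for
writing one archive entry to disk, and all function names are
made up for illustration.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Verify pass: index of the first path that already exists, or -1. */
int first_conflict(const char *const *paths, int n)
{
    struct stat st;
    for (int i = 0; i < n; i++)
        if (lstat(paths[i], &st) == 0)
            return i;
    return -1;
}

/* Hypothetical stand-in for extracting one entry.  O_EXCL makes the
 * create fail if something appeared since the verify pass. */
int extract_one(const char *path)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0644);
    if (fd < 0)
        return -1;
    close(fd);
    return 0;
}

/* Returns 0 on success, -1 on any conflict.  On a conflict during
 * extraction, everything created so far is removed again. */
int install_all(const char *const *paths, int n)
{
    int bad = first_conflict(paths, n);
    if (bad >= 0) {
        fprintf(stderr, "conflict found by verify pass: %s\n", paths[bad]);
        return -1;
    }
    for (int i = 0; i < n; i++) {
        if (extract_one(paths[i]) != 0) {
            fprintf(stderr, "FATAL: conflict during extraction: %s\n",
                    paths[i]);
            while (--i >= 0)
                unlink(paths[i]);   /* roll back what we wrote */
            return -1;
        }
    }
    return 0;
}
```

The point of the design is that the verify pass catches the
common conflicts before anything is written, so the rollback
path only has to cover races and surprises.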

Dealing with dependent packages is still
a little tricky but I think it's all doable.

> I saw that in the original comments and while I think that you have a
> point, I still am leery about someone specifying a package with
> contents that start with + because it immediately breaks the
> assumption. ...

Unless I'm mistaken, pkg_add has used "+*" to identify
metadata files for a long time, so I think your own
argument from compatibility might be against you here.
(Though I suspect that you're right in the long term that
we should be obeying the +CONTENTS declarations instead of
using filename conventions.)
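
For reference, the filename convention amounts to something like
the sketch below.  The function name is hypothetical, and two
details are my assumptions rather than documented pkg_add
behavior:  a leading "./" is tolerated, and only top-level names
count as metadata.

```c
#include <string.h>

/* Nonzero if an archive entry name looks like a package metadata
 * file under the "+*" convention (e.g. "+CONTENTS"). */
int is_metadata_name(const char *path)
{
    /* Assumption: skip any leading "./" a tar writer may have added. */
    while (path[0] == '.' && path[1] == '/')
        path += 2;
    /* Assumption: "+CONTENTS" qualifies, "lib/+odd" does not. */
    return path[0] == '+' && strchr(path, '/') == NULL;
}
```

A package that ships a payload file named "+something" at the
top level would be misclassified here, which is exactly the
concern raised above.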

> unpack_file, blah with the appropriate metadata filenames considering
> that it's a lot quicker than having to go through the entire tarball
> end to end (especially if it's a large tarfile like openoffice),
> unless someone unfortunately put the files at the end of the tarball
> (in that case they do need to modify how the package is created).

As for the speed argument:  Do you really need to
extract the metadata files before you extract
everything else?  Those files are going to a
different place, but that's not an issue with a
libarchive-driven extraction.  At one point, I had
convinced myself that you could do it all in one
pass, just writing the files to different places
depending on whether they were metadata files
(however identified) or not.  I recall, for example,
that the +MTREE file needs to be used only at the
end of the extraction (to fix-up permissions).
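
The routing step of such a one-pass extraction could look
roughly like this.  route_entry() and its parameters are
hypothetical, and how an entry gets classified as metadata is
left to the caller.  With libarchive, the resulting path would
presumably be passed to archive_entry_set_pathname() before the
entry is written.

```c
#include <stdio.h>

/* Build the on-disk destination for one archive entry.
 *   is_meta:  nonzero if the caller classified it as metadata
 *   meta_dir: where metadata files go (hypothetical)
 *   prefix:   where ordinary package files go (hypothetical)
 * Returns 0 on success, -1 if the result does not fit in out. */
int route_entry(const char *name, int is_meta,
                const char *meta_dir, const char *prefix,
                char *out, size_t outlen)
{
    const char *base = is_meta ? meta_dir : prefix;
    int n = snprintf(out, outlen, "%s/%s", base, name);
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

For example, "+CONTENTS" with a metadata directory of
/var/db/pkg/foo-1.0 becomes /var/db/pkg/foo-1.0/+CONTENTS, while
"bin/foo" with a prefix of /usr/local becomes /usr/local/bin/foo.
Anything that must wait until the end, such as applying +MTREE,
would still run after the loop over entries finishes.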

> Thanks again for the comments :)... it would be helpful BTW if the
> manpages linked together because archive_read(3) doesn't reference the
> archive_read_disk(3) APIs, etc, so I kind of have to fish for the
> right APIs to use, and I feel like I'm catching more boots than fish
> sometimes... but it's a learning experience and I don't expect to get
> it right the first time.

Please let me know about any doc shortcomings; the documentation
could use a lot of work, I know.  I also have a growing
Examples page on the libarchive Wiki and conversations
like this help me a lot to better understand what needs
to go there.

Cheers,

Tim
