Date:      Tue, 10 Jul 2012 23:17:54 -0400
From:      Ryan Stone <rysto32@gmail.com>
To:        freebsd-arch@freebsd.org
Subject:   Generating a tarball directly from make installworld
Message-ID:  <CAFMmRNwiZtbfuyT3tZ1udKk=VPJgwVuAD9gS=FY9rdGuoupqMw@mail.gmail.com>

I was playing around with this a couple of months ago and a recent
thread on -arch prompted me to pick it up again.  The idea, for those
who aren't familiar with it, is to have the base system build
infrastructure generate tarballs directly.  I believe that the
current approach is to do a make installworld installkernel
distribution to an empty directory, and then tar that up.  That's not
an ideal system because, in order for file ownership, flags, etc. to
be set correctly, installworld and friends must be run as root.

The method that I was trying was to write a new tool based on
libarchive that emulates the tools used by installworld and friends
but acts directly on a tar file.  I've made decent progress but I've
hit some pretty big roadblocks and I think that I need to take a step
back and solicit feedback before I go too much deeper down the rabbit
hole.  I'm hoping that people have already thought of the problems
that I'm hitting and have ideas on how to get past them.

Currently, installworld and friends use the following tools to get the
right bits on disk:
- install
- ln
- mkdir
- rm
- zic (timezone data compiler)
- install-info (install for GNU info pages)
- makewhatis (generates manpage index)
- cap_mkdb (generates login.conf.db from /etc/login.conf)
- pwd_mkdb (generates pwd.db and spwd.db from master.passwd)
- kldxref (generates linker.hints)

I would say that makewhatis and kldxref are the knottiest problems.
Both look at DESTDIR and generate metadata based on the files that
have been installed there.  As the files originate from many different
locations in OBJDIR, it's not as simple as pointing the tools at
OBJDIR and working out of there (by contrast, zic and cap_mkdb operate
on a limited number of files in SRCDIR, so they could be worked around
if we really had to).  I suppose that the files could be all installed
to a temporary location, and makewhatis/kldxref run against the temp
directory (with the output from that being install'ed to the right
place), but that's quite hacky.

The other problem that I have is performance.  When bsdtar appends to
a tar file, it iterates over every entry in the tar to figure out
where the end of it is.  I gather that this is to get rid of padding
but I'm not entirely sure.  Even if this isn't necessary I still have
to iterate over the entire file in most cases.  The problem is in the
sloppy semantics of ln and install: install foo bar means "install foo
to path bar/foo" if bar is a directory, but "install foo to path bar"
if bar is a regular file or it doesn't exist (symlinks add an extra
layer of complexity).  In order to implement this correctly, I have to
iterate over the tar to figure out what type of file bar is, every
time that install or ln is invoked.  As you can probably imagine, this
gets quite slow quite quickly -- installing to a tar file on my system
seems to be at least 10 times slower than installing to disk.
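
To make the ambiguity concrete, here is a minimal Python sketch of the
destination rule described above.  It is only an illustration, not the
libarchive-based tool itself: the function names index_archive and
resolve_install_target are hypothetical, the archive is built in memory
for the demo, and symlinks are recorded but not followed.

```python
import io
import posixpath
import tarfile


def index_archive(fileobj):
    """Walk the archive once and map each member path to its type."""
    entries = {}
    with tarfile.open(fileobj=fileobj, mode="r:") as tar:
        for member in tar:
            if member.isdir():
                kind = "dir"
            elif member.issym():
                kind = "symlink"  # recorded only; this sketch does not follow it
            else:
                kind = "file"
            entries[posixpath.normpath(member.name)] = kind
    return entries


def resolve_install_target(source, dest, entries):
    """Return the archive path that `install source dest` would write to."""
    dest = posixpath.normpath(dest)
    if entries.get(dest) == "dir":
        # dest is an existing directory: install *into* it.
        return posixpath.join(dest, posixpath.basename(source))
    # dest is a regular file or does not exist: install *as* dest.
    return dest


# Build a tiny demo archive in memory: one directory and one empty file.
demo = io.BytesIO()
with tarfile.open(fileobj=demo, mode="w") as tar:
    bindir = tarfile.TarInfo("usr/bin")
    bindir.type = tarfile.DIRTYPE
    bindir.mode = 0o755
    tar.addfile(bindir)
    tar.addfile(tarfile.TarInfo("etc/motd"))
demo.seek(0)

entries = index_archive(demo)
print(resolve_install_target("obj/ls", "usr/bin", entries))
print(resolve_install_target("obj/motd", "etc/motd", entries))
print(resolve_install_target("obj/ls", "usr/bin/dir", entries))
```

Every call to resolve_install_target needs an answer to "what type is
dest?", and without some kind of cached index that answer costs a full
pass over the archive, which is the slowdown described above.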

I know that a lot of people have suggested generating an mtree file
and then converting the mtree file into a tarball, but I admit that
it's not at all clear to me how to generate the mtree file.  It feels
to me that it ends up being equivalent in complexity to what I've
tried, but maybe I'm missing something about mtree's capabilities.  Or
maybe people are just proposing a more fundamental change to what
tools we use to get bits on disk, and I missed that implication.
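
For readers unfamiliar with the manifest idea, the second half of it
(manifest to tarball) might look roughly like the Python sketch below.
The manifest layout here is a made-up simplification, not real mtree
syntax, and write_tarball is a hypothetical name.  It does not address
the harder question of how the manifest gets generated, and file flags
are not handled.  The point is only that ownership and mode come from
the manifest rather than from the filesystem, so nothing has to run as
root.

```python
import io
import tarfile

# Hypothetical, simplified manifest: one dict per archive entry.
MANIFEST = [
    {"path": "bin", "type": "dir", "mode": 0o755,
     "uname": "root", "gname": "wheel"},
    {"path": "bin/ls", "type": "file", "mode": 0o555,
     "uname": "root", "gname": "wheel", "data": b"placeholder contents\n"},
    {"path": "bin/dir", "type": "link", "mode": 0o755,
     "uname": "root", "gname": "wheel", "target": "ls"},
]


def write_tarball(manifest, out):
    """Write one tar entry per manifest record, taking metadata from the record."""
    with tarfile.open(fileobj=out, mode="w") as tar:
        for entry in manifest:
            info = tarfile.TarInfo(entry["path"])
            info.mode = entry["mode"]
            info.uid = info.gid = 0  # assumption: everything is owned by id 0
            info.uname = entry["uname"]
            info.gname = entry["gname"]
            if entry["type"] == "dir":
                info.type = tarfile.DIRTYPE
                tar.addfile(info)
            elif entry["type"] == "link":
                info.type = tarfile.SYMTYPE
                info.linkname = entry["target"]
                tar.addfile(info)
            else:
                data = entry["data"]
                info.size = len(data)
                tar.addfile(info, io.BytesIO(data))


out = io.BytesIO()
write_tarball(MANIFEST, out)
out.seek(0)

# Read the archive back and list what was recorded for each member.
with tarfile.open(fileobj=out, mode="r:") as tar:
    members = {m.name: m for m in tar}
for name, m in members.items():
    print(name, oct(m.mode), m.uname, m.gname)
```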

I know that a lot of people have opinions on this, so kindly speak up. :)
