Date:      Thu, 3 Nov 2011 23:46:02 -0400 (EDT)
From:      Benjamin Kaduk <kaduk@MIT.EDU>
To:        Alexander Yerenkow <yerenkow@gmail.com>
Cc:        freebsd-current@freebsd.org
Subject:   Re: VM images for FreeBSD
Message-ID:  <alpine.GSO.1.10.1111032302110.882@multics.mit.edu>
In-Reply-To: <CAPJF9wmeZadAQjFPBDq4x4fK3KgwnXyTKBmXdp9bRF2piwGJ0Q@mail.gmail.com>
References:  <CAPJF9wmf89mV2M3PO5deoWJ9i2FPHkQ1asgLzd9-bGkAd7j79g@mail.gmail.com> <alpine.BSF.2.00.1110170742420.16168@wonkity.com> <CAPJF9wmeZadAQjFPBDq4x4fK3KgwnXyTKBmXdp9bRF2piwGJ0Q@mail.gmail.com>

On Wed, 19 Oct 2011, Alexander Yerenkow wrote:

> Hello all!
>
> I'm working currently on creating images with a set pre-installed packages.
> I looked at project pkgng (candidate for replacing current pkg_* subsystem),
> and also I have some thought about current packages/ports system.
>
> 1. pkg_add can be launched with parameter -p $PREFIX. So, my first thought
> was: I create empty directory structure with mtree, and I'll install there
> all required packages; after that I need only update this installation tree
> (manually by pkg_delete $old pkg_add $new, or with some tool). But I cannot
> specify to pkg_add relative root, instead of real one.
>
> Let me show example:
> PKG_DBDIR=/zpool0/testroot/var/db/pkg pkg_add -p /zpool0/testroot/usr/local
> ubench-0.32.tbz
> installs package, and in /zpool0/testroot/var/db/pkg/ubench-0.32/+CONTENTS
> there will be such record:
> @cwd /zpool0/testroot/usr/local
>
> I can't specify to pkg_add that it should treat /zpool0/testroot as root, as
> I need (so record really should be @cwd /usr/local)
> Instead, pkg_add allows me to make chroot, which as you understand is not
> good (In specified chroot all required by pkg* binaries/libraries must
> exists, unfortunately I can't specify some empty dir and install there).
>
> Why is that? Because there is +INSTALL script in packages, in which
> package/port system allows execute any code/script written by porter.

This is indeed a frustrating problem.

>
> 2. In ports enhancements task list (somewhere i read it) there was one item:
> Make packages non-executable (or something similar). To do this properly, we
> must get rid of free-form post-install post-deinstall scripts.
> To do this, we need some deep analysis of what types of actions there
> happening, formalize them and provide some way to porters specify all needed
> actions in Makefile.
> I downloaded all packages for 9-current i386, found all +INSTALL scripts,
> and kinda categorized them, you can get all of them here:
> http://www.box.net/shared/ieovjj7l8omkrm3l21xb
>
> To summarize my efforts:
> I checked 21195 packages;
> I found 880 install scripts;
>
> 3 scripts contains plain "exit 0"
> 8 install scripts contains some perl code;
> 17 scripts contains some additional "install" commands;
> 70 scripts contains some chgroup/chown actions (which probably could be done
> by specifying mtree file?...)
> 75 contains uncategorized actions (print of license, some interactive
> questions, ghostscript actions, tex, fonts etc.)
> 161 scripts contains some file commands, like (ld / cp / mv, creating
> backups, creating configs if they aren't exists etc. )
> 166 scripts contains useradd/groupadd commands (many similar constructions,
> not too hard to move this to .mk, in pkgng group/users can be specified in
> yaml config)
> 380 contains pear component registration (md5 -q * | uniq  - produces
> exactly one result, so these all scripts are really one, could be moved to
> some pear.mk)
>

Thank you for doing this analysis/breakdown!
However, I worry that it may have missed @exec statements in pkg-plist
files.  For example, net/openafs (which I maintain) runs kldxref in
/boot/modules after installing a kernel module, which is needed in order
for kldload to find the module.  Now, this is clearly a case that a
potential nonexecutable package framework could handle, by checking for
installed kernel modules and acting accordingly.  However, since I have
not done a survey of @exec usage like the one you did for install scripts,
it remains an uneasy dangling unknown.
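
A minimal sketch of what such an @exec survey could look like, assuming
the pkg_* database layout in which each installed package has a +CONTENTS
packing list under PKG_DBDIR.  The function names exec_commands and survey
are made up for illustration; this is not an existing tool.

```python
import os
from collections import Counter


def exec_commands(packing_list_text):
    """Return (directive, command) pairs for @exec/@unexec lines."""
    commands = []
    for line in packing_list_text.splitlines():
        line = line.strip()
        for directive in ("@exec ", "@unexec "):
            if line.startswith(directive):
                command = line[len(directive):].strip()
                commands.append((directive.strip(), command))
    return commands


def survey(pkg_dbdir):
    """Count the first word of every @exec command under pkg_dbdir."""
    counts = Counter()
    for name in sorted(os.listdir(pkg_dbdir)):
        contents = os.path.join(pkg_dbdir, name, "+CONTENTS")
        if not os.path.isfile(contents):
            continue  # not a package directory
        with open(contents, errors="replace") as fh:
            for directive, command in exec_commands(fh.read()):
                if directive == "@exec" and command:
                    counts[command.split()[0]] += 1
    return counts
```

Calling survey() on a populated package database (for instance the
/zpool0/testroot/var/db/pkg directory from the example above) would give a
rough histogram of which programs @exec lines invoke, comparable to the
+INSTALL breakdown.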

> Why I'm interested in non-executable install of package (e.g. simple unpack
> + execute some typical actions based on package description):
> - Unpacking of hundreds Mb packages takes several minutes (to mdconfig-ed
> filesystem)
> - Installation of these packages via pkg_add (they downloads from local ftp)
> took hours in my case (to mdconfig-ed filesystem)
>

This is quite a telling statistic :)

> As you understand, to make efficient image building system, I need to deal
> with package installation without spending too many cpu/disk resources.
> Ideally I consider all required packages are extracted to some their own
> directory, like for ubench:
> $X/packages/ubench/ (and here goes all directory structure which should be
> copied to new root)
>
> plus separated info of new users/groups (maybe there need some additional
> data to make package installed in such way fully working).

There would certainly be additional data needed, e.g. for installing
sample configuration files, copying them to the real location, and removing
both copies on uninstallation if the "real" file is unchanged from the
sample.  I'm sure there are others, too.
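
The sample-file handling could be sketched roughly as follows.  The
function names are hypothetical, and keeping a locally modified "real"
file on uninstallation (while still removing the sample) is an assumption
about the desired behaviour, not something the ports framework specifies
here.

```python
import filecmp
import os
import shutil


def install_sample(sample_path, real_path):
    """Copy the sample into place unless a real config already exists."""
    if not os.path.exists(real_path):
        shutil.copy2(sample_path, real_path)
        return True
    return False


def deinstall_sample(sample_path, real_path):
    """Remove the real config only if it is byte-identical to the sample.

    The sample itself is always removed.  Returns True if the real
    file was removed as well.
    """
    removed_real = False
    if os.path.exists(real_path) and filecmp.cmp(
        sample_path, real_path, shallow=False
    ):
        os.remove(real_path)
        removed_real = True
    os.remove(sample_path)
    return removed_real
```

A declarative package format would need some way to mark which files are
samples so that the packaging tool can apply this logic itself instead of
relying on a per-package script.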

>
> So, maybe someone working in this direction, or have any comments?
>

I would be very hesitant to proceed in this direction without doing some 
investigation of other package-based systems.
For example, Debian packages are inherently binary-based; there is no
real parallel to our ports framework.  Yet if anything, I think that
"executable packages" may be even more heavily used in Debian than in 
FreeBSD.  In addition to the tarball of files to extract, the maintainer 
can also supply "maintainer scripts" which run before and/or after 
installation and/or uninstallation.  (Not to mention the infrastructure 
components which implement things like diversions.)  I have an incomplete 
survey of a biased sample of Debian(-style) packages in my slides at 
http://web.mit.edu/kaduk/Public/bsdcan-ports-talk-20110511.pdf , which 
shows that in addition to being used to manage users and groups, these
maintainer scripts are also used to start/stop services, update gconf
keys, modify the PAM stack, and more.  It quickly becomes quite a pile of
"additional data needed" (per the above) that I fear would be too much 
infrastructure to safely maintain in a non-executable package framework.

Another incredibly useful (though hopefully infrequently used) feature of 
maintainer scripts is the ability they give to recover from packaging 
errors.  The first example that comes to mind is unfortunately not a very 
good one, but recently here at Athena we had a bug in our TeX 
configuration package which resulted in a dangling symlink from a broken 
diversion (which has no direct parallel in FreeBSD, making this a bad 
example).  In any case, this packaging bug made the package uninstallable!
However, we could produce an updated version of the package which had a 
preinst that corrected for the previous packaging error, offering us a way 
out that did not require manual user intervention, which I feel is 
something that we should try very hard to avoid.

Because of this, I don't think that making it impossible for a package
to have a custom executable component is a realistic goal.  (Which is not
to say it is not a goal worth having.)  However, it would probably be
feasible to add pieces to our framework (e.g. USERS/GROUPS) that make it
easier for packages to avoid executable components.  If appropriately
flagged, then a package could just be an "unpack this tarball" operation,
possibly with a couple of hooks (e.g. users/groups) from the packaging
system.
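
To make the idea concrete, here is a small sketch of how a packaging tool
might derive its steps from a declarative description.  The manifest keys
("archive", "users", "groups", "scripts") and both function names are
invented for this example and do not correspond to any existing pkg_* or
pkgng format.

```python
def install_plan(manifest):
    """Turn a declarative manifest into an ordered list of steps."""
    steps = []
    # Groups first, since a new user may need to belong to a new group.
    for group in manifest.get("groups", []):
        steps.append(("create_group", group))
    for user in manifest.get("users", []):
        steps.append(("create_user", user))
    steps.append(("unpack", manifest["archive"]))
    # Free-form scripts remain possible, but are the exception.
    for script in manifest.get("scripts", []):
        steps.append(("run_script", script))
    return steps


def is_nonexecutable(manifest):
    """True if the package needs nothing beyond the built-in hooks."""
    return not manifest.get("scripts")
```

A package for which is_nonexecutable() holds could be installed into an
image root without a chroot, while the rest would still go through the
slower, script-running path.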

> 3. Other "ports" ideas/thoughts.
> I proposed small enhancement to pkgng, but instead in pkgng this should be
> implemented in ports subsystem, it's about specifying abstract dependencies,
> and correct resolving of them:
> https://github.com/pkgng/pkgng/issues/100
>
> Who can comment/elaborate about this? It shouldn't be very complicated,
> since currently almost same functionality provided in .mk files (like
> USE_PERL etc)

Interesting, though I don't think I'm really one to comment/elaborate on 
it.  It does seem vaguely analogous to the concept of a (Debian) "virtual 
package", which can be depended on like a first-class package, but is 
actually "provide"d by any number of candidate packages.
I don't have a sense of how hard it would be to implement for us, though, 
and I have not had any time to look at pkgng at all. :(
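
For illustration only, the virtual-package idea can be sketched as a
lookup over "provides" lists.  The data layout, the function names, and
the policy of preferring an already-installed provider are assumptions
made for this example, not how dpkg or pkgng actually resolve
dependencies.

```python
def candidates(dependency, packages):
    """Return package names satisfying `dependency`, real name first."""
    found = []
    if dependency in packages:
        found.append(dependency)
    for name in sorted(packages):
        provides = packages[name].get("provides", ())
        if name != dependency and dependency in provides:
            found.append(name)
    return found


def resolve(dependency, packages, installed):
    """Pick one provider, preferring a package that is already installed."""
    options = candidates(dependency, packages)
    if not options:
        raise LookupError("nothing provides %r" % dependency)
    for name in options:
        if name in installed:
            return name
    return options[0]
```

Here `packages` maps each package name to a dict with an optional
"provides" list, and `installed` is a set of installed package names.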

>
>
> 4. Where's the "right" place to discuss ports system? :)

Presumably freebsd-ports@freebsd.org, though alas I do not read it
regularly.

-Ben Kaduk


