Date: Fri, 25 May 2007 22:56:53 +0300
From: Giorgos Keramidas <keramida@ceid.upatras.gr>
To: Brian Candler <B.Candler@pobox.com>
Cc: freebsd-current@FreeBSD.org
Subject: Re: Using Subversion for binary distribution?
Message-ID: <20070525195653.GA1824@kobe.laptop>
In-Reply-To: <20070525115242.GA31555@uk.tiscali.com>
References: <20070525074925.GA19294@uk.tiscali.com> <20070525104342.GA2761@kobe.laptop> <20070525115242.GA31555@uk.tiscali.com>

On 2007-05-25 12:52, Brian Candler <B.Candler@pobox.com> wrote:
>On Fri, May 25, 2007 at 01:43:42PM +0300, Giorgos Keramidas wrote:
>> Using Subversion or a more distributed system like Git and Mercurial,
>> can work in the way you are describing, but you would have to be _very_
>> careful about file ownership (so that you don't accidentally leak files
>> owned by root to other accounts, for example), and permissions (so that
>> you don't suddenly let everyone read /etc/master.passwd, or something
>> equally or more evil).
>
> That's a very good point. There's svn:executable, but that's only a small
> part of the equation. Perhaps mtree-like settings could be put into a user
> property, and a post-checkout script could enforce them.
>
> The problem is, I believe, analogous to what happens when you do the initial
> system build and create the tarballs: you have to ensure that all the files
> in the tarballs are owned by the correct accounts and have the correct mode
> (not root:root and 0755, or whatever they got when the compiler built them)
>
> If you put this information in an appropriate place for the client, it can
> apply it locally after it has done an svn checkout.

Indeed.  This crossed my mind too.  An mtree script which runs every
time something is "committed", "updated" or resynced in any way from a
master copy was my initial thought for working around the permissions
and ownership issues.

Then again, a full 'checkout' can take a lot of time, and this may
create a "window of time" during which some files have very wrong
permissions and ownership.  I'm not sure if this is good enough for all
possible cases.

>> Subversion support for making 'local' changes to a checked out workspace
>> and keeping them local is simply unavailable.  The checked out tree
>> would be "polluted" with .svn/ subdirectories with all the metadata of
>> the Subversion workspace too (that's where permissions will be tricky to
>> get right).
>>
>> The disk space requirements of a Subversion checkout are also very big.
>> At least twice the size of the checked out files, and then some more.
>
> Having .svn directories all over the place is not a worry to me, and in fact
> that's what gives you most of the practical advantages: svn diff and svn
> revert will depend on this, as well as merging of non-conflicting changes.
>
>   rm /bin/ls
>   svn revert /bin/ls
>   # happy days :-)
>
> Permissions on the .svn directories need to be gotten right, but I think
> that simply setting them to root:root 0700 would be safe.
>
> You'd have a hard job finding a disk below 160GB these days, so the size
> utilisation doesn't bother me either.

If disk space is not a concern, there are better tools for managing
'local' per-workspace and per-checkout changes.  Git and Mercurial are
two of them, and I like their support for "pushing" and "pulling"
changes from a master repository a lot better.

But before you read below, a word of caution...

What follows is unverified brainstorming about managing a blob of files
with a distributed SCM like Mercurial (this is the one I'm most familiar
with, so I'll let other people speak for Git as they see fit).  I
haven't put this to production use anywhere, and I don't really
recommend using an SCM to manage installed files, but since you seem to
like the idea, the following are a few random thoughts intermixed with a
mini 'tutorial' about Mercurial/Hg.

The merge support of both tools is also excellent, so you can keep local
changes like the '/etc/hosts' contents and other per-system changes
tracked properly as the master copy is updated.  The 'hooks' of
Mercurial, for example, are nice for running "update" scripts which take
care of the permission problems, similar to the one you mentioned above.
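
To make the idea of a permission-fixing "update" script a bit more
concrete, here is a minimal and untested-in-production sketch.  The
function name apply_perms and the manifest format (one "MODE OWNER:GROUP
PATH" entry per line) are assumptions made up for this example; they are
a simplified stand-in for a real mtree(8) specification, which is
probably the better tool on FreeBSD.

```shell
#!/bin/sh
# Hypothetical permission-fixing helper (illustration only).
# Manifest format is an assumption: "MODE OWNER:GROUP PATH" per line,
# with PATH relative to the workspace root; '#' starts a comment line.

apply_perms() {
    manifest=$1
    root=$2
    status=0
    while read -r mode owner path; do
        # Skip blank lines and comments.
        case $mode in
            ''|'#'*) continue ;;
        esac
        target=$root/$path
        if [ ! -e "$target" ]; then
            echo "apply_perms: missing $target" >&2
            status=1
            continue
        fi
        # Changing ownership normally requires root, so only try it then.
        # chown runs before chmod because chown may clear setuid bits.
        if [ "$(id -u)" -eq 0 ]; then
            chown "$owner" "$target" || status=1
        fi
        chmod "$mode" "$target" || status=1
    done < "$manifest"
    return $status
}
```

A hook script could then call something like "apply_perms
/path/to/manifest /" after every update.  Note that this does nothing
about the "window of time" problem mentioned earlier: files still have
whatever mode they were checked out with until the script has run.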

Mercurial and Git are a very big departure from the centralized way of
working with Subversion, but it is precisely this 'distributedness' that
makes them ideal for keeping local per-system changes whenever these
local changes make sense.

For example, if you want to keep a central copy of the FreeBSD base
system in a 'master' machine called 'buildhost', you can create a
Mercurial workspace with the binary files of a FreeBSD base system at:

    buildhost:/repos/freebsd/releng7/base

To populate a new disk with the files of the releng7/base workspace,
essentially "installing" a new copy of FreeBSD on the system, you can
attach the new disk to your laptop (e.g. through a USB disk connection),
and use something like:

    laptop# fdisk -BI /dev/da0
    laptop# bsdlabel -w -B /dev/da0s1
    laptop# newfs /dev/da0s1a

and then you can 'clone' the base installation from 'buildhost', through
an SSH tunneled clone operation:

    laptop# mount /dev/da0s1a /mnt
    laptop# cd /mnt
    laptop# hg clone ssh://buildhost//repos/freebsd/releng7/base .

The next step would be to edit the /mnt files locally, while the disk is
still connected to your laptop (e.g. to fix `/etc/fstab' and other files
which do need local changes).

You install hook scripts in /mnt/.hg/hooks and set them up to run every
time a group of changes is pulled, every time a commit is done in the
/mnt workspace, and every time 'hg update' is used to update local
files:

    laptop# cd /mnt
    laptop# cat .hg/hgrc
    [paths]
    default = ssh://buildhost//repos/freebsd/releng7/base

    [hooks]
    changegroup = /bin/sh .hg/hooks/fixperms.sh
    commit = /bin/sh .hg/hooks/fixperms.sh
    update = /bin/sh .hg/hooks/fixperms.sh
    laptop#

A good idea is to also set up a separate workspace for each managed host
at the buildhost, i.e.
using workspace paths like:

    ssh://buildhost//repos/freebsd/hosts/kobe

and setting the 'default-push' path of .hg/hgrc to point to the 'backup'
clone of the host 'kobe':

    laptop# cat .hg/hgrc
    [paths]
    default = ssh://buildhost//repos/freebsd/releng7/base
    default-push = ssh://buildhost//repos/freebsd/hosts/kobe

Then you can unmount the new disk, move it to its destination machine
and let the "distributedness" take over.

Every time something changes in the 'master' copy of the base
installation, you can check for the new changes and "pull" them:

    laptop$ ssh kobe
    kobe$ sudo -i
    Password: ******
    kobe# cd /
    kobe# hg incoming

This will show you the changes you would have 'pulled' but not modify
anything.  When there is nothing new, for example:

    kobe# hg incoming
    comparing with ssh://buildhost//repos/freebsd/releng7/base
    searching for changes
    no changes found
    kobe#

You can "pull" new changes with "hg pull":

    kobe# hg pull
    pulling from ssh://buildhost//repos/freebsd/releng7/base
    searching for changes
    adding changesets
    adding manifests
    adding file changes
    added 29 changesets with 47 changes to 26 files (-1 heads)
    (run 'hg update' to get a working copy)
    kobe#

If there are merge conflicts, Mercurial will create new "heads" with the
conflicting changes but *not* affect any of the files until you run "hg
update".

Local changes which have been 'committed' in the managed host's file
system will not be lost when you pull.  You can 'merge' the remote
updates using 3-way merge tools (e.g. kdiff3 or even plain good ol'
vim), you can revert the merge, you can roll back changes to a known
good state, and so on.
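
The incoming/pull/update sequence above could be wrapped in a small
helper, roughly like the sketch below.  The name sync_base is made up
for this example, and the script is an illustration rather than
something tested on a live system.  It relies on "hg incoming" exiting
with status 0 when there are changesets to pull and 1 when there are
none, and on the global -R option to select the workspace.

```shell
#!/bin/sh
# Hypothetical helper (illustration only): pull and update a workspace
# only when the master copy actually has new changesets.

sync_base() {
    repo=${1:-/}
    rc=0
    # "hg incoming" exits 0 when there is something to pull and 1 when
    # there is nothing; any other status is treated as an error here.
    hg -R "$repo" incoming >/dev/null 2>&1 || rc=$?
    case $rc in
        0) ;;
        1) echo "no changes found"; return 0 ;;
        *) echo "sync_base: cannot compare with the master copy" >&2
           return 2 ;;
    esac
    # Pulling only adds history; working files change in the update step.
    hg -R "$repo" pull || return 2
    # If the pull created an extra head, an explicit "hg merge" is needed
    # first; that case is deliberately not handled in this sketch.
    hg -R "$repo" update
}
```

On the managed host this would be run as "sync_base /".  Keeping the
merge step manual is on purpose: a conflict in something like
/etc/hosts is exactly the kind of thing a human should look at.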

When managing a large set of files as a 'branch', everything that can be
done with Subversion can also be done with Mercurial, and you also get
the benefits of a fully distributed system, including (but not expressly
limited to) the following:

  + Blazingly fast local operation

  + Minimal dependency on the network (you never have to go over the
    network for looking at history and making commits/changes, unless
    you really want to)

  + Speed.  It's amazing how horrendously slow some operations can be
    when you have to go over the network for *everything* except perhaps
    one operation, like "svn diff".

  + Fully usable local history, diffs

  + Merge tracking that is *far* superior to what CVS or Subversion have
    ever provided so far

  + The ability to pull/push identical changes to multiple hosts

  + Tunnelling of SCM operation through SSH, HTTP, HTTPS

  + Full support for arbitrarily complex local changes, completely
    independently of remote hosts (i.e. no host will be affected by
    local changes, unless the changes are specifically "pushed" to it,
    or "pulled" while working on the host itself)

  + Most importantly, changes can be tested *locally* on the host which
    they should affect and only on that host!  If they don't work,
    rollback is easy and nobody's central Subversion repository is
    bloated by the changes, as they have never hit anybody's tree.
    Combined with the extremely easy clonability of "Hg-managed" blobs
    of files, and a test host someplace out of production, you can guess
    how easy testing of changes which are highly experimental can be ;-)

IMHO, if you _have_ to use an SCM to manage the FreeBSD base system, and
you are ok with the idea of running hook scripts to fix permission and
ownership of the checked out files (as you seem to be), don't use
Subversion... use a fully distributed tool.  It's inherently better for
almost any sort of job which requires 'merging' of local changes with a
remote master copy.

FWIW, more information about Mercurial (which I very briefly advocated
for above) and Git can be found online, at their sites:

    Mercurial:  http://www.selenic.com/mercurial/
    Git:        http://git.or.cz/