Date:      Sat, 29 Mar 2003 18:38:13 +1100
From:      Peter Jeremy <peterjeremy@optushome.com.au>
To:        Stephen McKay <smckay@internode.on.net>
Cc:        ctm-users@freebsd.org
Subject:   Re: cvs-cur
Message-ID:  <20030329073813.GA24683@cirb503493.alcatel.com.au>
In-Reply-To: <200303290543.h2T5h6Gh004778@dungeon.home>
References:  <3E84F93A.2080108@math.missouri.edu> <200303290543.h2T5h6Gh004778@dungeon.home>

On Sat, Mar 29, 2003 at 03:43:06PM +1000, Stephen McKay wrote:
>Also, in those few moments I have spare I've been thinking of adding
>heuristics to the ctm system to detect files that have been moved (and
>possibly slightly edited).  This should be 100% correct for unedited and
>moved files (using the md5 hash), and should guess well for things with
>the same base name but which have been moved, say, to the Attic.

Perfect file renames will be very rare.  I agree that there have been
two recent examples (the commitlog renames and CVSROOT split) but this
has never happened before and it'll be nearly 8000 years before we
need to rename the commitlogs again :-).  The quarterly commitlog
rotation involves gzip'ing the old file and probably isn't a candidate
for short-circuiting.  Repository surgery is normally limited to
copying files and deleting files from the Attic - other surgical
possibilities tend to confuse CVS.

Within a CVS repository, repo-copies result in two identical copies
(one or both of which are usually edited fairly shortly after the
repo-copy).  "Deleting" a CVS file (moving it to the Attic) entails a
commit which will alter the file contents (to add a log message
stating that the file has been deleted).

Even outside the repository, CTM is tracking checked out CVS files
and virtually all files have a $FreeBSD$ line which will be different
for two otherwise identical files.

Being able to detect files that have been repo-copied or deleted
would be useful - but both cases require mkCTM to detect a slightly
changed file.  The Attic case is fairly simple to pick up but you
can't detect arbitrary repo-copies.

If you did decide to define a new CTM statement type, an updated
ctm(1) would need to be distributed well before the deltas started
using the new features.  It would be nice if the updated ctm was
distributed in a -RELEASE before updated deltas were circulated - this
would mean someone could install the latest -RELEASE and be able to
use CTM to bring themselves up to date without having to rebuild ctm
partway through.  This is practical for 5.1, but 4.9 (if it will be
released) is probably too far away.

Remember to make sure the changes are MFC'd to -STABLE as well as
applied to -CURRENT.  It might also be an idea to make a tarball
of the updated /usr/src/usr.sbin/ctm/ctm available somewhere near
the CTM deltas on the web site.

>This would reduce the size of a great many deltas, but would be an
>incompatible change to the format.

I agree it would have saved ~56MB of deltas about a month ago - but I
doubt that event will be repeated.  I've just checked cvs-cur 9040
through 9109 (which is all I have quickly to hand) and there's been a
total of 13MB cvs rm'd in a total of 50MB (uncompressed, 11MB
compressed) of deltas.  (I wrote a perl script which looked for
matching CTMFR and CTMFM commands where the only difference was a
/Attic/ in the latter and accumulated the CTMFM size).  This suggests
we'd save about 25% of the delta size (very roughly).  Is this worth
the effort?
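
For anyone wanting to repeat the measurement, the matching can be
sketched roughly as below.  This is not the perl script mentioned
above, just a Python approximation of the same idea.  It assumes the
deltas have already been parsed into (command, path, size) tuples;
that record layout and the names attic_origin and attic_savings are
made up for illustration and are not part of ctm or mkCTM.

```python
import posixpath


def attic_origin(path):
    """Return the path a file had before it moved into an Attic directory.

    Returns None if the file does not sit directly inside a directory
    named "Attic".
    """
    directory, name = posixpath.split(path)
    parent, last = posixpath.split(directory)
    if last != "Attic":
        return None
    return posixpath.join(parent, name)


def attic_savings(records):
    """Sum the size of CTMFM payloads that merely recreate a removed file
    under Attic/.

    `records` is an iterable of (command, path, size) tuples.  This
    layout is an assumption of the sketch, not the on-disk delta format.
    """
    records = list(records)
    removed = {path for cmd, path, _ in records if cmd == "CTMFR"}
    return sum(
        size
        for cmd, path, size in records
        if cmd == "CTMFM" and attic_origin(path) in removed
    )


# Hypothetical records, for illustration only.
sample = [
    ("CTMFR", "src/bin/ls/old.c,v", 0),
    ("CTMFM", "src/bin/ls/Attic/old.c,v", 1200),  # counted: matches the CTMFR above
    ("CTMFM", "src/bin/ls/new.c,v", 500),         # not counted: not in an Attic
    ("CTMFR", "src/gone.c,v", 0),                 # not counted: no matching CTMFM
]
saved = attic_savings(sample)
```

With the figures quoted above, 13MB out of 50MB uncompressed works
out to 26%, which is where the "about 25%" comes from.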

>  I don't know if people would object.

I wouldn't object as long as the updated ctm was available a couple of
months before the updated deltas were generated.

>PS I have no idea how many people use ctm nowadays.  Do we have an count?

postmaster@freebsd.org should be able to tell you how many people are
on the various mailing lists.  An accurate count of people downloading
from the FTP sites would take more effort but could probably be derived
from the FTP server logs.

Peter


