Date: Sat, 29 Mar 2003 18:38:13 +1100
From: Peter Jeremy
To: Stephen McKay
cc: ctm-users@freebsd.org
Subject: Re: cvs-cur

On Sat, Mar 29, 2003 at 03:43:06PM +1000, Stephen McKay wrote:
>Also, in those few moments I have spare I've been thinking of adding
>heuristics to the ctm system to detect files that have been moved (and
>possibly slightly edited).  This should be 100% correct for unedited and
>moved files (using the md5 hash), and should guess well for things with
>the same base name but which have been moved, say, to the Attic.

Perfect file renames will be very rare.
I agree that there have been two recent examples (the commitlog renames
and CVSROOT split) but this has never happened before and it'll be
nearly 8000 years before we need to rename the commitlogs again :-).
The quarterly commitlog rotation involves gzip'ing the old file and
probably isn't a candidate for short-circuiting.

Repository surgery is normally limited to copying files and deleting
files from the Attic - other surgical possibilities tend to confuse
CVS.  Within a CVS repository, repo-copies result in two identical
copies (one or both of which are usually edited fairly shortly after
the repo-copy).  "Deleting" a CVS file (moving it to the Attic) entails
a commit which will alter the file contents (to add a log message
stating that the file has been deleted).  Even outside the repository,
CTM is tracking checked out CVS files and virtually all files have a
$FreeBSD$ line which will be different for two otherwise identical
files.

Being able to detect files that have been repo-copied or deleted would
be useful - but both cases require mkCTM to detect a slightly changed
file.  The Attic case is fairly simple to pick up but you can't detect
arbitrary repo-copies.

If you did decide to define a new CTM statement type, an updated ctm(1)
would need to be distributed well before the deltas started using the
new features.  It would be nice if the updated ctm was distributed in a
-RELEASE before updated deltas were circulated - this would mean
someone could install the latest -RELEASE and be able to use CTM to
bring themselves up to date without having to rebuild ctm partway
through.  This is practical for 5.1, but 4.9 (if it will be released)
is probably too far away.  Remember to make sure the changes are MFC'd
to -STABLE as well as applied to -CURRENT.  It might also be an idea to
make a tarball of the updated /usr/src/usr.sbin/ctm/ctm available
somewhere near the CTM deltas on the web site.

>This would reduce the size of a great many deltas, but would be an
>incompatible change to the format.

I agree it would have saved ~56MB of deltas about a month ago - but I
doubt that event will be repeated.  I've just checked cvs-cur 9040
through 9109 (which is all I have quickly to hand) and a total of 13MB
has been cvs rm'd out of a total of 50MB (uncompressed, 11MB
compressed) of deltas.  (I wrote a perl script which looked for
matching CTMFR and CTMFM commands where the only difference was a
/Attic/ in the latter and accumulated the CTMFM size).  This suggests
we'd save about 25% of the delta size (very roughly).  Is this worth
the effort?

> I don't know if people would object.

I wouldn't object as long as the updated ctm was available a couple of
months before the updated deltas were generated.

>PS I have no idea how many people use ctm nowadays.  Do we have an count?

postmaster@freebsd.org should be able to tell you how many people are
on the various mailing lists.  An accurate count of people downloading
from the FTP sites would take more effort but could probably be derived
from the FTP server logs.

Peter
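
For anyone wanting to repeat the tally described above (matching CTMFR
and CTMFM commands that differ only by /Attic/), here is a rough sketch
in Python rather than the original perl, which isn't reproduced here.
It assumes the deltas have already been parsed into hypothetical
(command, path, size) tuples; that record shape and the function names
are made up for illustration, and the sample paths and sizes are
invented, not taken from real cvs-cur deltas.

```python
import posixpath


def attic_origin(path):
    """Return the path the file had before moving into Attic, else None."""
    head, base = posixpath.split(path)
    if posixpath.basename(head) != "Attic":
        return None
    # Drop the trailing Attic directory: dir/Attic/file -> dir/file
    return posixpath.join(posixpath.dirname(head), base)


def attic_move_bytes(records):
    """Sum the sizes of CTMFM records that match a CTMFR apart from /Attic/.

    `records` is an iterable of (command, path, size) tuples; this shape
    is an assumption of the sketch, not the on-disk CTM delta format.
    """
    records = list(records)
    removed = {path for cmd, path, _ in records if cmd == "CTMFR"}
    total = 0
    for cmd, path, size in records:
        if cmd == "CTMFM" and attic_origin(path) in removed:
            total += size
    return total


# Invented sample data: one cvs rm (remove + create in Attic) and two
# unrelated commands.
records = [
    ("CTMFR", "src/bin/foo/foo.c,v", 0),
    ("CTMFM", "src/bin/foo/Attic/foo.c,v", 4000),
    ("CTMFM", "src/bin/bar/bar.c,v", 1500),
    ("CTMFR", "src/bin/baz/baz.c,v", 0),
]

moved = attic_move_bytes(records)
total = sum(size for _, _, size in records)
print("%d of %d bytes are Attic moves (%.0f%%)" % (moved, total, 100.0 * moved / total))
```

Applied to the figures quoted above, 13MB out of 50MB works out to 26%,
which is where the "about 25%" estimate comes from.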