From: Anton Berezin
Date: Wed, 7 Jun 2006 14:23:27 +0200
To: freebsd-www@freebsd.org
Cc: scop@freebsd.org, noackjr@alumni.rice.edu
Subject: [PATCH] Fix cvsweb.cgi to grok logs pasted into logs
Message-ID: <20060607122327.GA80285@heechee.tobez.org>

It is a well-known fact that cvsweb is unable to cope with cvs logs
pasted into log messages.  It is not uncommon to see comments like this
on the FreeBSD mailing lists:

    Hmm, you should not quote cvs history in your cvs commit messages.
    It confuses the tools like CVSweb, which does not show anything
    below the dashed line for your commit.

To me, it sounds like "our tools suck, so don't do that", and I am not
sure I agree with the attitude.

So below is the patch against 3.0.6.  Basically it uses the hack of
passing the -z+00 option to rlog, which happens to change the dates in
the resulting log from "2006/06/05 00:00:35" to "2006-06-05 00:00:35+00".
The resulting output is still somewhat ambiguous, but this ambiguity is
*substantially* less likely to confuse cvsweb, unless one specially
crafts the commit log.

This way of fixing the problem is admittedly going for the low-hanging
fruit, since the proper proper PROPER solution would involve not using
rlog at all and doing all the RCS parsing in-place.

The patched-up cvsweb, showing the FreeBSD repository, is currently
running at

    http://www.tobez.org/cgi-bin/cvsweb.cgi

so that you can see the difference for yourself.  For example, compare

    http://www.freebsd.org/cgi/cvsweb.cgi/ports/devel/p5-Config-Fast/Makefile

and

    http://www.tobez.org/cgi-bin/cvsweb.cgi/ports/devel/p5-Config-Fast/Makefile

This message's purpose is two-fold:

- I would like the patch to be incorporated upstream, hence the
  relevant people are Cc'ed;

- I would like the patch to be incorporated into our running cvsweb.

The problem with the latter wish is that Simon says [:-)] that we still
run 2.something, and so it would be advisable to first do the update to
3.0.6.  Also, Simon does not have time to do that atm, hence the mail to
the list.

So, what do you folks think?

Cheers,
\Anton.

--- cvsweb.cgi.orig Wed Jun  7 11:57:23 2006
+++ cvsweb.cgi Wed Jun  7 13:59:24 2006
@@ -2655,9 +2655,9 @@ sub readLog($;$)
     $revision = defined($revision) ? "-r$revision" : '';
     if ($revision =~ /\./) {
       # Normal revision, not a branch/tag name.
-      exec($CMD{rlog}, $revision, $fullname) or exit -1;
+      exec($CMD{rlog}, "-z+00", $revision, $fullname) or exit -1;
     } else {
-      exec($CMD{rlog}, $fullname) or exit -1;
+      exec($CMD{rlog}, "-z+00", $fullname) or exit -1;
     }
   }
 
@@ -2696,50 +2696,76 @@ sub readLog($;$)
   # becomes smth like
   # revision 9.19 locked by: vassilii;
 
-  logentry:
+  my $state = 'wantrev';
+  my @data;
+  my $data = { bailout => "", log => "" };
 
-  while ($_ !~ LOG_FILESEPR) {
-    $_ = <$fh>;
-    last logentry if (!defined($_));    # EOF
-    if (/^revision (\d+(?:\.\d+)+)/) {
-      $rev = $1;
-      unshift(@allrevisions, $rev);
-    } elsif ($_ =~ LOG_FILESEPR || $_ =~ LOG_REVSEPR) {
-      next logentry;
+  LOGENTRY:
+  while (<$fh>) {
+    if ($state eq 'wantlog') {
+      if ($_ =~ LOG_FILESEPR || $_ =~ LOG_REVSEPR) {
+        push @data, $data if exists $data->{rev};
+        $data = { bailout => $_, log => "" };
+        $state = 'wantrev';
+      } else {
+        $data->{log} .= $_;
+      }
+    } elsif ($state eq 'wantrev') {
+      if ($_ =~ LOG_FILESEPR || $_ =~ LOG_REVSEPR) {
+        $data->{bailout} .= $_;
+        next LOGENTRY;
+      }
+      goto BAILOUT unless /^revision (\d+(?:\.\d+)+)/;
+      $data->{rev} = $1;
+      $data->{bailout} .= $_;
+      $state = 'wantdate';
+    } elsif ($state eq 'wantdate') {
+      if (
+        m|^date:\s+(\d+)-(\d+)-(\d+)\s+(\d+):(\d+):(\d+)\+00;\s+author:\s+(\S+);\s+state:\s+(\S+);\s+(lines:\s+([0-9\s+-]+))?|
+       )
+      {
+        my $yr = $1;
+        $yr -= 1900 if ($yr > 100);    # Damn 2-digit year routines :-)
+        $data->{date} = timegm($6, $5, $4, $3, $2 - 1, $yr);
+        $data->{author} = $7;
+        $data->{state} = $8;
+        $data->{difflines} = $10;
+        $state = 'wantbranches';
+      } else {
+        goto BAILOUT;
+      }
+    } elsif ($state eq 'wantbranches') {
+      $state = 'wantlog';
+      if (/^branches:\s/) {
+        next LOGENTRY;
+      } else {
+        redo LOGENTRY;
+      }
     } else {
-
-      # The rlog output is syntactically ambiguous. We must
-      # have guessed wrong about where the end of the last log
-      # message was.
-      # Since this is likely to happen when people put rlog output
-      # in their commit messages, don't even bother keeping
-      # these lines since we don't know what revision they go with
-      # any more.
-      next logentry;
+      fatal("500 Internal Error", 'Wrong state during RCS output parsing: %s', $_);
     }
-    $_ = <$fh>;
-    if (
-      m|^date:\s+(\d+)/(\d+)/(\d+)\s+(\d+):(\d+):(\d+);\s+author:\s+(\S+);\s+state:\s+(\S+);\s+(lines:\s+([0-9\s+-]+))?|
-     )
-    {
-      my $yr = $1;
-      $yr -= 1900 if ($yr > 100);    # Damn 2-digit year routines :-)
-      $date{$rev} = timegm($6, $5, $4, $3, $2 - 1, $yr);
-      $author{$rev} = $7;
-      $state{$rev} = $8;
-      $difflines{$rev} = $10;
+    next LOGENTRY;
+    BAILOUT:
+    if (@data) {
+      # bailout, pasted log entry detected
+      $data[-1]->{log} .= "$data->{bailout}$_";
+      $data = pop @data;
+      $state = 'wantlog';
     } else {
       fatal("500 Internal Error", 'Error parsing RCS output: %s', $_);
     }
-
-    line:
-    while (<$fh>) {
-      next line if (/^branches:\s/);
-      last line if ($_ =~ LOG_FILESEPR || $_ =~ LOG_REVSEPR);
-      $log{$rev} .= $_;
-    }
   }
   close($fh);
+
+  # postprocess
+  for $data (@data) {
+    unshift @allrevisions, $data->{rev};
+    $date{$data->{rev}} = $data->{date};
+    $author{$data->{rev}} = $data->{author};
+    $state{$data->{rev}} = $data->{state};
+    $difflines{$data->{rev}} = $data->{difflines};
+    $log{$data->{rev}} = $data->{log};
+  }
 
   @revorder = reverse sort { revcmp($a, $b) } @allrevisions;
 
-- 
An undefined problem has an infinite number of solutions.
-- Robert A. Humphrey