From owner-freebsd-www@FreeBSD.ORG Thu Jun 8 09:03:27 2006
X-Original-To: freebsd-www@freebsd.org
Delivered-To: freebsd-www@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 68A0D16AC1F
	for ; Thu, 8 Jun 2006 06:34:48 +0000 (UTC)
	(envelope-from noackjr@alumni.rice.edu)
Received: from smtp101.biz.mail.re2.yahoo.com (smtp101.biz.mail.re2.yahoo.com [68.142.229.215])
	by mx1.FreeBSD.org (Postfix) with SMTP id 7E09943D48
	for ; Thu, 8 Jun 2006 06:34:46 +0000 (GMT)
	(envelope-from noackjr@alumni.rice.edu)
Received: (qmail 6236 invoked from network); 8 Jun 2006 06:34:45 -0000
Received: from unknown (HELO optimator.noacks.org) (noackjr@supercrime.org@24.99.22.177 with login)
	by smtp101.biz.mail.re2.yahoo.com with SMTP; 8 Jun 2006 06:34:45 -0000
Received: from localhost (localhost [127.0.0.1])
	by optimator.noacks.org (Postfix) with ESMTP id 1AA346437;
	Thu, 8 Jun 2006 02:34:45 -0400 (EDT)
X-Virus-Scanned: amavisd-new at noacks.org
Received: from optimator.noacks.org ([127.0.0.1])
	by localhost (optimator.noacks.org [127.0.0.1]) (amavisd-new, port 10024)
	with LMTP id YUxj4V9+KwB2; Thu, 8 Jun 2006 02:34:43 -0400 (EDT)
Received: from compgeek.noacks.org (compgeek [192.168.1.10])
	by optimator.noacks.org (Postfix) with ESMTP id 2B5C363CE;
	Thu, 8 Jun 2006 02:34:43 -0400 (EDT)
Received: from [127.0.0.1] (localhost [127.0.0.1])
	by compgeek.noacks.org (8.13.6/8.13.6) with ESMTP id k586YgP8000517;
	Thu, 8 Jun 2006 02:34:42 -0400 (EDT)
	(envelope-from noackjr@alumni.rice.edu)
Message-ID: <4487C4FB.5010008@alumni.rice.edu>
Date: Thu, 08 Jun 2006 02:34:35 -0400
From: Jonathan Noack
User-Agent: Thunderbird 1.5.0.4 (X11/20060606)
MIME-Version: 1.0
To: Anton Berezin
References: <20060607122327.GA80285@heechee.tobez.org>
In-Reply-To: <20060607122327.GA80285@heechee.tobez.org>
X-Enigmail-Version: 0.94.0.0
OpenPGP: id=991D8195; url=http://www.noacks.org/cert/noackjr.asc
Content-Type: multipart/signed; micalg=pgp-sha1;
	protocol="application/pgp-signature";
	boundary="------------enig67BB9EAFF0DDFF9364F3A86B"
Cc: freebsd-www@freebsd.org, scop@freebsd.org
Subject: Re: [PATCH] Fix cvsweb.cgi to grok logs pasted into logs
X-BeenThere: freebsd-www@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: noackjr@alumni.rice.edu
List-Id: FreeBSD Project Webmasters
X-List-Received-Date: Thu, 08 Jun 2006 09:03:28 -0000

This is an OpenPGP/MIME signed message (RFC 2440 and 3156)
--------------enig67BB9EAFF0DDFF9364F3A86B
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable

On 06/07/06 08:23, Anton Berezin wrote:
> Basically it uses a hack of feeding rlog with -z+00 option, which
> happens to modify the dates in the resulting log from "2006/06/05
> 00:00:35" to "2006-06-05 00:00:35+00". The resulting output is still
> somewhat ambiguous, but this ambiguity is *substantially* less likely to
> confuse cvsweb, unless one specially crafts the commit log.

... and users shouldn't specify "-z+00" because UTC is already the
default in rlog. Brilliant. This is an ingenious idea and is better
than any of the hacks I considered. See my "proper" solution further
down.

> This way of fixing the problem is admittedly going for a low-hanging
> fruit, since the proper proper PROPER solution would involve not using
> rlog at all and doing all the RCS parsing in-place.

Actually, doing the RCS parsing in-place is *really* hard to do without
destroying performance. I tried for about a week before throwing in the
towel.
Consider this:

$ time rlog /home/ncvs/src/UPDATING,v > /dev/null

real    0m0.045s
user    0m0.045s
sys     0m0.000s
$ time perl -e 'open(FILE,"/home/ncvs/src/UPDATING,v"); while(<FILE>){print $_;}' > /dev/null

real    0m0.059s
user    0m0.059s
sys     0m0.000s
$ time perl -e 'open(FILE,"/home/ncvs/src/UPDATING,v"); while(read(FILE,$buffer,4096)){print $buffer;}' > /dev/null

real    0m0.014s
user    0m0.014s
sys     0m0.000s

Just grabbing each line and printing it out in Perl takes longer than it
does for rlog to produce usable output! The reason for this is that
grabbing a line in Perl invokes the regex engine, which is expensive. I
tried using a buffer and tokenizing input myself (into either characters
or logical tokens), but parsing the rlog output was twice as fast as my
best effort to do the RCS parsing in-place. Fixing rlog is a much
better "proper" solution.

> The patched up cvsweb showing FreeBSD repository is currently running
> here http://www.tobez.org/cgi-bin/cvsweb.cgi , so that you can see the
> difference for yourself, for example:
>
> http://www.freebsd.org/cgi/cvsweb.cgi/ports/devel/p5-Config-Fast/Makefile
> and
> http://www.tobez.org/cgi-bin/cvsweb.cgi/ports/devel/p5-Config-Fast/Makefile

There are two places where CVSweb parses rlog output; you forgot to
update getDirLogs (note that UPDATING is missing):
http://www.tobez.org/cgi-bin/cvsweb.cgi/src/?only_with_tag=RELENG_6_1

> This message's purpose is two-fold:
>
> - I would like the patch to be incorporated upstream, hence the relevant
> people are Cc'ed;

I guess you mean me :). This issue really annoys me and as the new
CVSweb maintainer I'm determined to fix it. I really like your idea and
I think I will incorporate it into CVSweb. Right now I am in the middle
of a modularization/rewrite, so it may be a while before it hits the
tree.

Also, I am pursuing what I consider the "proper" solution: fixing rlog!
I worked with the RCS folks to hash out a commit log byte count option
for rlog.
This allows CVSweb to know exactly how long to read for the commit log,
eliminating any ambiguity.

Patches (including my RCS in-place attempts):
http://www.noacks.org/cvsweb/

Test site:
http://www.noacks.org/cgi-test/cvsweb.cgi

> - I would like the patch to be incorporated into our running cvsweb.

I think it would be great if we can update the main site to 3.0.x. I
hope to work with the www@ folks to make this happen.

-Jonathan

-- 
Jonathan Noack | noackjr@alumni.rice.edu | OpenPGP: 0x991D8195

--------------enig67BB9EAFF0DDFF9364F3A86B
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: OpenPGP digital signature
Content-Disposition: attachment; filename="signature.asc"

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.3 (FreeBSD)

iD8DBQFEh8UCUFz01pkdgZURAh/lAJ9C6CU67Q6fbp0HQ91pGr8g6JCjHQCfSXFF
qhBzjrz1mY6CDC906bxzRdA=
=D13u
-----END PGP SIGNATURE-----

--------------enig67BB9EAFF0DDFF9364F3A86B--
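
The date-format trick discussed in the message can be illustrated with a
small sketch. It is written in Python rather than CVSweb's Perl and is
not taken from either patch. Only the two date formats come from the
message; the surrounding "date: ...;  author: ..." line layout is an
assumption about rlog output, and is_revision_date_line is a
hypothetical helper name. The idea is that a parser which accepts only
the -z+00 style date is much less likely to mistake a log pasted from
plain rlog output for a new revision header.

```python
import re

# Assumed layout of an rlog revision date line. Only the two date formats
# ("2006/06/05 00:00:35" and "2006-06-05 00:00:35+00") come from the message;
# the "date: " prefix and trailing ";" are assumptions for this sketch.
ZONED_DATE_LINE = re.compile(
    r"^date: \d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\+00;"
)


def is_revision_date_line(line: str) -> bool:
    """Return True only for date lines written in the -z+00 style."""
    return ZONED_DATE_LINE.match(line) is not None


if __name__ == "__main__":
    # Hypothetical sample lines, not real repository output.
    genuine = "date: 2006-06-05 00:00:35+00;  author: someone;  state: Exp;"
    pasted = "date: 2006/06/05 00:00:35;  author: someone;  state: Exp;"
    print(is_revision_date_line(genuine))  # True
    print(is_revision_date_line(pasted))   # False
```

As the quoted text points out, this only narrows the ambiguity: a commit
log that itself contains -z+00 style output would still pass the check.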