Date: Thu, 08 Jun 2006 02:34:35 -0400
From: Jonathan Noack <noackjr@alumni.rice.edu>
To: Anton Berezin <tobez@freebsd.org>
Cc: freebsd-www@freebsd.org, scop@freebsd.org
Subject: Re: [PATCH] Fix cvsweb.cgi to grok logs pasted into logs
Message-ID: <4487C4FB.5010008@alumni.rice.edu>
In-Reply-To: <20060607122327.GA80285@heechee.tobez.org>
References: <20060607122327.GA80285@heechee.tobez.org>

On 06/07/06 08:23, Anton Berezin wrote:
> Basically it uses a hack of feeding rlog with -z+00 option, which
> happens to modify the dates in the resulting log from "2006/06/05
> 00:00:35" to "2006-06-05 00:00:35+00".  The resulting output is still
> somewhat ambiguous, but this ambiguity is *substantially* less likely to
> confuse cvsweb, unless one specially crafts the commit log.

... and users shouldn't specify "-z+00" because UTC is already the
default in rlog.  Brilliant.  This is an ingenious idea and is better
than any of the hacks I considered.  See my "proper" solution further
down.

> This way of fixing the problem is admittedly going for a low-hanging
> fruit, since the proper proper PROPER solution would involve not using
> rlog at all and doing all the RCS parsing in-place.

Actually, doing the RCS parsing in-place is *really* hard to do without
destroying performance.  I tried for about a week before throwing in the
towel.  Consider this:

$ time rlog /home/ncvs/src/UPDATING,v > /dev/null
real    0m0.045s
user    0m0.045s
sys     0m0.000s

$ time perl -e 'open(FILE,"/home/ncvs/src/UPDATING,v");
while(<FILE>){print $_;}' > /dev/null
real    0m0.059s
user    0m0.059s
sys     0m0.000s

$ time perl -e 'open(FILE,"/home/ncvs/src/UPDATING,v");
while(read(FILE,$buffer,4096)){print $buffer;}' > /dev/null
real    0m0.014s
user    0m0.014s
sys     0m0.000s

Just grabbing each line and printing it out in Perl takes longer than it
does for rlog to produce usable output!  The reason for this is that
grabbing a line in Perl invokes the regex engine, which is expensive.  I
tried using a buffer and tokenizing input myself (into either characters
or logical tokens), but parsing the rlog output was twice as fast as my
best effort to do the RCS parsing in-place.  Fixing rlog is a much
better "proper" solution.
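
To make the -z+00 point from the top of this message concrete, here is a minimal sketch of why the changed date format helps.  It is written in Python for brevity (CVSweb itself is Perl), and the regexes, the function name split_revisions, and the sample log with authors "alice" and "bob" are all made up for illustration; this is not the code from the patch.  The idea is that a separator line only starts a new revision when the date line that follows uses the "2006-06-05 00:00:35+00" form, so rlog output pasted into a commit message in the old "2006/06/05 00:00:35" form stays part of that message.

```python
import re

SEPARATOR = "-" * 28  # the dashed line rlog prints between revisions

# Assumed shapes of the two lines that follow a real separator when
# rlog is run with -z+00 (date format as described in the message).
REVISION = re.compile(r"^revision (\d+(?:\.\d+)+)")
ISO_DATE = re.compile(r"^date: \d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\+00;")


def split_revisions(lines):
    """Return a list of (revision, log_lines) pairs.

    A separator counts as a revision boundary only if the next two
    lines look like a revision line and a -z+00 style date line.
    """
    entries = []
    current = None
    i = 0
    while i < len(lines):
        is_header = (
            lines[i] == SEPARATOR
            and i + 2 < len(lines)
            and REVISION.match(lines[i + 1]) is not None
            and ISO_DATE.match(lines[i + 2]) is not None
        )
        if is_header:
            current = (REVISION.match(lines[i + 1]).group(1), [])
            entries.append(current)
            i += 3  # skip separator, revision line, and date line
        else:
            if current is not None:
                current[1].append(lines[i])
            i += 1
    return entries


# Made-up sample: revision 1.2's log contains pasted old-format output.
SAMPLE = """\
----------------------------
revision 1.2
date: 2006-06-05 00:00:35+00;  author: alice;  state: Exp;  lines: +1 -1
Merge fix; original log pasted below:
----------------------------
revision 1.1
date: 2006/06/01 12:00:00;  author: bob;  state: Exp;
Initial import
----------------------------
revision 1.1
date: 2006-06-01 12:00:00+00;  author: bob;  state: Exp;
Initial import
""".splitlines()

for rev, log in split_revisions(SAMPLE):
    print(rev, len(log))  # prints "1.2 5" then "1.1 1"
```

As Anton notes, this is still ambiguous: a log message that contains pasted -z+00 style output would be split in the wrong place.
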

> The patched up cvsweb showing FreeBSD repository is currently running
> here http://www.tobez.org/cgi-bin/cvsweb.cgi , so that you can see the
> difference for yourself, for example:
>
> http://www.freebsd.org/cgi/cvsweb.cgi/ports/devel/p5-Config-Fast/Makefile
> and
> http://www.tobez.org/cgi-bin/cvsweb.cgi/ports/devel/p5-Config-Fast/Makefile

There are two places where CVSweb parses rlog output; you forgot to
update getDirLogs (note that UPDATING is missing):
http://www.tobez.org/cgi-bin/cvsweb.cgi/src/?only_with_tag=RELENG_6_1

> This message's purpose is two-fold:
>
> - I would like the patch to be incorporated upstream, hence the relevant
>   people are Cc'ed;

I guess you mean me :).  This issue really annoys me and as the new
CVSweb maintainer I'm determined to fix it.  I really like your idea and
I think I will incorporate it into CVSweb.  Right now I am in the middle
of a modularization/rewrite, so it may be a while before it hits the tree.

Also, I am pursuing what I consider the "proper" solution: fixing rlog!
I worked with the RCS folks to hash out a commit log byte count option
for rlog.  This allows CVSweb to know exactly how long to read for the
commit log, eliminating any ambiguity.

Patches (including my RCS in-place attempts):
http://www.noacks.org/cvsweb/

Test site:
http://www.noacks.org/cgi-test/cvsweb.cgi

> - I would like the patch to be incorporated into our running cvsweb.

I think it would be great if we can update the main site to 3.0.x.  I
hope to work with the www@ folks to make this happen.
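
For the byte count idea above, a rough sketch of why a known length removes the ambiguity.  The option's name and actual output format are not given in this message, so the framing used here (a decimal count on its own line, followed by exactly that many bytes of log text) is purely an assumption for illustration and is not rlog's format.  It is in Python rather than CVSweb's Perl, and the names frame and read_length_prefixed_logs and the sample logs are made up.

```python
import io


def frame(logs):
    """Encode log messages using the assumed '<count>\\n<bytes>' framing."""
    out = bytearray()
    for log in logs:
        body = log.encode("utf-8")
        out += str(len(body)).encode("ascii") + b"\n" + body
    return bytes(out)


def read_length_prefixed_logs(stream):
    """Yield log messages from a binary stream in the assumed framing.

    The reader never inspects the message text for separators; it reads
    exactly as many bytes as the count says, so pasted rlog output inside
    a message cannot be mistaken for a revision boundary.
    """
    while True:
        header = stream.readline()
        if not header:
            return  # clean end of input
        count = int(header.strip())
        body = stream.read(count)
        if len(body) != count:
            raise ValueError("truncated log message")
        yield body.decode("utf-8", errors="replace")


# Made-up logs; the first one contains pasted rlog-style output.
LOGS = [
    "Fix typo\n"
    "----------------------------\n"
    "revision 1.1\n"
    "date: 2006/06/01 12:00:00;  author: bob;  state: Exp;\n"
    "Initial import\n",
    "Initial import\n",
]

recovered = list(read_length_prefixed_logs(io.BytesIO(frame(LOGS))))
print(recovered == LOGS)  # prints "True"
```

The trade-off against the -z+00 approach is that this needs a change on the rlog side, whereas -z+00 works with an unmodified rlog.
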

-Jonathan

-- 
Jonathan Noack | noackjr@alumni.rice.edu | OpenPGP: 0x991D8195