Date:      Thu, 08 Jun 2006 02:34:35 -0400
From:      Jonathan Noack <noackjr@alumni.rice.edu>
To:        Anton Berezin <tobez@freebsd.org>
Cc:        freebsd-www@freebsd.org, scop@freebsd.org
Subject:   Re: [PATCH] Fix cvsweb.cgi to grok logs pasted into logs
Message-ID:  <4487C4FB.5010008@alumni.rice.edu>
In-Reply-To: <20060607122327.GA80285@heechee.tobez.org>
References:  <20060607122327.GA80285@heechee.tobez.org>

On 06/07/06 08:23, Anton Berezin wrote:
> Basically it uses a hack of feeding rlog with -z+00 option, which
> happens to modify the dates in the resulting log from "2006/06/05
> 00:00:35" to "2006-06-05 00:00:35+00".  The resulting output is still
> somewhat ambiguous, but this ambiguity is *substantially* less likely to
> confuse cvsweb, unless one specially crafts the commit log.

... and users shouldn't specify "-z+00" because UTC is already the
default in rlog.  Brilliant.  This is an ingenious idea and is better
than any of the hacks I considered.  See my "proper" solution further down.
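
A minimal sketch of why the zoned date form helps, written in Python
purely for illustration (CVSweb itself is Perl).  It accepts only the
"2006-06-05 00:00:35+00" form quoted above and rejects the default
"2006/06/05 00:00:35" form that a pasted log would normally carry.  The
surrounding "date: ...;" line layout and the function name are
assumptions, not taken from the patch.

```python
import re

# Accept only the zoned form produced with -z+00, e.g.
# "2006-06-05 00:00:35+00".  The "date: ...;" framing around the
# timestamp is an assumption about the rlog header line layout.
ZONED_DATE = re.compile(
    r"^date: (\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\+00);"
)


def is_header_date_line(line):
    """Return True if the line looks like a real revision header date.

    Hypothetical helper: a date line in the old slash-separated format
    is treated as commit-log text rather than a revision header.
    """
    return ZONED_DATE.match(line) is not None
```

A log pasted into a commit message can still fool this check if it was
itself produced with -z+00, which is the remaining ambiguity mentioned
above.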

> This way of fixing the problem is admittedly going for a low-hanging
> fruit, since the proper proper PROPER solution would involve not using
> rlog at all and doing all the RCS parsing in-place.

Actually, doing the RCS parsing in-place is *really* hard to do without
destroying performance.  I tried for about a week before throwing in the
towel.  Consider this:

$ time rlog /home/ncvs/src/UPDATING,v > /dev/null
real    0m0.045s
user    0m0.045s
sys     0m0.000s
$ time perl -e 'open(FILE,"/home/ncvs/src/UPDATING,v");
while(<FILE>){print $_;}' > /dev/null
real    0m0.059s
user    0m0.059s
sys     0m0.000s
$ time perl -e 'open(FILE,"/home/ncvs/src/UPDATING,v");
while(read(FILE,$buffer,4096)){print $buffer;}' > /dev/null
real    0m0.014s
user    0m0.014s
sys     0m0.000s

Just grabbing each line and printing it out in Perl takes longer than it
does for rlog to produce usable output!  The reason for this is that
grabbing a line in Perl invokes the regex engine, which is expensive.  I
tried using a buffer and tokenizing input myself (into either characters
or logical tokens), but parsing the rlog output was twice as fast as my
best effort to do the RCS parsing in-place.  Fixing rlog is a much
better "proper" solution.
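
The comparison above can be reproduced in other languages along these
lines.  This is only a sketch in Python, so the absolute numbers and even
the ranking may differ from the Perl results; the function names are
made up for illustration, and the path to measure is left to the caller.

```python
import time


def read_by_line(path):
    """Return the file size in bytes, reading one line at a time."""
    total = 0
    with open(path, "rb") as f:
        for line in f:
            total += len(line)
    return total


def read_by_block(path, size=4096):
    """Return the file size in bytes, reading fixed-size blocks."""
    total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(size)
            if not chunk:
                break
            total += len(chunk)
    return total


def best_time(func, path, repeats=5):
    """Return the fastest wall-clock time of several runs, in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func(path)
        best = min(best, time.perf_counter() - start)
    return best
```

Calling best_time(read_by_line, path) and best_time(read_by_block, path)
on the same ",v" file gives two numbers to compare against the rlog
timing.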

> The patched up cvsweb showing FreeBSD repository is currently running
> here http://www.tobez.org/cgi-bin/cvsweb.cgi , so that you can see the
> difference for yourself, for example:
>
> http://www.freebsd.org/cgi/cvsweb.cgi/ports/devel/p5-Config-Fast/Makefile
> and
> http://www.tobez.org/cgi-bin/cvsweb.cgi/ports/devel/p5-Config-Fast/Makefile

There are two places where CVSweb parses rlog output; you forgot to
update getDirLogs (note that UPDATING is missing):
http://www.tobez.org/cgi-bin/cvsweb.cgi/src/?only_with_tag=RELENG_6_1

> This message's purpose is two-fold:
>
> - I would like the patch to be incorporated upstream, hence the relevant
>   people are Cc'ed;

I guess you mean me :).  This issue really annoys me and as the new
CVSweb maintainer I'm determined to fix it.  I really like your idea and
I think I will incorporate it into CVSweb.  Right now I am in the middle
of a modularization/rewrite, so it may be a while before it hits the tree.

Also, I am pursuing what I consider the "proper" solution: fixing rlog!
I worked with the RCS folks to hash out a commit log byte count option
for rlog.  This allows CVSweb to know exactly how long to read for the
commit log, eliminating any ambiguity.
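
To illustrate how a byte count removes the ambiguity, here is a small
Python sketch of a length-prefixed read.  The "log-bytes: N" header is
entirely hypothetical; the actual option and output format are whatever
the patches linked below define.

```python
import io


def read_commit_log(stream):
    """Read one length-prefixed commit log from a binary stream.

    HYPOTHETICAL format: a header line b"log-bytes: <N>\n" followed by
    exactly N bytes of commit log.  This is not real rlog output.
    """
    header = stream.readline()
    key, _, value = header.partition(b":")
    if key != b"log-bytes":
        raise ValueError("expected a log-bytes header, got %r" % header)
    length = int(value.decode("ascii").strip())
    # Read exactly `length` bytes; the content is never inspected, so
    # separator lines or "revision" lines pasted into the log are harmless.
    body = stream.read(length)
    if len(body) != length:
        raise ValueError("truncated commit log")
    return body
```

Because the reader never scans the log text for delimiters, a commit
message containing a pasted rlog excerpt is returned verbatim and parsing
resumes at the first byte after it.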

Patches (including my RCS in-place attempts):
http://www.noacks.org/cvsweb/

Test site:
http://www.noacks.org/cgi-test/cvsweb.cgi

> - I would like the patch to be incorporated into our running cvsweb.

I think it would be great if we can update the main site to 3.0.x.  I
hope to work with the www@ folks to make this happen.

-Jonathan

-- 
Jonathan Noack | noackjr@alumni.rice.edu | OpenPGP: 0x991D8195




