Date:      Fri, 12 Mar 2004 08:36:07 -0600
From:      "Jacques A. Vidrine" <nectar@FreeBSD.org>
To:        Oliver Eikemeier <eikemeier@fillmore-labs.com>
Cc:        Oliver Eikemeier <eik@FreeBSD.org>
Subject:   Re: cvs commit: ports/security/vuxml vuln.xml
Message-ID:  <20040312143607.GN8574@lum.celabo.org>
In-Reply-To: <4050C334.4030707@fillmore-labs.com>
References:  <200403111722.i2BHMXZV076423@repoman.freebsd.org> <20040311180138.GD7809@lum.celabo.org> <4050AD93.6080904@fillmore-labs.com> <20040311183125.GA7973@lum.celabo.org> <4050C334.4030707@fillmore-labs.com>

On Thu, Mar 11, 2004 at 08:51:16PM +0100, Oliver Eikemeier wrote:
> Since history is considered very valuable in the FreeBSD project, I guess
> I would prefer that over a slight runtime optimization for certain tools.
> 
> How much time does it take to produce a sorted file once and cache that?

I agree, I would not give a `slight runtime optimization' higher
priority than history readability.  But using chronologically sorted
input makes more than a `slight' difference in some cases.

Obviously any tool that will be called frequently (e.g. once for every
port built) should do some caching of data.  Since the input is in
chronological order, such tools need read only a minimal amount of the
input in order to determine whether or not the cache needs updating.
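
As a minimal sketch of that idea, and not a description of any existing
tool: the function below reads entries from the top of the input and
stops at the first one the cache already knows.  It assumes the newest
entries come first and that each entry is a `vuln' element with a
unique `vid' attribute; the function name and the sample data are
hypothetical.

```python
import io
import xml.etree.ElementTree as ET


def new_entry_ids(stream, newest_cached_vid):
    """Return the vids that appear before the newest cached entry.

    Assumes entries are sorted newest-first, so parsing can stop as
    soon as the cached vid is seen.
    """
    fresh = []
    # "start" events fire when an opening tag is parsed; its attributes
    # are already available, so entry bodies need not be read fully.
    for _event, elem in ET.iterparse(stream, events=("start",)):
        tag = elem.tag.rsplit("}", 1)[-1]  # drop any XML namespace prefix
        if tag != "vuln":
            continue
        vid = elem.get("vid")
        if vid == newest_cached_vid:
            break  # everything from here on is already cached
        fresh.append(vid)
    return fresh


# Hypothetical sample input, newest entry first.
SAMPLE = b"""<vuxml>
  <vuln vid="c"><topic>newest</topic></vuln>
  <vuln vid="b"><topic>middle</topic></vuln>
  <vuln vid="a"><topic>oldest</topic></vuln>
</vuxml>"""

print(new_entry_ids(io.BytesIO(SAMPLE), "b"))  # prints ['c']
```

If the cached entry is already first in the file, the result is empty
and the cache is known to be current after reading a single entry.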

One could simply use the file timestamp, but only in a limited number
of situations: (a) you actually have a file and its timestamp can be
trusted to be accurate; and (b) you don't care if updating the cache
requires starting over and reading the entire input.

Some real-world scenarios that I imagined where it matters:

 (1) Download VuXML periodically.  One must be careful to preserve
     timestamps.  Hopefully an appropriate timestamp is available
     via the download protocol.
 
 (2) Stream new updates.  A tool that maintains a cache may check
     a network resource periodically for updates.  Using e.g. HTTP,
     it need only download the first few `new' entries, rather than
     downloading the entire file every time.

Considering that in a few years' time, the VuXML file could be
multi-megabyte, it seems like a good idea to avoid downloading the
entire file if possible.  Of course, other tools can take care of this
for you, e.g. CVSup or rsync.  However, there is something to be said
for being able to publish a VuXML file via HTTP or other `dumb' protocol
and still get such efficiencies, especially if there could be thousands
of downloads per day.
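
One way such a partial download could look over plain HTTP is a Range
request for the first few kilobytes.  This is only a sketch under
stated assumptions: the server must honor Range requests (answering
206 Partial Content), the newest entries must sit at the top of the
file, and the URL and function names are made up for illustration.

```python
import urllib.request


def build_range_request(url, nbytes):
    """Build a request asking only for the first nbytes of url."""
    if nbytes <= 0:
        raise ValueError("nbytes must be positive")
    req = urllib.request.Request(url)
    # Byte ranges are inclusive, so the first nbytes are 0..nbytes-1.
    req.add_header("Range", f"bytes=0-{nbytes - 1}")
    return req


def fetch_head(url, nbytes=4096):
    """Return (data, partial) for the first nbytes of url.

    partial is True when the server answered 206 Partial Content.  A
    server that ignores Range answers 200 with the whole file; in that
    case only nbytes are read before the connection is closed.
    """
    req = build_range_request(url, nbytes)
    with urllib.request.urlopen(req, timeout=30) as resp:
        return resp.read(nbytes), resp.status == 206


# Hypothetical URL; no network access happens until fetch_head is called.
req = build_range_request("http://example.invalid/vuln.xml", 4096)
print(req.get_header("Range"))  # prints bytes=0-4095
```

The fetched prefix may end in the middle of an entry, so a caller
would have to discard the trailing incomplete entry or ask for more.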


But perhaps, after all, this part is over-engineered.  I don't like the
difficulty in reading `diffs' that is a side-effect, either.  One could
require that content changes and sorting always be done in separate
commits, of course, but that could be an odious requirement.  Tools must
implement more complex behavior to take advantage of the chronological
sorting (but of course they can just `play dumb' too).

So in the end, I guess I'm on the fence about it.  I'd like to keep the
status quo (chronological sorting) for now, since I have a tool that
uses it :-), but I'd like to hear more convincing arguments either way.

Cheers,
-- 
Jacques Vidrine / nectar@celabo.org / jvidrine@verio.net / nectar@freebsd.org


