Date: Fri, 12 Mar 2004 10:38:19 -0600
From: "Jacques A. Vidrine" <nectar@FreeBSD.org>
To: Oliver Eikemeier <eikemeier@fillmore-labs.com>
Cc: Oliver Eikemeier <eik@FreeBSD.org>
Subject: Re: cvs commit: ports/security/vuxml vuln.xml
Message-ID: <20040312163819.GA8990@lum.celabo.org>
In-Reply-To: <4051D8C7.7070001@fillmore-labs.com>
References: <200403111722.i2BHMXZV076423@repoman.freebsd.org>
    <20040311180138.GD7809@lum.celabo.org>
    <4050AD93.6080904@fillmore-labs.com>
    <20040311183125.GA7973@lum.celabo.org>
    <4050C334.4030707@fillmore-labs.com>
    <20040312143607.GN8574@lum.celabo.org>
    <4051D8C7.7070001@fillmore-labs.com>

On Fri, Mar 12, 2004 at 04:35:35PM +0100, Oliver Eikemeier wrote:
> Normally the file has a timestamp, if not you could add a timestamp at the
> beginning or require tools that update the VuXML file to invalidate the
> cache or produce a timestamp. That is even faster than reading part of the
> file.

Yes, we went over that.  As I already described, a file timestamp
allows the application to either not read the file, or read the whole
file.  Entry timestamps and chronological sorting allow the application
to only read a tiny portion of the file.  We're not talking about file
timestamps.

Also, tools don't update VuXML, humans do.  Tools process VuXML.  Do
not confuse the document with the database.  (oops, *requiring* that
the document be sorted starts to cross that line, too :-)

> Requiring moving modified entries to the top for tools to work
> properly seems like causing more problems than it will ever solve.

You've identified precisely one problem: reading diffs.  This is an
annoying problem, granted.  You need not pretend that it is a larger
issue than that.

> >One could simply use the file timestamp in a limited number of
> >situations: (a) You actually have a file and the timestamp can be trusted
> >to be accurate; and (b) You don't care if updating the cache requires
> >starting over and reading the entire input.
> >
> >Some real-world scenarios that I imagined where it matters:
> >
> >  (1) Download VuXML periodically.  One must be careful to preserve
> >      timestamps.  Hopefully an appropriate timestamp is available
> >      via the download protocol.
>
> You would compress the file anyway.

Compression has nothing to do with preserving the timestamp.  Maybe you
are thinking of archive formats.

You *still* must be careful to preserve the timestamp.  Hopefully both
the source system and the downloading system have the same date.  An
advantage of using date strings in the entries to determine what needs
to be processed is that *actual* dates set on systems are irrelevant.
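
A minimal sketch of how a tool might use entry dates together with
newest-first ordering to process only the head of the file.  This is an
illustration, not the tool mentioned in this message.  It assumes a
simplified layout: `vuln` elements with a `vid` attribute and a `dates`
child holding `entry` and optionally `modified`, with dates written as
YYYY-MM-DD strings so that string comparison matches chronological
order.  The function name `new_entries` and the sample data are
hypothetical.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical, simplified sample: newest entry first.
SAMPLE = b"""<vuxml>
  <vuln vid="c"><dates><entry>2004-03-12</entry></dates></vuln>
  <vuln vid="b"><dates><entry>2004-03-01</entry><modified>2004-03-11</modified></dates></vuln>
  <vuln vid="a"><dates><entry>2004-02-20</entry></dates></vuln>
</vuxml>"""


def new_entries(stream, last_seen):
    """Yield (vid, date) for entries newer than last_seen.

    Relies on the file being sorted newest-first: the first entry that
    is not newer than last_seen ends the scan.
    """
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag != "vuln":
            continue
        # Prefer the modification date, fall back to the entry date.
        date = elem.findtext("dates/modified") or elem.findtext("dates/entry")
        if date is None:
            continue  # no usable date; skip this entry in the sketch
        if date <= last_seen:
            break  # everything after this is older; stop processing
        yield elem.get("vid"), date
        elem.clear()  # release the subtree we no longer need


if __name__ == "__main__":
    # The cache was last updated on 2004-03-10 (a date string, not a clock).
    print(list(new_entries(io.BytesIO(SAMPLE), "2004-03-10")))
    # prints [('c', '2004-03-12'), ('b', '2004-03-11')]
```

Only the date strings inside the document are compared, so the clocks
on the publishing and downloading systems do not enter into it.  If the
file is not sorted, the early `break` silently misses entries, which is
the fragility raised in this thread.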

> >  (2) Stream new updates.  A tool that maintains a cache may check
> >      a network resource periodically for updates.  Using e.g. HTTP,
> >      it need only download the first few `new' entries, rather than
> >      downloading the entire file every time.
>
> Sort it before distributing. Distribute diffs. Be creative.

This `creativity' creates duplication.  I much prefer that there is as
little in between the original VuXML file and the processing
application as possible.  I don't want to constrain distribution.  I'd
prefer that tools be able to process the file whether they fetch the
file from CVS, CVSup, cvsweb, an HTTP server, an FTP server, or
whatever.

Actually, ``sort it before distributing'' is exactly the method that is
established: additions/modifications to ports/security/vuxml/vuln.xml
are to be chronologically sorted.

> Please, we live in the 21st century. You are not really trying to tell me
> that a file has to be sorted by humans to be efficiently downloadable?

I am saying that the file published at ports/security/vuxml/vuln.xml
needs to be sorted to be used in these scenarios.  Whether or not it is
sorted by humans is not relevant (similar to, say, ports category
Makefiles).

> >But perhaps, after all, this part is over-engineered.  I don't like the
> >difficulty in reading `diffs' that is a side-effect, either.  One could
> >require that content changes and sorting always be done in separate
> >commits, of course, but that could be an odious requirement.  Tools must
> >implement more complex behavior to take advantage of the chronological
> >sorting (but of course they can just `play dumb' too).
>
> Cvsweb is your friend. It is *so* easy to grasp what I've done here:
>
> <http://cvsweb.freebsd.org/ports/security/vuxml/vuln.xml.diff?r1=1.39&r2=1.40&f=h>
>
> It is *much* harder to do when you have to read the entry twice.

I cannot for the life of me connect what you just wrote with my
paragraph above.  Did I not just write about the negative side-effects?

> >So in the end, I guess I'm on the fence about it.  I'd like to keep the
> >status quo (chronological sorting) for now--- I have a tool that uses
> >it :-) ---, but I'd like to hear more convincing arguments either way.
>
> Someone *will* do a commit that will break your tool, for sure.

Of course.  Such happens.  The base system and ports break, also.  What
exactly is your point?

> Should I send you an XSLT script that sorts the file?

Sure!  You might post it for general use.  Even better, if you could
post (or send privately) a modification to ports/security/vuxml for
review that adds a `make sort' or similar target.  I suggest that the
sort be stable, but that isn't strictly necessary.

> When the database is
> multi-megabyte you won't have it in CVS anyway, but use some XML database
> instead.

Huh?

Anyway, I understand your objection.  I have explained why it is like
it is (now twice) and admitted that it may be overkill.  If you have
something still more to add, great.  Otherwise, I'm disinclined to
abandon the sorting just yet.  I still believe it is useful, but maybe
I'm the only one who thinks so :-)

Let me spell it out again so that there are no more wasted postings
repeating what has already been said:

   With the chronological sorting, tools might be allowed to process
   only a portion of the VuXML file.

   Without chronological sorting, tools must process all or none of
   a VuXML file.

   Chronological sorting makes diffs between versions harder to read.

This is great, this is exactly the kind of discussion I want to see
before making VuXML ``really really official'' and drafting something
for the security web page / Porter's Handbook / and whatever.  So far
you've criticized practically every aspect of VuXML.  Thanks :-)

Cheers,
-- 
Jacques Vidrine / nectar@celabo.org / jvidrine@verio.net / nectar@freebsd.org