Date: Mon, 30 Mar 1998 12:25:00 -0800 (PST)
From: Simon Shapiro <shimon@simon-shapiro.org>
To: nik@iii.co.uk
Cc: Amancio Hasty <hasty@rah.star-gate.com>,
    Satoshi Asami <asami@FreeBSD.ORG>, scrappy@hub.org,
    andreas@klemm.gtn.com, freebsd-database@FreeBSD.ORG,
    Wolfram Schneider <wosch@cs.tu-berlin.de>,
    John Fieber <jfieber@indiana.edu>
Subject: Re: Mailing list search interface
Message-ID: <XFMail.980330122500.shimon@simon-shapiro.org>
In-Reply-To: <19980330164024.47510@iii.co.uk>
On 30-Mar-98 nik@iii.co.uk wrote:

...

> My disk is single 2GB Atlas II, with tagged queuing turned *off* (because
> of buggy firmware which I haven't updated yet).

Ah! This is useful information. Thanks!

>> By quick back-of-an-envelope calculations, this is slower than
>> the current indexing scheme on hub by at least a factor of 10.
>
> The time above was for creation of the HTML archives and for indexing,
> not just indexing alone.

This is something we need to keep in mind. Generating 100% output
coverage for (probably) less than 10% need is wasteful.

>> Indexing anything large is typically an I/O bound operation and
>> when you start indexing much more than can fit in RAM, your
>> performance will degrade dramatically, so it is probably slower
>> by much more than a factor of 10.
>
> Don't know. I'll grab last years archive of -hackers (or another one,
> if there's another you think would be more representative) and try that.
> I can bring back figures for the time to create the entire archive (and
> index), the time just to index, and the time to add a new message and
> then reindex.

Listen to the man :-) It gets worse. Extrapolation on a non-linear
function is called gambling :-) You will run into scaling problems at
certain sizes. The worsening can be dramatic.

> I'd try this with the whole of the archives, but I don't have the spare
> disk space (yet).

I have. Is there an efficient way to get the whole archive here?
Downloading on a modem is NOT considered efficient.

> Are those survey results available online somewhere?

Please!

> A hybrid system is on my list of things to build here (but it'll be
> Oracle based). I haven't investigated Postgres enough to know if it's
> up to the task.

Oracle based is good. Now, please tell us how to run Oracle on FreeBSD,
legally, and with source available.

PostgreSQL is up to the task. This is not a dramatically complex
database problem. Pretty much a linear table, with the text searching
TBD.
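
To make "pretty much a linear table" concrete, here is a minimal sketch
of one possible layout: one row per message, keyed by Message-ID. It is
an illustration only, not a design anyone in this thread has proposed.
It uses Python's bundled sqlite3 module as a stand-in so that it runs
without a server (PostgreSQL would take a very similar schema). The
table name, the column names, and the helpers open_archive, add_message
and search are all hypothetical, and the LIKE scan is only a placeholder
for the text searching that is still TBD.

```python
import sqlite3


def open_archive(path=":memory:"):
    """Open (or create) the archive database and make sure the table exists."""
    conn = sqlite3.connect(path)
    # Hypothetical schema: one flat table, one row per archived message.
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS messages (
            message_id TEXT PRIMARY KEY,
            list_name  TEXT NOT NULL,
            sender     TEXT NOT NULL,
            subject    TEXT NOT NULL,
            sent_at    TEXT NOT NULL,  -- ISO 8601, so text order is date order
            body       TEXT NOT NULL
        )
        """
    )
    # Lets "everything on one list, in date order" avoid a full table scan.
    conn.execute(
        "CREATE INDEX IF NOT EXISTS messages_by_list_date "
        "ON messages (list_name, sent_at)"
    )
    return conn


def add_message(conn, message_id, list_name, sender, subject, sent_at, body):
    """Store one message; nothing else has to be regenerated."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO messages VALUES (?, ?, ?, ?, ?, ?)",
            (message_id, list_name, sender, subject, sent_at, body),
        )


def search(conn, list_name, word):
    """Placeholder text search: a substring scan over message bodies.

    A real text index would replace this. Note that % and _ in `word`
    act as LIKE wildcards here.
    """
    rows = conn.execute(
        "SELECT message_id FROM messages "
        "WHERE list_name = ? AND body LIKE ? "
        "ORDER BY sent_at",
        (list_name, "%" + word + "%"),
    )
    return [row[0] for row in rows]
```

With a layout like this, adding a new message is a single insert, and
output for a given message can be produced when someone asks for it
rather than generated for the whole archive up front.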
----------

Sincerely Yours,

Simon Shapiro
Shimon@Simon-Shapiro.ORG
Voice: 503.799.2313

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-database" in the body of the message