From owner-freebsd-chat Sun Dec 22 20:48:12 1996
Return-Path:
Received: (from root@localhost)
          by freefall.freebsd.org (8.8.4/8.8.4) id UAA15682
          for chat-outgoing; Sun, 22 Dec 1996 20:48:12 -0800 (PST)
Received: from time.cdrom.com (root@time.cdrom.com [204.216.27.226])
          by freefall.freebsd.org (8.8.4/8.8.4) with ESMTP id UAA15676
          for ; Sun, 22 Dec 1996 20:48:10 -0800 (PST)
Received: from time.cdrom.com (jkh@localhost [127.0.0.1])
          by time.cdrom.com (8.8.4/8.6.9) with ESMTP id UAA04163;
          Sun, 22 Dec 1996 20:48:05 -0800 (PST)
To: Marc Slemko
cc: freebsd-chat@FreeBSD.org
Subject: Re: mailing list archives
In-reply-to: Your message of "Sun, 22 Dec 1996 21:26:17 MST."
Date: Sun, 22 Dec 1996 20:48:05 -0800
Message-ID: <4159.851316485@time.cdrom.com>
From: "Jordan K. Hubbard"
Sender: owner-chat@FreeBSD.org
X-Loop: FreeBSD.org
Precedence: bulk

Erm. I wasn't exactly kidding about the idea of putting things into a
simplistic database of some sort. Since all *standard* storage formats
suck, and since we have, from the very beginning, also been archiving
this stuff without a whole heck of a lot of regard to how we might
actually *use* the information, doesn't this suggest a new approach to
the problem?

We archive all this mail just *in case* someone might use it, yet we
make almost no provisions for really making it all that easy to search
and view threads of discussion, nor do we provide a meaningful way of
aging and deleting (or archiving) older information.

Databases do all those things, and they let you easily come up with new
ways of viewing the data as you collect user feedback on what's useful
and what's not. Databases also, in most cases, deal with *large* amounts
of data efficiently. Seems like our needs to the tenth decimal place.

The only really big question is - how could we implement something like
this? There's gotta be at least one database weenie in the crowd here!
:-)

                                        Jordan

> Since all standard storage formats for mail archives have problems when
> you are dealing with this volume, how about for now just making a snapshot
> of the archives as they are right now available somewhere in whatever form
> they may be stored in.  I don't care if I need to ftp a 500 meg file;
> that's well under an hour if it is coming from wcarchive.
>
> Any format will be unmanageable for most people due to sheer volume.  If
> you know what you are looking for, less is a pretty good search utility.
>
> On Sun, 22 Dec 1996, Jordan K. Hubbard wrote:
>
> > > This isn't any improvement, IMO.  The files would still be way-too-large
> > > for people to deal with and it doesn't make it any easier to index the
> > > contents.  One message per file is the only scheme that addresses these
> > > problems.
> >
> > Except then we'll almost certainly run out of inodes in the target
> > directory sooner rather than later.  Just judging by the mailstats
> > output and calculating about 90 days ahead, the math does not look
> > promising. :-)
> >
> > Sigh.  Face it, we need a database. :-)
> >
> >                                         Jordan
> >
>
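
On the "how could we implement something like this?" question above,
here is a minimal sketch and nothing more, assuming SQLite through
Python's standard sqlite3 module. Nothing in the thread picks a
database; the table layout and the names open_archive, store, thread,
search and expire are invented for illustration. It keeps one row per
message (so no inode is spent per message), follows In-Reply-To to view
a thread, and ages out old mail by date. It assumes every message has a
Message-ID and a parseable Date header. Mail whose In-Reply-To is not a
Message-ID (as in the header of this very message) would not be threaded
by it.

```python
import sqlite3
from email import message_from_string
from email.utils import parsedate_to_datetime

# Hypothetical layout: one row per message instead of one file per message.
SCHEMA = """
CREATE TABLE IF NOT EXISTS messages (
    message_id  TEXT PRIMARY KEY,
    in_reply_to TEXT,               -- parent Message-ID, NULL for a thread root
    list_name   TEXT NOT NULL,
    sender      TEXT,
    subject     TEXT,
    sent_at     INTEGER NOT NULL,   -- seconds since the Unix epoch
    body        TEXT
);
CREATE INDEX IF NOT EXISTS idx_parent ON messages(in_reply_to);
CREATE INDEX IF NOT EXISTS idx_sent   ON messages(sent_at);
"""


def open_archive(path=":memory:"):
    """Open (or create) the archive database and make sure the table exists."""
    db = sqlite3.connect(path)
    db.executescript(SCHEMA)
    return db


def store(db, list_name, raw):
    """Parse one raw message (assumed non-multipart) and insert it once."""
    msg = message_from_string(raw)
    sent_at = int(parsedate_to_datetime(msg["Date"]).timestamp())
    db.execute(
        "INSERT OR IGNORE INTO messages VALUES (?, ?, ?, ?, ?, ?, ?)",
        (msg["Message-ID"], msg["In-Reply-To"], list_name,
         msg["From"], msg["Subject"], sent_at, msg.get_payload()),
    )
    db.commit()


def thread(db, root_id):
    """Return (message_id, depth, subject) for a root and all of its replies."""
    return db.execute(
        """
        WITH RECURSIVE t(message_id, depth) AS (
            SELECT message_id, 0 FROM messages WHERE message_id = ?
            UNION ALL
            SELECT m.message_id, t.depth + 1
            FROM messages m JOIN t ON m.in_reply_to = t.message_id
            WHERE t.depth < 100     -- guard against reply loops in bad data
        )
        SELECT m.message_id, t.depth, m.subject
        FROM t JOIN messages m ON m.message_id = t.message_id
        ORDER BY m.sent_at
        """,
        (root_id,),
    ).fetchall()


def search(db, text):
    """Naive substring search over message bodies, oldest first."""
    return db.execute(
        "SELECT message_id, subject FROM messages "
        "WHERE body LIKE ? ORDER BY sent_at",
        (f"%{text}%",),
    ).fetchall()


def expire(db, cutoff):
    """Delete messages sent before `cutoff` (epoch seconds); return the count."""
    cur = db.execute("DELETE FROM messages WHERE sent_at < ?", (cutoff,))
    db.commit()
    return cur.rowcount
```

The LIKE match is only a stand-in to show the shape of a search query;
at the volume discussed here it would scan every body, so a real setup
would want proper text indexing. expire() simply deletes; archiving
older mail elsewhere instead would be a copy followed by the same
delete.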