Date: Mon, 11 Apr 2005 02:06:42 +0200
From: Mario Hoerich <lists@MHoerich.de>
To: Doug Lee <dgl@dlee.org>, Chuck Swiger <cswiger@mac.com>, freebsd-questions@freebsd.org
Subject: Re: Anyone ever consider a filesystem served by MySQL for mail folders?
Message-ID: <20050411000641.GB620@Pandora.MHoerich.de>
In-Reply-To: <20050409223001.GA58918@kirk.dlee.org>
References: <20050409203727.GI4670@kirk.dlee.org> <42584A22.9010209@mac.com> <20050409223001.GA58918@kirk.dlee.org>
# Doug Lee:
[ fixed quote-levels ]
> On Sat, Apr 09, 2005 at 05:33:22PM -0400, Chuck Swiger wrote:
[ mail storage backed by DB ]
> >
> > The advantage is that users gets fancy searching.
> >
> > The disadvantage is that you need to provide around 4 times as much disk
> > space for a DB-based mailstore as you would for a normal mbox/maildir style
> > representation, you need to provide a lot more server horsepower, you need
> > to continuously maintain and purge old mail from the database, and you end
> > up with your mail buried in database tables, so heaven help you if the
> > database becomes inconsistent and you need to recover.

Whereas you can repair mbox-files with your favorite editor
and employ pretty much the same level of fancy searching with
a couple of scripts.

> But as for increased storage requirements, I've always wondered how
> much could be saved by an intelligent method of behind-the-scenes
> handling of quoting among messages in a thread. Goodness knows half
> the mail on a lot of lists, and even in a lot of personal mail
> streams, is simply copies of some or all of other messages, perhaps
> shifted over by quote signs like `>' etc. Seems to me a system could
> be devised to store directions for rebuilding a message instead of the
> message itself with all quoting intact.

Basically, you could just kill any quotechar, trim headers and
store the threads as incremental diffs. You could squeeze
redundancy a bit more, but then you'll cry if some bug decides
to eat a byte or two. ;)

> but I wouldn't be surprised if it could reverse the
> increased storage requirements you mention.

Probably. What's the gain in all that, though? The mbox-format
is simple enough[1], you can just build something to suit your
needs in your favorite scripting language.

Personally, I'd just build three scripts for that:

  - The first to interactively insert some headers from within
    my MUA (mutt, in this instance), i.e. 'X-Archive-Keywords: '
    and 'X-Archive-Location: '.
  - The second to (as a cron-job)
      i)   extract mails from mbox files
      ii)  move them into some kind of archive directory tree
           (based on the above -location-header, i.e.
           $TREEBASE/$LOCATION) and
      iii) store interesting headers inside a DB.
  - The third for searching and cat(1)ing results to stdout
    (which in turn is nothing but a new mbox-file).

The hard part about this is integrating it into $MUA, but
there might be some hook around for that. Actually looks
like a perfect mini-project to learn a new language with. ;)

Cheers.
  Mario

[1]: IIRC: the header of a mail starts with /^From / and
     terminates with /^$/ and the other way around for the
     body of a mail. Can't get more simple than that.
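
To make [1] and step i) of the second script a bit more concrete, here
is a rough Python sketch of the naive splitting rule: a line matching
/^From / opens a header block, the first empty line closes it, and
everything up to the next /^From / is body. The function name
split_mbox and the lower-cased header dict are made up for
illustration only. The sketch also assumes that body lines starting
with "From " were already escaped as ">From " by whatever wrote the
mbox. For anything serious, Python's standard mailbox module is
probably the safer choice.

```python
from typing import Dict, List, Tuple

# (headers, body); header names are lower-cased for easy lookup.
Message = Tuple[Dict[str, str], str]


def split_mbox(text: str) -> List[Message]:
    """Split mbox text using the naive rule from footnote [1].

    Assumption: no unescaped body line starts with "From ".
    """
    messages: List[Message] = []
    headers: Dict[str, str] = {}
    body: List[str] = []
    started = False      # have we seen the first "From " line yet?
    in_header = False    # are we between "From " and the empty line?
    last_key = None      # most recent header, for folded lines

    def flush() -> None:
        # Drop the blank separator line(s) at the end of the body.
        messages.append((headers, "\n".join(body).rstrip("\n")))

    for line in text.splitlines():
        if line.startswith("From "):
            if started:
                flush()
            headers, body = {}, []
            started, in_header, last_key = True, True, None
        elif not started:
            continue  # ignore junk before the first message
        elif in_header:
            if line == "":
                in_header = False  # /^$/ ends the header block
            elif line[0] in " \t" and last_key is not None:
                # Folded header line: append to the previous header.
                headers[last_key] += " " + line.strip()
            elif ":" in line:
                key, _, value = line.partition(":")
                last_key = key.strip().lower()
                headers[last_key] = value.strip()
        else:
            body.append(line)

    if started:
        flush()
    return messages
```

The cron-job could then read headers["x-archive-location"] of each
message to decide where under $TREEBASE it goes, and feed the other
interesting headers to the DB.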