Date: Mon, 11 Apr 2005 02:06:42 +0200
From: Mario Hoerich <lists@MHoerich.de>
To: Doug Lee <dgl@dlee.org>, Chuck Swiger <cswiger@mac.com>, freebsd-questions@freebsd.org
Subject: Re: Anyone ever consider a filesystem served by MySQL for mail folders?
Message-ID: <20050411000641.GB620@Pandora.MHoerich.de>
In-Reply-To: <20050409223001.GA58918@kirk.dlee.org>
References: <20050409203727.GI4670@kirk.dlee.org> <42584A22.9010209@mac.com> <20050409223001.GA58918@kirk.dlee.org>

# Doug Lee:
[ fixed quote-levels ]
> On Sat, Apr 09, 2005 at 05:33:22PM -0400, Chuck Swiger wrote:
[ mail storage backed by DB ]
> >
> > The advantage is that users get fancy searching.
> >
> > The disadvantage is that you need to provide around 4 times as much disk
> > space for a DB-based mailstore as you would for a normal mbox/maildir style
> > representation, you need to provide a lot more server horsepower, you need
> > to continuously maintain and purge old mail from the database, and you end
> > up with your mail buried in database tables, so heaven help you if the
> > database becomes inconsistent and you need to recover.
Whereas you can repair mbox-files with your favorite editor
and employ pretty much the same level of fancy searching
with a couple of scripts.
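
As a rough illustration of the "couple of scripts" idea, here is a minimal Python sketch that searches one header of every message in an mbox file and writes the hits back out in mbox form. The names search_mbox and print_as_mbox are made up for this example, and it assumes Python's standard mailbox module is acceptable; it does not escape body lines that start with "From ".

```python
import mailbox
import re
import sys


def search_mbox(mbox_path, header, pattern):
    """Yield the messages whose given header matches a regular expression."""
    regex = re.compile(pattern, re.IGNORECASE)
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for msg in box:
            # str() guards against header objects for odd encodings.
            if regex.search(str(msg.get(header, ""))):
                yield msg
    finally:
        box.close()


def print_as_mbox(messages, out=sys.stdout):
    """Write messages to 'out' as a simple mbox stream."""
    for msg in messages:
        out.write("From " + msg.get_from() + "\n")
        out.write(msg.as_string())
        out.write("\n")
```

Something like print_as_mbox(search_mbox("archive.mbox", "Subject", "mysql")) would then dump all matching mails to stdout, where "archive.mbox" is just a placeholder path.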
> But as for increased storage requirements, I've always wondered how
> much could be saved by an intelligent method of behind-the-scenes
> handling of quoting among messages in a thread. Goodness knows half
> the mail on a lot of lists, and even in a lot of personal mail
> streams, is simply copies of some or all of other messages, perhaps
> shifted over by quote signs like `>' etc. Seems to me a system could
> be devised to store directions for rebuilding a message instead of the
> message itself with all quoting intact.
Basically, you could just kill any quotechar, trim headers and
store the threads as incremental diffs. You could squeeze redundancy
a bit more, but then you'll cry if some bug decides to eat a byte
or two. ;)
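
A minimal sketch of that idea in Python, assuming the standard difflib module and treating each message body as a list of lines. The function names and the copy/insert delta format are invented for this example. Note that stripping the quote characters is lossy: the rebuilt reply comes back without its '>' markers.

```python
import difflib
import re

# Matches one or more leading quote markers such as "> " or "> > ".
QUOTE_RE = re.compile(r"^(?:>\s?)+")


def strip_quotes(lines):
    """Remove leading quote markers from every line."""
    return [QUOTE_RE.sub("", line) for line in lines]


def make_delta(parent, child):
    """Encode 'child' as copy/insert instructions relative to 'parent'."""
    matcher = difflib.SequenceMatcher(a=parent, b=child, autojunk=False)
    delta = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Only store a reference to the parent's lines.
            delta.append(("copy", i1, i2))
        elif tag in ("replace", "insert"):
            # Store the genuinely new lines verbatim.
            delta.append(("insert", child[j1:j2]))
        # "delete" needs no instruction: those parent lines are not copied.
    return delta


def apply_delta(parent, delta):
    """Rebuild the child message from the parent and its delta."""
    out = []
    for op in delta:
        if op[0] == "copy":
            out.extend(parent[op[1]:op[2]])
        else:
            out.extend(op[1])
    return out
```

For a reply that quotes two lines and adds two of its own, the delta holds only the two new lines plus two small copy instructions, which is where the space saving would come from.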
> but I wouldn't be surprised if it could reverse the
> increased storage requirements you mention.
Probably.
What's the gain in all that, though?
The mbox-format is simple enough[1]; you can just build something
to suit your needs in your favorite scripting language.
Personally, I'd just build three scripts for that:

- The first to interactively insert some headers from within my
  MUA (mutt, in this instance), i.e. 'X-Archive-Keywords: ' and
  'X-Archive-Location: '.
- The second to (as a cron-job)
    i) extract mails from mbox files,
   ii) move them into some kind of archive directory tree (based
       on the above -location-header, i.e. $TREEBASE/$LOCATION)
  iii) and store interesting headers inside a DB.
- The third for searching and cat(1)ing results to stdout
  (which in turn is nothing but a new mbox-file).

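
The second script could be sketched roughly like this in Python. Everything here is an assumption made for illustration: the function name archive_mbox, the "archive.mbox" file name inside each location directory, the "unsorted" fallback, the SQLite table layout, and the choice of SQLite itself. The sketch copies messages into the tree and leaves the source mbox untouched, so a real cron-job would still have to delete what it has archived.

```python
import mailbox
import os
import sqlite3


def archive_mbox(mbox_path, tree_base, db_path):
    """Copy mails into tree_base/<location> and index their headers."""
    db = sqlite3.connect(db_path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS mails "
        "(message_id TEXT, subject TEXT, keywords TEXT, path TEXT)"
    )
    src = mailbox.mbox(mbox_path, create=False)
    count = 0
    for msg in src:
        location = str(msg.get("X-Archive-Location", "unsorted")).strip().strip("/")
        # Refuse empty locations and attempts to climb out of the tree.
        if not location or ".." in location.split("/"):
            location = "unsorted"
        dest_dir = os.path.join(tree_base, location)
        os.makedirs(dest_dir, exist_ok=True)
        dest_path = os.path.join(dest_dir, "archive.mbox")
        dest = mailbox.mbox(dest_path)  # created on first use
        dest.add(msg)
        dest.close()  # close() flushes pending changes to disk
        db.execute(
            "INSERT INTO mails VALUES (?, ?, ?, ?)",
            (
                str(msg.get("Message-ID", "")),
                str(msg.get("Subject", "")),
                str(msg.get("X-Archive-Keywords", "")),
                dest_path,
            ),
        )
        count += 1
    db.commit()
    db.close()
    src.close()
    return count
```

The third script would then only need a SELECT on the keywords column to find the right archive files.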
The hard part about this is integrating it into $MUA, but
there might be some hook around for that. Actually looks
like a perfect mini-project to learn a new language with. ;)
Cheers.
Mario
[1]:
IIRC: the header of a mail starts with /^From / and
terminates with /^$/ and the other way around for the
body of a mail. Can't get more simple than that.
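
To show how little is needed, here is a naive Python splitter that follows exactly the rule above and nothing more. The name split_mbox is invented for this sketch, and it ignores real-world details such as body lines that start with "From " (which mbox writers normally escape as ">From ").

```python
def split_mbox(text):
    """Split mbox text into a list of (header_lines, body_lines) pairs."""
    messages = []
    headers, body, in_headers = None, None, False
    for line in text.splitlines():
        if line.startswith("From "):
            # A "From " line closes the previous mail and opens a new one.
            if headers is not None:
                messages.append((headers, body))
            headers, body, in_headers = [line], [], True
        elif headers is None:
            continue  # skip anything before the first "From " line
        elif in_headers:
            if line == "":
                in_headers = False  # the empty line ends the header
            else:
                headers.append(line)
        else:
            body.append(line)
    if headers is not None:
        messages.append((headers, body))
    return messages
```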
