Date:      Fri, 28 Sep 2001 13:28:48 -0500
From:      Mike Meyer <mwm@mired.org>
To:        Cliff Sarginson <cliff@raggedclown.net>, swear@blarg.net (Gary W. Swearingen), "Brian" <bri@sonicboom.org>
Cc:        questions@freebsd.org
Subject:   Re: Looking for Mr good mail archiver port...[OT ?]
Message-ID:  <15284.49504.73445.529806@guru.mired.org>
In-Reply-To: <108871926@toto.iv>

Cliff Sarginson <cliff@raggedclown.net> types:
> Hello,
> Yesterday I cleaned up a lot of my mail directories and
> archived a lot of the messages for future use. What I want
> to do is to create my own personal database of information from
> the various mailing lists I am on. I also tend to eliminate
> the messages that say "thanks, I will try that" etc.. (not because
> there is anything wrong with that, I am just trying to keep
> concrete information).
> Now I am looking for a way of handling this ever growing
> body of information. Rather than having to load mailboxes
> and search etc. 
> For example say I am looking for a piece of information on
> "FreeBSD and Massively Parallel Processors for use on the
> Moon" I want to do fuzzy searches on the mail files and come up
> with an indexed display of relevant messages. Bearing in mind
> that some of my saved archives come from different mailing lists.

WAIS is a pretty good one. Keep reading for more comments.

Gary W. Swearingen <swear@blarg.net> types:
> If you can't find a tool that meets your needs, you might try learning
> XEmacs with Gnus.  I don't see a lot of searching support -- there's
> a function for searching the messages or headers of the current folder.

If you're going to try XEmacs, I'd recommend VM instead of Gnus. VM
allows you to construct ad-hoc searches based on various headers, and
will either mark the results or put them in a virtual folder. If you
have complex searches you perform on a regular basis, you can create a
virtual folder that will track that search for you.

Brian <bri@sonicboom.org> types:
> I have heard of people indexing data with glimpse I believe, to improve
> future searchability.

WAIS is a better tool for this kind of thing, and I've used it for
that before. What makes it better is that it understands the structure
of mail messages and mailboxes, so you can do queries on it as mail,
instead of as flat text files. The net/zebra-server port provides the
same kind of thing, only it uses the standards that grew out of WAIS
so you can use other clients to search the database.

If you're planning on offering a web-based front end, there's a WAIS
module for Apache. Browsers used to be able to do WAIS directly, and
some may still be able to, though the interface pretty much sucked.

The WAIS implementations in the ports blow up if you give them too much
data. That may not matter if you keep a separate index for each list:
since WAIS lets you search multiple databases with a single query,
splitting the data up shouldn't be a serious problem.

I haven't really looked at zebra-server yet.

Cliff Sarginson <cliff@raggedclown.net> types:
> Thanks for the replies.
> I experimented with "grepmail" which has a neat front-end
> someone wrote for use with mutt.
> It is very good.
> I plan to see how fast it can handle huge compressed
> mailboxes in the next day or so..

Once WAIS started blowing up on my personal folder, I found that
standard Unix tools work fairly well. I sort the mail out into
directories by month, one message per file. So doing things like "look
for a message in June or July about SCSI disks" turns into:

    find 2001-0[67] -type f | xargs grep -i "^subject:.*scsi"

Adding the qualifier "From joe" turns it into:

    find 2001-0[67] -type f | xargs grep -il "^from:.*joe" |
	xargs grep -i "^subject:.*scsi"

Reading the messages instead of looking at the list of names/subjects is:

    find 2001-0[67] -type f | xargs grep -il "^from:.*joe" |
	xargs grep -il "^subject:.*scsi" | xargs more
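
If you run searches like these a lot, the pipeline can be wrapped in a
small shell function. What follows is only a sketch: the name mailgrep
and its argument order are made up for illustration, and it assumes a
find, xargs and grep that understand -print0, -0 and --null (both the
GNU and the FreeBSD versions do). Like the commands above, it greps the
whole file, so a body line that starts with "From:" will also match.

```shell
# mailgrep: hypothetical helper, not a standard tool.
# Usage: mailgrep FROM_PATTERN SUBJECT_PATTERN DIR...
# Prints the names of the files under DIR... that contain a line
# matching "^from:.*FROM_PATTERN" and a line matching
# "^subject:.*SUBJECT_PATTERN", both case-insensitively.
mailgrep() {
    from=$1
    subject=$2
    shift 2
    # NUL-separated file names survive spaces and quotes in paths.
    find "$@" -type f -print0 |
        xargs -0 grep -il --null -e "^from:.*$from" |
        xargs -0 grep -il -e "^subject:.*$subject"
}
```

With that defined, the "From joe about SCSI" search becomes
mailgrep joe scsi 2001-0[67].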

I'm a CLI kind of guy, so the above doesn't bother me much. Doing a Tk
or web front end for this should be pretty simple.
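
The searches above depend on the mail already being filed one message
per file in per-month directories. One way that filing step could be
done is sketched below. This is an assumption about the layout, not a
description of the actual setup: the function name file_by_month, the
"undated" fallback directory, and the choice to copy rather than move
are all invented here, and it only understands RFC 2822 style Date:
headers such as "Fri, 28 Sep 2001 13:28:48 -0500".

```shell
# file_by_month: hypothetical helper, not a standard tool.
# Usage: file_by_month ARCHIVE_DIR MESSAGE_FILE...
# Copies each message file into ARCHIVE_DIR/YYYY-MM/, using the first
# Date: header found before the blank line that ends the headers.
file_by_month() {
    archive=$1
    shift
    for msg in "$@"; do
        month=$(awk '
            tolower($1) == "date:" {
                # Skip the optional "Fri," day-of-week field.
                i = ($2 ~ /,$/) ? 3 : 2
                p = index("JanFebMarAprMayJunJulAugSepOctNovDec", $(i + 1))
                if (p == 0) exit          # unrecognised month name
                printf "%s-%02d\n", $(i + 2), (p + 2) / 3
                exit
            }
            /^$/ { exit }                 # end of headers, no Date: seen
        ' "$msg")
        # Messages without a usable Date: header go to "undated".
        [ -n "$month" ] || month=undated
        mkdir -p "$archive/$month"
        cp "$msg" "$archive/$month/"
    done
}
```

Something like file_by_month ~/Mail/archive incoming/* would then
populate directories such as 2001-06 and 2001-07 that the find commands
can be pointed at.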

	<mike
--
Mike Meyer <mwm@mired.org>			http://www.mired.org/home/mwm/
Q: How do you make the gods laugh?		A: Tell them your plans.

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-questions" in the body of the message



