From: Mike Meyer
Date: Fri, 28 Sep 2001 13:28:48 -0500
To: Cliff Sarginson, swear@blarg.net (Gary W. Swearingen), "Brian"
Cc: questions@freebsd.org
Subject: Re: Looking for Mr good mail archiver port...[OT ?]
Message-ID: <15284.49504.73445.529806@guru.mired.org>
In-Reply-To: <108871926@toto.iv>

Cliff Sarginson types:
> Hello,
> Yesterday I cleaned up a lot of my mail directories and
> archived a lot of the messages for future use. What I want
> to do is to create my own personal database of information from
> the various mailing lists I am on. I also tend to eliminate
> the messages that say "thanks, I will try that" etc.. (not because
> there is anything wrong with that, I am just trying to keep
> concrete information).
> Now I am looking for a way of handling this ever growing
> body of information. Rather than having to load mailboxes
> and search etc.
> For example say I am looking for a piece of information on
> "FreeBSD and Massively Parallel Processors for use on the
> Moon" I want to do fuzzy searches on the mail files and come up
> with an indexed display of relevant messages. Bearing in mind
> that some of my saved archives come from different mailing lists.

WAIS is a pretty good one. Keep reading for more comments.

Gary W. Swearingen types:
> If you can't find a tool that meets your needs, you might try learning
> XEmacs with Gnus. I don't see a lot of searching support -- there's
> a function for searching the messages or headers of the current folder.

If you're going to try XEmacs, I'd recommend VM instead of Gnus. VM
allows you to construct ad-hoc searches based on various headers, and
will either mark the results or put them in a virtual folder. If you
have complex searches you perform on a regular basis, you can create a
virtual folder that will track that search for you.

Brian types:
> I have heard of people indexing data with glimpse I believe, to improve
> future searchability.

WAIS is a better tool for this kind of thing, and I've used it for
that before. What makes it better is that it understands the structure
of mail messages and mailboxes, so you can do queries on it as mail,
instead of as flat text files. The net/zebra-server port provides the
same kind of thing, only it uses the standards that grew out of WAIS,
so you can use other clients to search the database.

If you're planning on offering a web-based front end, there's a WAIS
module for Apache, and browsers used to be able to do WAIS directly -
some may still be able to do that - though the interface pretty much
sucked.

The WAIS implementations in the ports blow up if you give them too
much data. This may not be a problem if you're going to keep separate
indices for each list, and since WAIS lets you search multiple
databases with a single query, splitting things up that way shouldn't
cost you much. I haven't really looked at zebra-server yet.

Cliff Sarginson types:
> Thanks for the replies.
> I experimented with "grepmail" which has a neat front-end
> someone wrote for use with mutt.
> It is very good.
> I plan to see how fast it can handle huge compressed
> mailboxes in the next day or so..

Once WAIS started blowing up on my personal folder, I found that
standard Unix tools work fairly well. I sort the mail out into
directories by month, one message per file. So doing things like "look
for a message in June or July about SCSI disks" turns into:

    find 2001-0[67] -type f | xargs grep -i "^subject:.*scsi"

Adding the qualifier "From joe" turns it into:

    find 2001-0[67] -type f | xargs grep -il "^from:.*joe" | xargs grep -i "^subject:.*scsi"

Reading the messages instead of looking at the list of names/subjects
is:

    find 2001-0[67] -type f | xargs grep -il "^from:.*joe" | xargs grep -il "^subject:.*scsi" | xargs more

I'm a CLI kind of guy, so the above doesn't bother me much. Doing a Tk
or web front end for this should be pretty simple.

http://www.mired.org/home/mwm/
Q: How do you make the gods laugh?
A: Tell them your plans.
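
The find/grep pipelines above could be wrapped in a small shell
function. What follows is only a sketch, assuming the layout described
above (one message per file, in per-month directories). The name
`mailfind` and its argument order are hypothetical. It swaps the xargs
pipeline for find's -exec primary used as a filter, so file names
containing spaces are handled, at the cost of one grep process per
file. Like the pipelines above, it matches any line in the file that
looks like the header, not only lines in the header block.

```shell
# Hypothetical helper, not part of any port.
# Usage: mailfind FROM_PATTERN SUBJECT_PATTERN DIR...
# Prints the names of files that contain both a matching From: line
# and a matching Subject: line (case-insensitive).
mailfind() {
    mf_from=$1
    mf_subject=$2
    shift 2
    # Each -exec acts as a test: find only goes on to the next primary
    # when grep -q exits 0, i.e. when the pattern was found in the file.
    find "$@" -type f \
        -exec grep -qi "^from:.*$mf_from" {} \; \
        -exec grep -qi "^subject:.*$mf_subject" {} \; \
        -print
}
```

With that defined, the "From joe about SCSI" search would be
`mailfind joe scsi 2001-0[67]`, which lists the matching files one per
line.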