Date: Fri, 11 May 2001 20:56:10 -0500
From: Mike Meyer <mwm@mired.org>
To: Nathan@Vidican.com
Cc: questions@freebsd.org, "Ted Mittelstaedt" <tedm@toybox.placo.com>
Subject: RE: email to SQL
Message-ID: <15100.38970.996390.52851@guru.mired.org>
In-Reply-To: <68112128@toto.iv>

Ted Mittelstaedt <tedm@toybox.placo.com> types:
> Somewhere there are patches to qmail that make it use a SQL
> server. You might look at that, maybe there is something you
> can use there.

There's one of those in the ports (qmail-mysql). It, like most such things, just uses the SQL server for admin data (user info, aliases, etc.). Delivering to a mailbox is a different critter entirely. There may be hacks that actually deliver to an SQL database, but I wouldn't bet on it - doing that right is a nasty problem.

> >From: Nathan Vidican
> >Does anyone happen to know of, (or have), some small utility which will
> >archive email into an SQL table? I'm looking for something that will
> >retrieve the messages either via direct access to the mail spool, or via
> >pop3. I know that I could probably just ripoff a portion of some webmail app
> >to accomplish this, but to be optimistic I figured someone might have
> >already done so, and would be willing to share their code. I would prefer to
> >use C, but PERL will work too.

Well, if you were willing to use Python, there's a POP3 client class and a mail parser class in the standard library. That's 90% of the work; all you have to do is write the message object's attributes into your database.

Last time a client needed this kind of thing from me, I wrote a sendmail delivery agent and all mail to the domain of interest was handed to it by sendmail. Worked like a charm.

> > I will require the code so-as to allow for an indexing of the emails
> >from within a website. I want the website to be able to search for messages
> >based on content and subject. I would prefer not to keep the emails in an
> >archive file similar to the mail spool format because of performance
> >reasons. I figure running an SQL query once the system has 10,000+ emails in
> >it will be much faster than trying to search a couple hundred thousand lines
> >of a text file.

I think you've misfigured.
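
A side note on the Python route mentioned above: the following is only a minimal sketch of that idea, not code from the original message. It uses the standard library's `poplib` and `email` modules and, as an assumption made purely for illustration, SQLite via `sqlite3` as the SQL database. The table layout and the names `open_archive`, `store_message`, and `fetch_pop3` are hypothetical.

```python
import email
import poplib
import sqlite3
from email import policy


def open_archive(path=":memory:"):
    """Open (or create) the archive database. The schema is an assumption."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS messages ("
        " message_id TEXT PRIMARY KEY, sender TEXT, subject TEXT,"
        " date TEXT, body TEXT)"
    )
    return conn


def store_message(conn, raw):
    """Parse one raw RFC 822 message (bytes) and insert it as a row."""
    msg = email.message_from_bytes(raw, policy=policy.default)

    def header(name):
        # Missing headers become empty strings.
        return str(msg.get(name, ""))

    # Prefer the text/plain part; messages without one get an empty body.
    part = msg.get_body(preferencelist=("plain",))
    body = part.get_content() if part is not None else ""
    conn.execute(
        "INSERT OR IGNORE INTO messages VALUES (?, ?, ?, ?, ?)",
        (header("Message-ID"), header("From"), header("Subject"),
         header("Date"), body),
    )
    conn.commit()


def fetch_pop3(host, user, password):
    """Yield every message in a POP3 mailbox as raw bytes (needs a server)."""
    box = poplib.POP3_SSL(host)
    try:
        box.user(user)
        box.pass_(password)
        count, _size = box.stat()
        for i in range(1, count + 1):
            _resp, lines, _octets = box.retr(i)
            yield b"\r\n".join(lines)
    finally:
        box.quit()
```

Typical use would be `for raw in fetch_pop3(host, user, password): store_message(conn, raw)`. A sendmail delivery agent like the one described above could call `store_message` on the message it receives on standard input instead of polling a mailbox.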

The amount of time it takes to search a text is pretty much determined by the search algorithm, not whether the text is stored in an SQL server or a flat file. In fact, assuming the same search algorithm is being used, the flat text file should be faster. mmap it in and you've got it all to search. Since your text will be scattered across multiple database rows, it will take more than that for the SQL server to load it before it can start searching.

The best text search algorithm is to prepare an index of the stuff before you need to search it. It's possible to store index information in a database and search it efficiently, but I'm not sure that's the most efficient tack to take. Datablades - if MySQL has those, *please* let me know! - might be useful here, but I've not had a chance to play with them. Someone who's more current on the issue may suggest something else.

Unless your requirements are strange, your best bet is probably using a text search tool of some kind, preferably one that understands text that's structured like mail messages. The best success I've had is with WAIS (there are two versions in the ports), and your database seems to be small enough for it to handle.

Drop me a note off-list if you want to talk about it some more.

	<mike
--
Mike Meyer <mwm@mired.org>			http://www.mired.org/home/mwm/
Independent WWW/Perforce/FreeBSD/Unix consultant, email for more information.

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-questions" in the body of the message
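
To make the two search approaches from the message concrete, here is a small hedged sketch (not the poster's code, and not how WAIS works internally). `grep_mmap` does the brute-force scan of an mmap'ed flat file; `build_index` and `search` show the "prepare an index first" idea as a toy in-memory inverted index. All function names and the word-splitting rule are assumptions chosen for illustration.

```python
import mmap
import re
from collections import defaultdict

WORD = re.compile(r"[a-z0-9]+")  # assumed tokenization: lowercase alphanumerics


def grep_mmap(path, needle):
    """Return the byte offset of every occurrence of needle (bytes) in a file.

    The file must be non-empty; mmap refuses to map a zero-length file.
    """
    hits = []
    with open(path, "rb") as f, \
            mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
        pos = m.find(needle)
        while pos != -1:
            hits.append(pos)
            pos = m.find(needle, pos + 1)
    return hits


def build_index(messages):
    """Map each word to the set of message ids whose text contains it.

    `messages` is a dict of {message_id: text}.
    """
    index = defaultdict(set)
    for msg_id, text in messages.items():
        for word in WORD.findall(text.lower()):
            index[word].add(msg_id)
    return index


def search(index, query):
    """Return ids of messages containing every word in the query."""
    words = WORD.findall(query.lower())
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result
```

The scan costs time proportional to the size of the archive on every query, while the index pays that cost once up front and then answers each query with a few set lookups. That trade-off is the reason for indexing ahead of time, whether the index lives in a database or in a dedicated text search tool.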