From owner-freebsd-hackers Tue Nov 19 14:10:17 1996
Return-Path: owner-hackers
Received: (from root@localhost)
          by freefall.freebsd.org (8.7.5/8.7.3) id OAA10205
          for hackers-outgoing; Tue, 19 Nov 1996 14:10:17 -0800 (PST)
Received: from brasil.moneng.mei.com (brasil.moneng.mei.com [151.186.109.160])
          by freefall.freebsd.org (8.7.5/8.7.3) with ESMTP id OAA10183
          for ; Tue, 19 Nov 1996 14:10:06 -0800 (PST)
Received: (from jgreco@localhost)
          by brasil.moneng.mei.com (8.7.Beta.1/8.7.Beta.1) id QAA05968;
          Tue, 19 Nov 1996 16:07:41 -0600
From: Joe Greco
Message-Id: <199611192207.QAA05968@brasil.moneng.mei.com>
Subject: Re: Announce: Alternative Mail Archive
To: jfieber@indiana.edu (John Fieber)
Date: Tue, 19 Nov 1996 16:07:41 -0600 (CST)
Cc: mark@quickweb.com, hackers@freebsd.org
In-Reply-To: from "John Fieber" at Nov 19, 96 00:09:37 am
X-Mailer: ELM [version 2.4 PL24]
Content-Type: text
Sender: owner-hackers@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

Hi,

> First, browsing, as hypermail sets it up, is of very limited
> utility for finding anything in list archives of FreeBSD scale
> (currently about 300 megabytes and growing fast). Browsing is
> much better suited as a second step after an initial search has
> identified a few key messages. Using those keys, it is then
> useful to retrieve the thread context. Being able to re-sort a
> chunk of message by date, subject, author is useful, but only if
> the searcher has control over what is in the chunk. Hypermail
> just blindly chops things up into time segments and the chunk
> composition is static. The proper place for chunk sorting is on
> a set of retrieved messages.

That is probably true, but (at least when I am searching the lists) I
usually have some idea what time frame I am interested in. I am usually
looking to quote something back at somebody, etc. It is very frustrating
to type in a bunch of terms and still have it hit a hundred messages,
half of which are from 1995.

Often I would much rather just see a thread of messages, and look
through them. A lengthy list, of course, is unmanageable and unwieldy.
I was looking through the gated-people lists the other evening and
swearing that it took five to ten seconds every time I read a message
and then hit "Back" to return to the zillions-of-messages-long list.

> The problem is that good IR systems are proprietary, and free IR
> systems are crap. Of course, I've spent quite a lot of time
> reading and writing about IR theory, so I'm pretty cynical about
> the whole field. (Since this is the direction of my Ph.D.
> research, maybe it isn't such a good thing?)

Write a good free IR system? :-)

In general I am frustrated with the current search engine, and often I
would rather go to the raw list archives and search backwards for a
keyword or two, because that way at least I am assured of getting the
date relevance I usually desire.

The size of the current list archives is rather hefty...

19954856 Nov 19 13:32 freebsd-bugs
15828458 Nov  4 14:26 freebsd-commit
34684292 Nov 19 11:47 freebsd-current
76949942 Nov 19 13:31 freebsd-hackers
 6535669 Nov 19 11:25 freebsd-isp
14245498 Nov 19 12:50 freebsd-ports
72657153 Nov 19 13:07 freebsd-questions

That is a LOT of data to look through, and it dates back to early 1995.

... JG
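
For what it is worth, the "go to the raw archives and search backwards"
approach described above might look roughly like the sketch below. It
assumes each archive is a plain mbox-style text file with messages
stored oldest first and separated by "From " envelope lines; the
function names, the since_year cutoff, and the limit parameter are all
made up for illustration and are not part of any existing archive tool.

```python
import re
from email import message_from_string
from email.utils import parsedate_to_datetime

# Envelope line that starts each message in an mbox file (assumed layout:
# "From <sender> <ctime-style date ending in a 4-digit year>").
FROM_LINE = re.compile(r"^From \S+ .*\d{4}$", re.MULTILINE)


def split_mbox(text):
    """Split raw mbox text into a list of message strings, in file order."""
    starts = [m.start() for m in FROM_LINE.finditer(text)]
    ends = starts[1:] + [len(text)]
    return [text[a:b] for a, b in zip(starts, ends)]


def search_backwards(text, keywords, since_year=None, limit=10):
    """Return (date, subject) for matching messages, newest first.

    A message matches when every keyword appears somewhere in its
    headers or body (case-insensitive). Messages dated before
    since_year, or with a missing/unparseable Date header, are skipped.
    """
    hits = []
    for raw in reversed(split_mbox(text)):
        # Drop the envelope line so the rest parses as headers + body.
        _, _, rest = raw.partition("\n")
        msg = message_from_string(rest)
        try:
            date = parsedate_to_datetime(msg["Date"])
        except (TypeError, ValueError):
            continue
        if since_year is not None and date.year < since_year:
            continue
        haystack = raw.lower()
        if all(k.lower() in haystack for k in keywords):
            hits.append((date, msg["Subject"]))
            if len(hits) >= limit:
                break
    return hits
```

Walking the file from the end means the most recent hits come out
first, and the limit stops the scan before it wades back into 1995.
Reading a 70+ megabyte archive into memory in one go is the lazy part
of this sketch; a real tool would want to read the file in chunks from
the tail.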