From: Eric Anderson
Date: Fri, 14 Jun 2002 17:34:50 -0500
To: Nik Clayton
Cc: doc@freebsd.org
Subject: Re: Search engine enhancements
Message-ID: <3D0A6F8A.3F56245B@centtech.com>
References: <3D08A7DB.8BE28A90@centtech.com> <20020614085133.V39690@canyon.nothing-going-on.org>

These all sound good, and most I already had on my "feature" list. The
whole engine will most likely be in Perl, and NO SQL db backend.

I'll try a cvsup of www this weekend (I can do that, right?). Will that
lay out the file structure the same way as the official www sites? I'd
like to have an idea of how things are laid out before I begin my
planning.

Eric

Nik Clayton wrote:
>
> On Thu, Jun 13, 2002 at 09:10:35AM -0500, Eric Anderson wrote:
> > I saw this on the FreeBSD documentation site, as a current project. Is anyone
> > currently working on this?
>
> Not that I know of.
>
> > I would love to jump into this project and see what we
> > can do. What do I need to do to get started?
>
> Come up with a better search engine interface and backend than what
> we're currently using, preferably based on software that's available in
> the ports tree so that it's trivial for mirrors to set up.
>
> "better" in this case is subjective, and I can't recall a thread in here
> that's really covered a 'wishlist' of requirements for a better search
> system.
>
> So let's start one. Off the top of my head:
>
>     The set of search operators is too small. I'd like to be able
>     to limit my search to text that appears in the:
>
>         Subject line
>         Body text
>         From/To address
>
>     I'd like to do queries by date, so that I can search for
>     messages that match only in the last 3 months.
>
>     I'd like the bug that makes it flaky when you search more than
>     three mailing list archives fixed.
>
>     Viewing the thread that a message comes from is painful. Google
>     have solved this in a particularly nice way -- you can view all
>     the messages in a thread, in thread order, on a single page,
>     using their Google Groups interface.
>
>     An alternative way of specifying the mailing lists to search
>     would be nice. Keep the checkboxes, but give me a box where I
>     can type in "arch,current,hackers" to limit the search to just
>     those lists -- I can type that much faster than I can navigate
>     the mouse over to three fairly small interface elements and
>     click. Especially if I have to scroll the screen in order to
>     reach all the checkboxes.
>
>     Lose the requirement to specify "AND" and "OR" as connectives in
>     the query string. The string
>
>         foo bar baz
>
>     should rank messages that feature all three words high in the
>     results. Messages that only feature two of them should be a
>     little lower, and so on. Maybe use the (fairly) standard
>     notation
>
>         +foo +bar baz
>
>     to indicate that 'foo' and 'bar' are mandatory, and that baz is
>     optional.
>
>     It's not clear when you limit the number of search results how
>     the limit is done. Does it just stop when it finds the first
>     'n' results? Or does it gather all of them, order them, and
>     show you the first 'n'? Better to generate a page of 'n'
>     results at a time, where the user can specify how many results
>     they want per page.
>
>     When viewing the results, highlight the terms in the search that
>     matched in the text (maybe).
>
> Anyone else?
>
> N
> --
> FreeBSD: The Power to Serve             http://www.freebsd.org/
> FreeBSD Documentation Project           http://www.freebsd.org/docproj/
>
>   --- 15B8 3FFC DDB4 34B0 AA5F 94B7 93A8 0764 2C37 E375 ---

--
------------------------------------------------------------------
Eric Anderson     Systems Administrator      Centaur Technology
Torque, it makes the world go 'round.
------------------------------------------------------------------
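
A rough sketch of the query handling wished for in the quoted list: bare
terms are optional and raise a message's rank, "+term" is mandatory, and
results are gathered, ordered, and then cut into pages of a size the
user picks. This is only an illustration under stated assumptions. It is
written in Python rather than the Perl mentioned above, it matches on
whitespace-separated lowercase words, and every name in it (Query,
parse_query, score, search, and the messages dictionary of id to body
text) is hypothetical, not part of any existing FreeBSD search code.

```python
from dataclasses import dataclass


@dataclass
class Query:
    required: set  # terms written as +term; a message must contain all of them
    optional: set  # bare terms; each one found raises the message's score


def parse_query(text):
    """Split input such as '+foo +bar baz' into required and optional terms."""
    required, optional = set(), set()
    for token in text.lower().split():
        if token.startswith("+") and len(token) > 1:
            required.add(token[1:])
        elif token != "+":
            optional.add(token)
    return Query(required, optional)


def score(query, body):
    """Count the query terms present in body; None if a required term is absent."""
    words = set(body.lower().split())
    if not query.required <= words:
        return None
    return len((query.required | query.optional) & words)


def search(query_text, messages, page=1, per_page=10):
    """Gather every match, order by score, then return one page of message ids."""
    query = parse_query(query_text)
    hits = []
    for msg_id, body in messages.items():
        s = score(query, body)
        if s:  # skips None (required term missing) and 0 (no term matched)
            hits.append((s, msg_id))
    # Highest score first; ties are broken by message id to keep paging stable.
    hits.sort(key=lambda h: (-h[0], h[1]))
    start = (page - 1) * per_page
    return [msg_id for _, msg_id in hits[start:start + per_page]]
```

With messages m1 = "foo bar baz", m2 = "foo bar" and m3 = "foo", the
query "foo bar baz" returns m1, m2, m3 in that order, while
"+foo +bar baz" returns only m1 and m2. Sorting the full hit list before
slicing is the "gather all of them, order them" behaviour; a real
backend would presumably want an index instead of scanning every body.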