From owner-freebsd-chat Tue Jul 9 17:35:34 2002
Date: Tue, 9 Jul 2002 17:35:27 -0700 (PDT)
Message-Id: <200207100035.RAA19820@eskimo.com>
From: Ross Lippert
To: anderson@centtech.com
Cc: joseph@randomnetworks.com, freebsd-doc@freebsd.org, freebsd-chat@freebsd.org
In-reply-to: <3D2B43EF.955661FC@centtech.com> (message from Eric Anderson on Tue, 09 Jul 2002 15:13:35 -0500)
Subject: Re: Beta FreeBSD search engine

Well, isn't the FreeBSD documentation at least kept under directories
which have 2-letter codes? That alone would tell you the language of
each document.

Alternatively, if you clustered documents by the words they have in
common, the corpora for the different languages should show up as
distinct clusters, since documents in the same language share far more
vocabulary with each other than with documents in any other language.
If you are already doing any word co-occurrence calculations, I'm sure
they could be modified to produce a language classification.

-r
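
The clustering suggestion could be sketched roughly as follows. This is only an illustration, not anything the search engine is known to do: the function names, the cosine similarity measure, the single-link grouping, and the 0.3 threshold are all assumptions chosen for the example. A real index would need a threshold tuned on its own documents.

```python
from collections import Counter
import math
import re


def word_counts(text):
    """Return a lowercased word-frequency vector for one document."""
    # [^\W\d_]+ matches runs of letters, including non-ASCII ones.
    return Counter(re.findall(r"[^\W\d_]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-frequency vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster_by_shared_words(docs, threshold=0.3):
    """Group document indices whose word vectors are similar enough.

    Single-link grouping (an assumption for this sketch): two documents
    end up in the same cluster if a chain of pairs, each with similarity
    at or above the threshold, connects them.
    """
    vectors = [word_counts(d) for d in docs]
    parent = list(range(len(docs)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(docs)):
        for j in range(i + 1, len(docs)):
            if cosine(vectors[i], vectors[j]) >= threshold:
                parent[find(j)] = find(i)

    clusters = {}
    for i in range(len(docs)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Made-up sample documents: two English, two German.
sample_docs = [
    "the ports tree is the collection of the software ports",
    "the kernel is the core of the system",
    "der Kernel ist der Kern des Systems und der Treiber",
    "die Dokumentation ist der Teil, der das System beschreibt",
]
print(cluster_by_shared_words(sample_docs))
```

Common function words ("the", "is", "der", "ist") dominate the counts, which is what pulls documents of one language together. Shared technical loanwords such as "kernel" produce only a weak similarity across languages, so they stay below the threshold in this small sample.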