Date: Tue, 9 Jul 2002 22:16:42 +0200
From: Brad Knowles <brad.knowles@skynet.be>
To: Eric Anderson <anderson@centtech.com>, Ross Lippert <ripper@eskimo.com>
Cc: joseph@randomnetworks.com, freebsd-doc@freebsd.org, freebsd-chat@freebsd.org
Subject: Re: Beta FreeBSD search engine
Message-ID: <a05111b3cb950f4f23d0e@[10.0.1.15]>
In-Reply-To: <3D2B43EF.955661FC@centtech.com>
References: <200207091944.MAA05507@eskimo.com> <3D2B43EF.955661FC@centtech.com>

At 3:13 PM -0500 2002/07/09, Eric Anderson wrote:

> Ok, all good thoughts.. One question:
>
> How can I determine a language for a page by looking at it?

You need dictionaries of words in various languages, then you do a sort | uniq of all words in the document and compare it against the language dictionaries. The language dictionary with the highest number of hits is most likely to be the one in which the document is written.

--
Brad Knowles, <brad.knowles@skynet.be>

"They that can give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety."
    -Benjamin Franklin, Historical Review of Pennsylvania.

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-doc" in the body of the message
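
The dictionary-comparison approach described in the reply above could be sketched roughly as follows. This is only an illustration, not code from the thread: the function names (unique_words, guess_language) and the tiny word lists are hypothetical, and a real setup would load full dictionaries for each language instead.

```python
import re

# Hypothetical, deliberately tiny word lists for illustration only;
# a real setup would load a full dictionary per language.
DICTIONARIES = {
    "english": {"the", "and", "is", "of", "to", "in", "that", "it", "you", "for"},
    "german": {"der", "die", "das", "und", "ist", "nicht", "ein", "ich", "zu", "mit"},
    "french": {"le", "la", "les", "et", "est", "un", "une", "je", "pas", "dans"},
}


def unique_words(text):
    """Rough equivalent of `sort | uniq`: the distinct lowercase words."""
    # [^\W\d_]+ matches runs of letters, including accented ones.
    return set(re.findall(r"[^\W\d_]+", text.lower()))


def guess_language(text, dictionaries=DICTIONARIES):
    """Return (best_language, hits_per_language); best is None if no word matched."""
    words = unique_words(text)
    # Count how many distinct document words appear in each dictionary.
    hits = {lang: len(words & vocab) for lang, vocab in dictionaries.items()}
    best = max(hits, key=hits.get)
    return (best if hits[best] > 0 else None), hits


if __name__ == "__main__":
    language, hits = guess_language("The cat is in the house and it is happy")
    print(language, hits)
```

Counting distinct words rather than every occurrence mirrors the sort | uniq step, so a single very frequent word cannot dominate the score. Ties between languages are not handled here; max() simply returns the first language with the highest count.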