Date: Wed, 07 Nov 2001 23:25:34 -0500
From: Joseph Jacobson <jacobson@pobox.com>
To: admin@twwells.com
Cc: freebsd-hackers@freebsd.org
Subject: Re: missing words, lots of them
Message-ID: <200111080425.fA84PZ921358@bjork.quonix.net>
In-Reply-To: Message from admin@twwells.com of "Tue, 25 Sep 2001 05:51:47 EDT." <E15losV-000BIQ-00@twwells.com>
> These words, 830 of them, were obtained by intersecting the words
> in a number of lexicons and then subtracting the words in
> /usr/share/dict/web2. This all done with words that contain only
> lowercase letters.
>
> You'll find those words at the end of this message. Should you
> take even a cursory look at this list, I expect you'll be appalled
> at the words that are not in the lexicon.
>
> The point is *not* that these words should be added. The point is
> that a cursory, in-my-sleep check of the word list shows glaring
> deficiencies. A serious audit of the list will find way many more
> missing words (I did a preliminary -- think ~50,000-100,000
> missing words if it is supposed to approximate the contents of an
> unabridged dictionary.)
>
> Anyway, I'm willing to create a replacement list, if it's likely
> to actually get used.

I'm wondering whether anything became of this. It would be nice to
have a relatively complete word list.

http://www.puzzlers.org/secure/wordlists/dictinfo.html contains a
good summary of publicly available word lists. IMHO, the ENABLE list
mentioned there
(http://www.puzzlers.org/secure/wordlists/enable_readme.txt) seems
like a good candidate for a drop-in replacement.

Joe

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message
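
For anyone wanting to repeat the check described in the quoted message
(intersect several lexicons, keep only all-lowercase words, then
subtract /usr/share/dict/web2), here is a rough sketch in Python. It is
not the script that produced the original 830-word list; the function
names and the one-word-per-line file format are assumptions made for
illustration.

```python
def load_words(path):
    """Read one word per line, keeping only words made of lowercase a-z."""
    words = set()
    with open(path, encoding="utf-8", errors="replace") as fh:
        for line in fh:
            word = line.strip()
            # isascii + isalpha + islower together mean "only a-z".
            if word and word.isascii() and word.isalpha() and word.islower():
                words.add(word)
    return words


def missing_words(lexicon_paths, baseline_path):
    """Words found in every lexicon but absent from the baseline list.

    lexicon_paths and baseline_path are hypothetical parameters: any
    plain-text word lists with one entry per line should work.
    """
    lexicons = [load_words(p) for p in lexicon_paths]
    if not lexicons:
        return []
    common = set.intersection(*lexicons)
    return sorted(common - load_words(baseline_path))


# Example call (the lexicon file names are placeholders):
# for w in missing_words(["enable.txt", "other.txt"], "/usr/share/dict/web2"):
#     print(w)
```

Intersecting first keeps the result conservative: a word is reported
only if every lexicon agrees it exists, which presumably is why the
original list came out at only 830 entries.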