From owner-freebsd-hackers Mon Jan 30 18:22:11 1995
Return-Path: hackers-owner
Received: (from root@localhost) by freefall.cdrom.com (8.6.9/8.6.6) id SAA19724
          for hackers-outgoing; Mon, 30 Jan 1995 18:22:11 -0800
Received: from seagull.rtd.com (root@Seagull.rtd.com [198.102.68.2])
          by freefall.cdrom.com (8.6.9/8.6.6) with ESMTP id SAA19718
          for ; Mon, 30 Jan 1995 18:22:08 -0800
Received: (from dgy@localhost) by seagull.rtd.com (8.6.9/8.6.9.1) id TAA21204
          for freebsd-hackers@freefall.cdrom.com; Mon, 30 Jan 1995 19:21:50 -0700
From: Don Yuniskis
Message-Id: <199501310221.TAA21204@seagull.rtd.com>
Subject: ispell / sed bug
To: freebsd-hackers@freefall.cdrom.com
Date: Mon, 30 Jan 1995 19:21:49 -0700 (MST)
X-Mailer: ELM [version 2.4 PL24]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Content-Length: 2894
Sender: hackers-owner@FreeBSD.org
Precedence: bulk

> > Actually, I believe 3.1.08 *also* suffers from a broken sed problem.
> > Could you elaborate on the nature of the bugs/problems you are aware
> > of?  Thx, --don
>
> I think the problem is with the creation of affixes from dictionary,
> in which sed is not involved:
> Collecting input.
> Finding flag marker.
> Generating roots and affixes.
> Expanding dictionary into EXPANDEDPAIRS.
> Creating list of legal roots/flags.
> Creating list of flags that participate in cross-products.
> Finding prefix and suffix flags.
> Creating awk script.
> Creating cross expansions (pass 0).
> Finding illegal cross expansions (pass 0).
> Creating cross expansions (pass 1).
> Finding illegal cross expansions (pass 1).
> Creating cross expansions (pass 2).
> Finding illegal cross expansions (pass 2).
> Creating cross expansions (pass 3).
> Finding illegal cross expansions (pass 3).
> Finding roots of cross expansions.
>
> Illegal affix flag character 'a'
>
> Illegal affix flag character 'a'
>
> Illegal affix flag character 'a'
>
> etc.

No!
This bug is caused exactly by the sed bug I've mentioned!  The "D" and
"P" operators for sed don't work (P inserts a NUL instead of a NL and D
flushes the line prematurely).  As a result, when munchlist tries to
break an entry of the form

    alamo/ABCDE

into

    alamo/A
    alamo/B
    alamo/C
    etc.

(purely fictitious example), it instead generates

    alamo/Aalamo/BCDE

(or something like that).  However, the first "/" is used to delimit
the start of the "flags".  So, this is erroneously interpreted as
"alamo" with the flags "Aalamo/BCDE".  If you look at the list of
"Illegal affix flag character" error messages, you'll note that each
successive "illegal flag" is actually the next letter of a word from
the dictionary.  So, you'll end up with several MB of error messages if
you build a big dictionary!

An important note:  This bug is present in 3.1.08 also!  The reason you
never noticed the errors in a 3.1.08 build is that the older versions
"forced" each flag to be valid (i.e. flag = toupper(flag); for all
practical purposes).  The newer code actually verifies the flag as
being legal or illegal... hence the abundance of error messages.

I have, unfortunately, not spent the time to *prove* that the resulting
dictionaries contain illegal suffix/affix combinations (sorry, not high
on my todo list :-( )

> There is another buglet in the munchlist script (the 'SIGNED' variable
> is incorrect), which is fixed in the last version.

Ah, I hadn't noticed this...

> What exactly is the sed problem?
>
> Jean-Marc.

GNU sed 2.05 works correctly (at least, it doesn't exhibit the D & P
bug).  However, I haven't tried to rebuild ispell in quite some time...
I'd appreciate any feedback others have to offer (since I supher frum
pore spelink lik moest enjinears... or, is it perhaps, laziness???)

Mercy buckets!
--don
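
P.S.  A minimal sketch of the failure described above, in Python rather
than the sed that munchlist actually uses.  It is not the munchlist or
ispell code: the function names are hypothetical, and the rule that only
uppercase A-Z are legal flags is an assumption made for illustration.
It shows the intended one-flag-per-line split, and then how the
fictitious joined line gets read once the newline between pieces is
lost.

```python
def split_flags(entry):
    """Expand 'root/ABC' into one 'root/flag' entry per flag."""
    root, _, flags = entry.partition("/")
    return [f"{root}/{flag}" for flag in flags]


def parse_entry(line):
    """Split at the FIRST '/': everything after it is taken as flags."""
    root, _, flags = line.partition("/")
    return root, flags


def illegal_flags(flags):
    """Characters a strict check would reject (assumption: A-Z are legal)."""
    return [c for c in flags if not ("A" <= c <= "Z")]


# What the split is supposed to produce.
print(split_flags("alamo/ABCDE"))

# The fictitious broken output: the newline after the first piece is
# missing, so two entries run together on one line.
root, flags = parse_entry("alamo/Aalamo/BCDE")
print(root, flags)

# Each rejected "flag" is really the next letter of the following word.
print(illegal_flags(flags))
```

A lenient reader that simply upper-cases every flag would accept most
of those characters without complaint, which would be consistent with
the older build staying quiet.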