Date:      Mon, 30 Jan 1995 19:21:49 -0700 (MST)
From:      Don Yuniskis <dgy@seagull.rtd.com>
To:        freebsd-hackers@freefall.cdrom.com
Subject:   ispell / sed bug
Message-ID:  <199501310221.TAA21204@seagull.rtd.com>

>  > Actually, I believe 3.1.08 *also* suffers from a broken sed problem.
>  > Could you elaborate on the nature of the bugs/problems you are aware
>  > of?   Thx, --don
> 
> I think the problem is with the creation of affixes from dictionary,
> in which sed is not involved:
> 	Collecting input.
> 	Finding flag marker.
> 	Generating roots and affixes.
> 	Expanding dictionary into EXPANDEDPAIRS.
> 	Creating list of legal roots/flags.
> 	Creating list of flags that participate in cross-products.
> 	Finding prefix and suffix flags.
> 	Creating awk script.
> 	Creating cross expansions (pass 0).
> 	Finding illegal cross expansions (pass 0).
> 	Creating cross expansions (pass 1).
> 	Finding illegal cross expansions (pass 1).
> 	Creating cross expansions (pass 2).
> 	Finding illegal cross expansions (pass 2).
> 	Creating cross expansions (pass 3).
> 	Finding illegal cross expansions (pass 3).
> 	Finding roots of cross expansions.
> 
> 	Illegal affix flag character 'a'
> 
> 	Illegal affix flag character 'a'
> 
> 	Illegal affix flag character 'a'
> 
> 	etc.

No!  This bug is caused exactly by the sed bug I've mentioned!
The "D" and "P" operators for sed don't work (P inserts a NUL
instead of a NL, and D flushes the line prematurely).  As a
result, when munchlist tries to break an entry of the form
	alamo/ABCDE
into
	alamo/A
	alamo/B
	alamo/C
	etc.
(purely fictitious example), it instead generates
	alamo/Aalamo/BCDE
(or something like that).  However, the first "/" is used to delimit
the start of the "flags".  So, this is erroneously interpreted as
"alamo" with the flags "Aalamo/BCDE".  If you look at the list of
"Illegal affix flag character" error messages, you'll note that each
successive "illegal flag" is actually the next letter of a word from the
dictionary.  So, you'll end up with several MB of error messages if
you build a big dictionary!
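
To make the failure concrete, here is a rough Python sketch of the split
described above and of how the corrupted line gets misread.  This is not
munchlist's code (munchlist does this step with sed); the function names
split_flags and parse_entry are made up for illustration, and the
corrupted line is the same "something like that" example as above.

```python
def split_flags(entry):
    """Expand 'root/FLAGS' into one 'root/F' line per flag (the intended result)."""
    root, sep, flags = entry.partition("/")
    if not sep:
        # No flag marker at all: pass the word through unchanged.
        return [entry]
    return [f"{root}/{flag}" for flag in flags]


def parse_entry(line):
    """Read a line the way described above: the first '/' starts the flags."""
    root, _, flags = line.partition("/")
    return root, flags


# Intended behaviour: one line per flag.
print(split_flags("alamo/ABCDE"))

# Illustrative corrupted line, where the newline between records was lost.
# Everything after the first '/' is taken as flags.
print(parse_entry("alamo/Aalamo/BCDE"))
```

The second call returns the root "alamo" with the flag string
"Aalamo/BCDE", so the letters of the next copy of the word end up being
checked as flags.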

An important note:  This bug is present in 3.1.08 also!  The reason you 
never noticed the errors in a 3.1.08 build was because the older versions
"forced" each flag to be valid (i.e. flag = toupper(flag); for all
practical purposes.)  The newer code actually verifies the flag as 
being legal or illegal... hence the abundance of error messages.
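
A small sketch of the difference, in Python rather than ispell's C.
The set of legal flags used here (plain uppercase A-Z) and both
function names are assumptions for illustration only; the real flag
set comes from the affix file.

```python
import string

# Assumption: treat uppercase ASCII letters as the legal flags.
LEGAL_FLAGS = set(string.ascii_uppercase)


def old_style_flags(flags):
    """Older behaviour as described: force each flag (flag = toupper(flag))."""
    return [flag.upper() for flag in flags]


def new_style_illegal_flags(flags):
    """Newer behaviour as described: verify each flag and report the bad ones."""
    return [flag for flag in flags if flag not in LEGAL_FLAGS]


# Flag string taken from the illustrative corrupted line "alamo/Aalamo/BCDE",
# cut at the second '/' to keep the example short.
corrupted = "Aalamo"

print(old_style_flags(corrupted))          # silently turned into uppercase flags
print(new_style_illegal_flags(corrupted))  # each lowercase letter is reported
```

With forcing, the stray letters quietly become flags; with verification,
each one would produce an "Illegal affix flag character" message,
starting with 'a'.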

I have, unfortunately, not spent the time to *prove* that the
resulting dictionaries contain illegal suffix/affix combinations
(sorry, not high on my todo list  :-( )

> There is another buglet in the munchlist script (the 'SIGNED' variable
> is incorrect), which is fixed in the last version.

Ah, I hadn't noticed this...

> What exactly is the sed problem?
> 
> Jean-Marc.

GNU sed 2.05 works correctly (at least, it doesn't exhibit the D & P bug).
However, I haven't tried to rebuild ispell in quite some time... I'd
appreciate any feedback others have to offer (since I supher frum pore
spelink lik moest enjinears... or, is it perhaps, laziness???)

Mercy buckets!
--don


