Date:      Sun, 04 Sep 2005 09:36:54 +0300
From:      Rein Kadastik <wigry@uninet.ee>
To:        freebsd-hackers@freebsd.org
Subject:   Re: sed not working
Message-ID:  <431A9606.6010401@uninet.ee>
In-Reply-To: <20050903154526.GA1247@gothmog.gr>
References:  <43196C96.6040504@uninet.ee>	<20050903101800.GA77285@cirb503493.alcatel.com.au>	<43198251.6070606@uninet.ee> <43198354.3000402@uninet.ee>	<4319864A.3040706@uninet.ee> <20050903154526.GA1247@gothmog.gr>

Giorgos Keramidas wrote:

>On 2005-09-03 14:17, Rein Kadastik <wigry@uninet.ee> wrote:
>  
>
>>Rein Kadastik wrote:
>>    
>>
>>>Well, I have one guess here. In the Estonian alphabet, z comes
>>>immediately after s and before t. So when the regex expands [a-z] in
>>>collation order, the characters t, u, v, w, x, y are left out.
>>>
>>>How do I tell sed to use the English alphabet?
>>>      
>>>
>>Well, my guess was right. I have the following line in /etc/profile:
>>
>>export LANG=et_EE.ISO8859-15
>>
>>After I exported LANG=en_US.ISO8859-1, sed started to work.
>>
>>I did not think that the LANG setting would also alter the alphabet, so
>>that the expression [a-z] no longer covers the full alphabet.
>>    
>>
>
>By using a character class:
>
>	[[:alpha:]]
>
>AFAIK, if you are using non-English locales, there's no guarantee that
>[a-z] will be the entire set of lowercase letters, or that it will only
>include lowercase letters, for that matter.
>
Yep, I know, but it does not matter. The form [a-z] is used all over the 
place in the FreeBSD source (1629 lines in 4.11-RELEASE-p11 and almost 
1600 in 5-STABLE). Totally hopeless. It seems that no developer has ever 
heard of character classes, and it is VERY UNSAFE to try to compile (or 
even run) FreeBSD with any locale other than C/en_US.ISO8859-1.
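
For reference, a minimal sketch of the two workarounds discussed in this
thread, assuming a POSIX shell and sed. The variable names and the sample
input are made up for illustration; the input is simply the run of letters
around the ones that fell outside [a-z] under the Estonian locale.

```shell
#!/bin/sh
# Hypothetical sample input: the letters around the gap seen with et_EE.
input='stuvwxyz'

# Workaround 1: pin the C locale for this one command only, so the
# range [a-z] is evaluated in plain byte order (a through z in ASCII).
range_result=$(printf '%s\n' "$input" | LC_ALL=C sed 's/[a-z]/X/g')

# Workaround 2: use a POSIX character class instead of a range.
# Class membership does not depend on the locale's collation order,
# so no locale override is applied here.
class_result=$(printf '%s\n' "$input" | sed 's/[[:lower:]]/X/g')

echo "range with LC_ALL=C: $range_result"
echo "character class:     $class_result"
```

Setting LC_ALL on the command line affects only that one invocation, so
LANG in /etc/profile can stay as it is.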

I actually searched the source code for uses of character classes and 
found around 30 matches, mostly in manual pages. The Perl configure script 
checks whether tr supports them, but it never actually uses the feature 
(even if available).
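
A rough sketch of how such a count could be reproduced, assuming a grep
that supports -r and -F. The helper name count_fixed_string and the
directory argument are illustrative assumptions; this is not necessarily
the command behind the figures above.

```shell
#!/bin/sh
# count_fixed_string PATTERN DIR
# Print the number of lines under DIR that contain PATTERN literally.
# -F treats the pattern as a fixed string, so '[' is not a regex bracket.
count_fixed_string() {
    grep -rF -- "$1" "$2" | wc -l | tr -d ' '
}

# Only scan when a source directory is passed as the first argument,
# for example: sh count.sh /usr/src
src_dir=${1:-}
if [ -n "$src_dir" ] && [ -d "$src_dir" ]; then
    echo "lines with [a-z]: $(count_fixed_string '[a-z]' "$src_dir")"
    echo "lines with [[:  : $(count_fixed_string '[[:' "$src_dir")"
fi
```

Searching for the literal prefix "[[:" catches every class name at once
([[:alpha:]], [[:lower:]], and so on), at the cost of also counting any
unrelated text that happens to contain those three characters.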

I am totally disappointed by this. I thought about reporting a bug, 
but as it is everywhere, there is no point in doing so.

Rein


