Date: Sun, 04 Sep 2005 09:36:54 +0300
From: Rein Kadastik <wigry@uninet.ee>
To: freebsd-hackers@freebsd.org
Subject: Re: sed not working
Message-ID: <431A9606.6010401@uninet.ee>
In-Reply-To: <20050903154526.GA1247@gothmog.gr>
References: <43196C96.6040504@uninet.ee> <20050903101800.GA77285@cirb503493.alcatel.com.au> <43198251.6070606@uninet.ee> <43198354.3000402@uninet.ee> <4319864A.3040706@uninet.ee> <20050903154526.GA1247@gothmog.gr>

Giorgos Keramidas wrote:

>On 2005-09-03 14:17, Rein Kadastik <wigry@uninet.ee> wrote:
>
>>Rein Kadastik wrote:
>>
>>>Well, I have one guess here. In the Estonian alphabet, z comes
>>>immediately after s and before t. So, as the regex orders [a-z], the
>>>characters t, u, v, w, x, y are left out.
>>>
>>>How do I tell sed to use the English alphabet?
>>
>>Well, my guess was right. I have the following line in /etc/profile:
>>
>>export LANG=et_EE.ISO8859-15
>>
>>After I exported LANG=en_US.ISO8859-1, sed started to work.
>>
>>I did not think that the LANG parameter would also alter the alphabet,
>>so that the expression [a-z] no longer covers the full alphabet.
>
>By using a character class:
>
>    [[:alpha:]]
>
>AFAIK, if you are using non-English locales, there's no guarantee that
>[a-z] will be the entire set of lowercase letters, or that it will only
>include lowercase letters, for that matter.

Yep, I know, but it does not matter. The form [a-z] is used all over the
place in the FreeBSD source (1629 lines in 4.11-RELEASE-p11 and almost
1600 in 5-STABLE). Totally hopeless. It seems that no developer has ever
heard of character classes, and it is VERY UNSAFE to try to compile (and
actually even run) FreeBSD with any locale other than
C/en_US.ISO8859-1.

I actually searched the source code for uses of character classes and
found around 30 matches, mostly in manual pages. The Perl configure
script checks whether tr supports them, but it never actually uses the
feature (even if available).

I am totally disappointed about this. I thought about reporting a bug,
but as it is everywhere, there is no point in doing so.

Rein
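
To make the collation effect described above concrete, here is a minimal sketch in Python. It does not call sed or any regex library; it only simulates a range [a-z] that is evaluated by collation order instead of by byte value. The string ESTONIAN_ORDER and the helper names in_range and covered are assumptions made for this illustration, based on the usual Estonian alphabet order in which z follows s.

```python
import string

# Assumed Estonian alphabet order for illustration: z (and its accented
# neighbours) sits right after s, ahead of t, u, v, w, x, y.
ESTONIAN_ORDER = "abcdefghijklmnopqrsšzžtuvwõäöüxy"

# The C/POSIX locale orders ASCII letters by byte value.
C_ORDER = string.ascii_lowercase


def in_range(ch, lo, hi, order):
    """Return True if ch collates between lo and hi (inclusive) in order."""
    return order.index(lo) <= order.index(ch) <= order.index(hi)


def covered(lo, hi, order):
    """ASCII lowercase letters that a collation-ordered [lo-hi] would match."""
    return [
        c for c in string.ascii_lowercase
        if c in order and in_range(c, lo, hi, order)
    ]


matched_c = covered("a", "z", C_ORDER)
matched_et = covered("a", "z", ESTONIAN_ORDER)
missing_et = sorted(set(string.ascii_lowercase) - set(matched_et))

print("C order matches:       ", "".join(matched_c))
print("Estonian order matches:", "".join(matched_et))
print("Estonian order misses: ", "".join(missing_et))
```

Under the assumed ordering, the simulated range stops at z and so skips t, u, v, w, x and y, which matches the behaviour reported in this thread. In practice the workarounds mentioned above apply: run the tool under the C locale (for example with LC_ALL=C set for that command) or use a character class such as [[:alpha:]] or [[:lower:]] instead of [a-z].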