Date: Mon, 05 Sep 2005 19:40:20 +1000
From: Tim Robbins <tjr@freebsd.org>
To: Rein Kadastik <wigry@uninet.ee>
Cc: freebsd-hackers@freebsd.org
Subject: Re: sed not working
Message-ID: <431C1284.70207@freebsd.org>
In-Reply-To: <431A9606.6010401@uninet.ee>
References: <43196C96.6040504@uninet.ee>
    <20050903101800.GA77285@cirb503493.alcatel.com.au>
    <43198251.6070606@uninet.ee> <43198354.3000402@uninet.ee>
    <4319864A.3040706@uninet.ee> <20050903154526.GA1247@gothmog.gr>
    <431A9606.6010401@uninet.ee>

Rein Kadastik wrote:
> Giorgos Keramidas wrote:
>
>> On 2005-09-03 14:17, Rein Kadastik <wigry@uninet.ee> wrote:
>>
>>> Rein Kadastik wrote:
>>>
>>>> Well, I have one guess here. In the Estonian alphabet, z comes
>>>> immediately after s and before t. So, as the regex orders [a-z], the
>>>> characters t, u, v, w, x, y are left out.
>>>>
>>>> How do I tell sed to use the English alphabet?
>>>
>>> Well, my guess was right. I have the following line in /etc/profile:
>>>
>>>     export LANG=et_EE.ISO8859-15
>>>
>>> After I exported LANG=en_US.ISO8859-1, sed started to work.
>>>
>>> I did not think that the LANG parameter would also alter the alphabet,
>>> so that the expression [a-z] no longer covers the full alphabet.
>>
>> By using a character class:
>>
>>     [[:alpha:]]
>>
>> AFAIK, if you are using non-English locales, there's no guarantee that
>> [a-z] will be the entire set of lowercase letters, or that it will only
>> include lowercase letters, for that matter.
>
> Yep, I know, but it does not matter. The form [a-z] is used all over
> the place in the FreeBSD source (1629 lines in 4.11-RELEASE-p11 and
> almost 1600 in 5-STABLE). Totally hopeless. It seems that no developer
> has ever heard of character classes, and it is VERY UNSAFE to try to
> compile (and actually even run) FreeBSD with any locale other than
> C/en_US.ISO8859-1.
>
> I actually searched for the existence of character classes in the
> source code. I found around 30 matches, mostly in manual pages. The Perl
> configure script checks whether tr supports them, but it never actually
> uses the feature (even if available).
>
> I am totally disappointed about this. I thought about reporting a
> bug, but as it is everywhere, there is no point in doing so.

I think you're blowing things out of proportion. Provided that you
build world as root (which most people do), and that you don't change
the LANG setting for root (think single-user mode), the following
command will give you an approximate idea of which utilities are
affected:

    $ find /usr/src -name \*.c | xargs grep -e '".*a-z' -e '".*A-Z'
    25

Of these 25 hits, about half are in comments or test code that is never
built. The utilities that are genuinely affected are: kbdmap, scon, ppp
(when using ATM), m4 (in GNU compatibility mode), fdisk, named, cvs,
diff and vi.

Tim
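
For anyone hitting the same problem in their own scripts, here is a minimal sketch of the two workarounds discussed in this thread: forcing the C locale for a single sed invocation, and using a character class instead of a range. The sample string and the variable names are made up for illustration; only LC_ALL, sed and [[:lower:]] come from POSIX.

```shell
#!/bin/sh
# Hypothetical sample input: the letters from s to z, which include the
# ones reported as skipped by [a-z] under et_EE.ISO8859-15.
sample='stuvwxyz'

# Workaround 1: override the locale for this one command only, so that
# the range [a-z] is evaluated in the C locale (plain ASCII order).
c_range=$(printf '%s\n' "$sample" | LC_ALL=C sed 's/[a-z]/X/g')

# Workaround 2: use a character class, which is defined by the locale's
# notion of "lowercase letter" rather than by collation order. This one
# runs in whatever locale the calling shell has.
c_class=$(printf '%s\n' "$sample" | sed 's/[[:lower:]]/X/g')

printf 'range under LC_ALL=C: %s\n' "$c_range"
printf 'class [[:lower:]]:    %s\n' "$c_class"
```

Both lines should show every letter replaced. Setting LC_ALL on the command itself, rather than exporting it, leaves the rest of the session in the user's chosen locale.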