From: Tim Robbins
Date: Mon, 05 Sep 2005 19:40:20 +1000
To: Rein Kadastik
Cc: freebsd-hackers@freebsd.org
Subject: Re: sed not working
Message-ID: <431C1284.70207@freebsd.org>
In-Reply-To: <431A9606.6010401@uninet.ee>

Rein Kadastik wrote:

> Giorgos Keramidas wrote:
>
>> On 2005-09-03 14:17, Rein Kadastik wrote:
>>
>>> Rein Kadastik wrote:
>>>
>>>> Well I have one guess here. In Estonian alphabet, the z comes
>>>> immediately after s and before t.
>>>> So as the regex orders [a-z] the
>>>> characters t, u, v, w, x, y are left out
>>>>
>>>> How to order the sed to use English alphabet?
>>>
>>> Well, my guess was right. I have the following line in the /etc/profile:
>>>
>>> export LANG=et_EE.ISO8859-15
>>>
>>> After I exported LANG=en_US.ISO8859-1, the sed started to work.
>>>
>>> I did not think that the LANG parameter will also alter the alphabet and
>>> therefore the expression [a-z] does not cover the full alphabet
>>> anymore.
>>
>> By using a character class:
>>
>> [[:alpha:]]
>>
>> AFAIK, if you are using non-English locales, there's no guarantee that
>> [a-z] will be the entire set of lowercase letters, or that it will only
>> include lowercase letters, for that matter.
>
> Yep, I know but it does not matter. The form [a-z] is used all over
> the place in the FreeBSD source (1629 lines in 4.11-RELEASE-p11 and
> almost 1600 in 5-STABLE). Totally hopeless. Seems that no developer
> has ever heard about character classes and it is VERY UNSAFE to try to
> compile (and actually even run) FreeBSD with some other locale than
> C/en_US.ISO8859-1.
>
> I actually searched for existence of character classes in source code.
> Found around 30 matches. Mostly in manual pages. Perl configure script
> checks if tr supports them, but it actually never uses the feature
> (even if available).
>
> I am totally disappointed about this. I thought about reporting a
> bug, but as it is everywhere, there is no point to do so.

I think you're blowing things out of proportion.
Providing that you build world as root (which most people do), and that
you don't change the LANG setting for root (think single-user mode), the
following command will give you an approximate idea of which utilities
are affected:

$ find /usr/src -name \*.c | xargs grep -e '".*a-z' -e '".*A-Z'
25

Of these 25 hits, about half are in comments or test code that is never
built. The utilities that are genuinely affected are: kbdmap, scon, ppp
(when using ATM), m4 (in GNU compatibility mode), fdisk, named, cvs, diff
and vi.

Tim
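
For anyone who hits the [a-z] problem discussed in this thread in their
own scripts, here is a minimal sketch of the two usual workarounds: pin
the locale to C for the one command, or use a POSIX character class as
Giorgos suggested. The function names mask_lower_c and mask_lower_class
are hypothetical and exist only for this illustration; nothing here is
taken from the FreeBSD tree.

```shell
#!/bin/sh

# Hypothetical helper: replace every ASCII lowercase letter with X.
# LC_ALL=C applies only to this sed invocation, so [a-z] is the plain
# ASCII range a through z no matter what LANG is set to in the caller.
mask_lower_c() {
    LC_ALL=C sed -e 's/[a-z]/X/g'
}

# Hypothetical helper: same idea, but with a character class.
# [[:lower:]] asks for "lowercase letters in the current locale", so it
# does not depend on where z sorts in that locale's collation order.
mask_lower_class() {
    sed -e 's/[[:lower:]]/X/g'
}

# Both lines print XXXXX for this ASCII-only input.
printf 'style\n' | mask_lower_c
printf 'style\n' | mask_lower_class
```

The two are not equivalent for non-ASCII input: under LC_ALL=C the
range only ever covers the 26 ASCII letters, while [[:lower:]] may also
match locale-specific lowercase letters. Which one is wanted depends on
whether the script is parsing fixed ASCII syntax or user text.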