From: Rein Kadastik
Date: Sun, 04 Sep 2005 09:36:54 +0300
To: freebsd-hackers@freebsd.org
Subject: Re: sed not working
Message-ID: <431A9606.6010401@uninet.ee>
In-Reply-To: <20050903154526.GA1247@gothmog.gr>

Giorgos Keramidas wrote:

>On 2005-09-03 14:17, Rein Kadastik wrote:
>
>>Rein Kadastik wrote:
>>
>>>Well, I have one guess here. In the Estonian alphabet, z comes
>>>immediately after s and before t. So, as the regex orders [a-z], the
>>>characters t, u, v, w, x, y are left out.
>>>
>>>How do I tell sed to use the English alphabet?
>>>
>>Well, my guess was right.
>>I have the following line in /etc/profile:
>>
>>export LANG=et_EE.ISO8859-15
>>
>>After I exported LANG=en_US.ISO8859-1, sed started to work.
>>
>>I did not think that the LANG parameter would also alter the alphabet,
>>so that the expression [a-z] no longer covers the full alphabet.
>>
>
>By using a character class:
>
>    [[:alpha:]]
>
>AFAIK, if you are using non-English locales, there's no guarantee that
>[a-z] will be the entire set of lowercase letters, or that it will only
>include lowercase letters, for that matter.
>

Yep, I know, but it does not matter. The form [a-z] is used all over the
place in the FreeBSD source (1629 lines in 4.11-RELEASE-p11 and almost
1600 in 5-STABLE). Totally hopeless. It seems that no developer has ever
heard about character classes, and it is VERY UNSAFE to try to compile
(and actually even run) FreeBSD with any locale other than
C/en_US.ISO8859-1.

I actually searched for the existence of character classes in the source
code and found around 30 matches, mostly in manual pages. The Perl
configure script checks whether tr supports them, but it never actually
uses the feature (even if available).

I am totally disappointed about this. I thought about reporting a bug,
but as it is everywhere, there is no point in doing so.

Rein
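
For reference, here is a minimal sketch of the two workarounds mentioned
in this thread, assuming a POSIX sh and sed. The function names
strip_lower_c and strip_lower_class are hypothetical and do not come from
the FreeBSD source. They only illustrate pinning the locale for one
command versus using a character class.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not taken from the FreeBSD source.

# Workaround 1: force the C locale for this single sed invocation, so the
# range [a-z] means exactly the 26 ASCII lowercase letters regardless of
# what LANG is set to in /etc/profile.
strip_lower_c() {
    printf '%s\n' "$1" | LC_ALL=C sed 's/[a-z]//g'
}

# Workaround 2: use a POSIX character class instead of a range, so the
# match follows the current locale's definition of a lowercase letter
# rather than its collation order.
strip_lower_class() {
    printf '%s\n' "$1" | sed 's/[[:lower:]]//g'
}

# The letters s..z include the ones reported as skipped under et_EE.
strip_lower_c 'stuvwxyz-123'
strip_lower_class 'stuvwxyz-123'
```

The first form is probably the safer one for existing scripts that are
full of [a-z], since LC_ALL overrides LANG and the other LC_* variables
for that command only and the pattern itself stays untouched.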