From: Dan Nelson
To: "Christoph P. Kukulies"
cc: Christoph Kukulies
cc: questions@freebsd.org
cc: Kris Kennaway
Date: Fri, 7 May 2004 10:27:33 -0500
Subject: Re: tr A-Z a-z
Message-ID: <20040507152733.GA12942@dan.emsphone.com>
In-Reply-To: <20040507112448.GA49142@kukulies.org>

In the last episode (May 07), Christoph P. Kukulies said:
> On Fri, May 07, 2004 at 01:59:01AM -0700, Kris Kennaway wrote:
> > On Fri, May 07, 2004 at 10:53:01AM +0200, Christoph Kukulies wrote:
> > > Strange: I was used to do upper case lower case conversion always like this
> > > and it suddenly doesn't work anymore:
> > > 
> > > $ echo Z | tr "[A-Z]" "[a-z]"
> > > ÿ
> > 
> > Something locale-related?
> 
> locale
> LANG=en_US.ISO_8859-1
> LC_COLLATE="en_US.ISO_8859-1"
> 
> echo Z | tr "A-Z" "a-z" | od -x
> 0000000 0aff
> 0000002

From the tr manpage:

     c-c        For non-octal range endpoints represents the range of
                characters between the range endpoints, inclusive, in
                ascending order, as defined by the collation sequence.

Note that 8859-1 has uppercase and lowercase accented characters, which
collate alongside the unaccented characters.
/usr/src/share/colldef/la_LN.ISO8859-1.src holds the collation sequence
for en_US.ISO_8859-1.  There are three lowercase y's (y, ý, ÿ), but only
two uppercase Y's (Y, Ý).  This means that your ranges are different
sizes, and Z maps to ÿ, which happens to be 0xff in the 8859-1 charset.

> It must be something too obvious but I don't see it at the moment.
> 
> I found that it depends on my special environment settings.
> A different user doesn't have this problem.

-- 
	Dan Nelson
	dnelson@allantgroup.com
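
Not part of the original message: a minimal sketch of two common ways to sidestep the range-size mismatch described above, assuming a POSIX sh and a standard tr(1). The function names are made up for illustration. The first pins the C locale for the one tr call, so A-Z and a-z are plain 26-byte ranges; the second uses character classes, which POSIX defines to pair each uppercase character with its lowercase counterpart in the current locale.

```shell
#!/bin/sh
# Hypothetical helper names; neither appears in the original thread.

# Option 1: force the C locale for this single tr invocation, so the
# ranges A-Z and a-z are byte ranges of equal length (26 each).
lower_c_locale() {
    LC_ALL=C tr 'A-Z' 'a-z'
}

# Option 2: use character classes instead of ranges; tr matches each
# uppercase character to its lowercase equivalent per LC_CTYPE.
lower_char_class() {
    tr '[:upper:]' '[:lower:]'
}

echo Z | lower_c_locale
echo Z | lower_char_class
```

Option 1 leaves accented characters untouched, while option 2 should also lowercase them when the locale defines the mapping, so which one fits depends on the input.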