Date: Fri, 7 May 2004 10:27:33 -0500
From: Dan Nelson <dnelson@allantgroup.com>
To: "Christoph P. Kukulies" <kuku@kukulies.org>
Cc: Kris Kennaway <kris@obsecurity.org>
Subject: Re: tr A-Z a-z
Message-ID: <20040507152733.GA12942@dan.emsphone.com>
In-Reply-To: <20040507112448.GA49142@kukulies.org>
References: <200405070853.i478r1UM048075@www.kukulies.org>
    <20040507085901.GA31936@xor.obsecurity.org>
    <20040507112448.GA49142@kukulies.org>

In the last episode (May 07), Christoph P. Kukulies said:
> On Fri, May 07, 2004 at 01:59:01AM -0700, Kris Kennaway wrote:
> > On Fri, May 07, 2004 at 10:53:01AM +0200, Christoph Kukulies wrote:
> > > Strange: I was used to do upper case lower case conversion always like this
> > > and it suddenly doesn't work anymore:
> > >
> > > $ echo Z | tr "[A-Z]" "[a-z]"
> > > ÿ
> >
> > Something locale-related?
>
> locale
> LANG=en_US.ISO_8859-1
> LC_COLLATE="en_US.ISO_8859-1"
>
> echo Z | tr "A-Z" "a-z" | od -x
> 0000000 0aff
> 0000002

From the tr manpage:

     c-c     For non-octal range endpoints represents the range of
             characters between the range endpoints, inclusive, in
             ascending order, as defined by the collation sequence.

Note that 8859-1 has uppercase and lowercase accented characters, which
collate alongside the unaccented characters.
/usr/src/share/colldef/la_LN.ISO8859-1.src holds the collation sequence
for en_US.ISO_8859-1.  There are three lowercase y's, but only two
uppercase Y's.  This means that your ranges are different sizes, and Z
maps to <y:>, which happens to be 0xff in the 8859-1 charset.

> It must be something too obvious but I don't see it at the moment.
>
> I found that it depends on my special environment settings.
> A different user doesn't have this problem.

-- 
	Dan Nelson
	dnelson@allantgroup.com
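
Since the manpage excerpt above says that range endpoints follow the collation sequence, two usual ways to sidestep the problem are to run tr in the C locale for that one command, or to use character classes instead of ranges. This is only a sketch, assuming a POSIX-style tr; the variable names lower_c and lower_class are made up for illustration.

```shell
# Option 1: force the C locale for this one tr invocation, so that
# A-Z and a-z are plain byte ranges of equal length.
lower_c=$(printf 'Z\n' | LC_ALL=C tr 'A-Z' 'a-z')

# Option 2: use character classes, which ask tr for the locale's own
# upper-to-lower case mapping instead of pairing up two ranges.
lower_class=$(printf 'Z\n' | tr '[:upper:]' '[:lower:]')

printf '%s %s\n' "$lower_c" "$lower_class"
```

The first form changes the locale only for tr, not for the rest of the shell session. The second form keeps the current locale, so accented letters should be case-mapped too where the locale defines a mapping for them.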