Date: Tue, 07 Feb 2006 02:53:46 +0100
From: Martin Krzysiak <cinek@gmx.de>
To: freebsd-stable@FreeBSD.ORG
Subject: Re: tr(1) buggy with de_DE.ISO8859-1(5) locale?
Message-ID: <43E7FDAA.3010409@gmx.de>
In-Reply-To: <200602061658.k16GwqLr068150@lurza.secnetix.de>
References: <200602061658.k16GwqLr068150@lurza.secnetix.de>
Oliver Fromme wrote:

> It's not a bug. It's perfectly POSIX-compatible.

I think POSIX leaves this behavior "undefined", as I found in some
documents. That is a difference.

> To convert lower case to upper case, use the command
> "tr '[:lower:]' '[:upper:]'" (or enumerate all letters
> explicitly, like "tr abcdef ABCDEF"). Scripts that
> use things like "tr a-z A-Z" are broken and need to be
> fixed.

It's not only upper/lower case conversion that is weird. Try
"echo wxyz | tr w-z a-d". Ranges are broken generally in ISO locales,
in my opinion.

> By the way: Do not set LANG or LC_ALL, especially for
> the root user, and especially when compiling things.

One thing I like about FreeBSD is that I have my German environment.
But you are right. The only locale that is expected to work correctly
is "C".

> Not only will tr behave in unexpected ways when used
> like above, but also other things might break. For
> example, German month names appear in "ls -l", which
> will break scripts that try to parse them.

Don't tell me about localization problems. I've seen lots of stupid
things. The latest one was a localized "Date:" header produced by a
commercial application.

> Some tools
> use decimal commas instead of decimal points, which
> can lead to further confusion, etc. Yes, scripts
> which try to do that are broken, but they do exist.

Yes. You are right. How many times did you use tr(1) to convert your
texts to upper/lower case? Do you expect that it works correctly? I
would prefer to use it like "tr a-zäöü A-ZÄÖÜ", _if_ I ever need to
do it.

> If you only need support for German umlauts, then only
> set LC_CTYPE. That shouldn't break anything.

I really, really appreciate that FreeBSD supports German locales.
Let's stop arguing. I just wanted to ask about the behavior. Now I
know that something might be fishy with tr(1) and I understand how to
avoid this problem. That's all I need to know.

For people who are interested in a simple workaround: don't use
de_DE.ISO8859-1(5). Use de_DE.UTF-8 instead. tr(1)'s ranges work as
expected there.

Martin
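
For scripts that have to run under any locale, the two approaches discussed in this thread can be sketched as follows. This is only a minimal illustration, not a tested fix for the FreeBSD behavior: the function names to_upper and map_range_c are made up for this example, and the sketch assumes a POSIX sh and tr. The first function uses character classes instead of a range. The second pins the locale to "C" for the single tr invocation, so the range follows byte order regardless of the user's LANG or LC_ALL.

```shell
#!/bin/sh
# Hypothetical helper names; they are not part of any existing tool.

# Upper-case the argument using POSIX character classes, which
# avoids collation-dependent ranges such as a-z.
to_upper() {
    printf '%s\n' "$1" | tr '[:lower:]' '[:upper:]'
}

# Map the range w-z to a-d with the locale pinned to "C" for this
# one tr call only; the rest of the environment stays untouched.
map_range_c() {
    printf '%s\n' "$1" | LC_ALL=C tr 'w-z' 'a-d'
}

to_upper "hello"      # prints HELLO
map_range_c "wxyz"    # prints abcd
```

Pinning LC_ALL on the command line affects only that one command, so an interactive German environment can stay in place while scripts still get predictable ranges.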