From owner-freebsd-i18n@FreeBSD.ORG Thu Nov 20 10:47:37 2003
From: "Alex Deiter"
Date: Thu, 20 Nov 2003 21:47:33 +0300
Organization: MTS Komi
Message-ID: <003001c3af96$c2336850$6701320a@komi.mts.ru>
Subject: /bin/ls incorrectly displays names of files on UTF-8 locales
List-Id: FreeBSD Internationalization Effort

/bin/ls incorrectly displays names of files on UTF-8 locales
(ports/misc/utf8locale):

$ locale
LANG=ru_RU.UTF-8
LC_CTYPE="ru_RU.UTF-8"
LC_COLLATE="ru_RU.UTF-8"
LC_TIME="ru_RU.UTF-8"
LC_NUMERIC="ru_RU.UTF-8"
LC_MONETARY="ru_RU.UTF-8"
LC_MESSAGES="ru_RU.UTF-8"
LC_ALL=ru_RU.UTF-8
$ touch проба
$ ls -l проба
-rw-r--r-- 1 test test 0 19 ноя 15:17 п▒?оба

However, ls | cat (ls|less, ls|sort, etc.) works correctly:

$ ls -l проба | cat
-rw-r--r-- 1 test test 0 19 ноя 15:17 проба

Why?

All other programs from /bin and /sbin work correctly.

Thanks!

From owner-freebsd-i18n@FreeBSD.ORG Thu Nov 20 17:09:21 2003
From: Tim Robbins
To: Alex Deiter
Cc: freebsd-i18n@freebsd.org
Date: Fri, 21 Nov 2003 12:13:03 +1100
Message-ID: <20031121011303.GB67377@wombat.robbins.dropbear.id.au>
In-Reply-To: <003001c3af96$c2336850$6701320a@komi.mts.ru>
Subject: Re: /bin/ls incorrectly displays names of files on UTF-8 locales

On Thu, Nov 20, 2003 at 09:47:33PM +0300, Alex Deiter wrote:
> /bin/ls incorrectly displays names of files on UTF-8 locales
> (ports/misc/utf8locale):
[...]
> Why?
>
> All other programs from /bin and /sbin work correctly.

ls is trying to avoid writing what it thinks are non-printable characters,
to avoid screwing up the terminal by writing control characters etc.
It doesn't understand multibyte characters, though, so the output is
incorrect. (It doesn't understand characters that take up more than
one column on the screen, either.) There's already a PR about this problem,
but I haven't found the time to fix it; it involves scanning the string
with mbtowc() and checking each character with iswprint().

The other programs work correctly because they do not check for non-printable
characters.

Tim

From owner-freebsd-i18n@FreeBSD.ORG Thu Nov 20 17:24:08 2003
From: Christopher Vance
To: Tim Robbins
Cc: freebsd-i18n@freebsd.org
Date: Fri, 21 Nov 2003 12:24:00 +1100
Message-ID: <20031121012400.GG12532@aurema.com>
In-Reply-To: <20031121011303.GB67377@wombat.robbins.dropbear.id.au>
Subject: Re: /bin/ls incorrectly displays names of files on UTF-8 locales

On Fri, Nov 21, 2003 at 12:13:03PM +1100, Tim Robbins wrote:
>ls is trying to avoid writing what it thinks are non-printable characters,
>to avoid screwing up the terminal by writing control characters etc.
>It doesn't understand multibyte characters, though, so the output is
>incorrect. (It doesn't understand characters that take up more than
>one column on the screen, either.) There's already a PR about this problem,
>but I haven't found the time to fix it; it involves scanning the string
>with mbtowc() and checking each character with iswprint().
>
>The other programs work correctly because they do not check for non-printable
>characters.

What character set did Alex have his terminal (program) set to?

If the terminal was set to a character set with high-bit data, ls
should just pump the data out and let the terminal (program) handle
multibyte rendering.

As you've said, column alignment indicates a need for ls to know the
character count. Cursor addressing might solve part of the problem,
but not all.

-- 
Christopher Vance
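
[Editorial sketch, not part of the thread.] The fix Tim describes, decoding the name one multibyte character at a time and testing each with iswprint(), might look roughly like the following. This is only an illustration under stated assumptions, not the actual ls(1) code or the patch attached to the PR. The function name sanitize_name() and its interface are hypothetical, and it uses the restartable mbrtowc() in place of the mbtowc() named in the thread so that decoding state is explicit.

```c
#include <assert.h>
#include <string.h>
#include <wchar.h>
#include <wctype.h>

/*
 * Hypothetical helper (not from the ls source): copy src into dst,
 * replacing each non-printable or undecodable character with '?'.
 * Decoding follows the current LC_CTYPE locale, so a real program
 * would call setlocale(LC_CTYPE, "") first.
 * Returns the number of characters written. This is not a column
 * count: double-width characters would additionally need wcwidth().
 */
int
sanitize_name(const char *src, char *dst, size_t dstsize)
{
	mbstate_t st;
	size_t left = strlen(src);
	size_t used = 0;
	int nchars = 0;

	assert(dst != NULL && dstsize > 0);
	memset(&st, 0, sizeof st);

	while (left > 0) {
		wchar_t wc;
		size_t n = mbrtowc(&wc, src, left, &st);
		int printable;

		if (n == 0)
			break;		/* embedded NUL: end of string */
		if (n == (size_t)-1 || n == (size_t)-2) {
			/* Invalid or truncated sequence: skip one byte. */
			memset(&st, 0, sizeof st);
			n = 1;
			printable = 0;
		} else {
			printable = iswprint((wint_t)wc) != 0;
		}

		if (printable) {
			/* Copy the whole multibyte sequence unchanged. */
			if (used + n >= dstsize)
				break;
			memcpy(dst + used, src, n);
			used += n;
		} else {
			if (used + 1 >= dstsize)
				break;
			dst[used++] = '?';
		}
		nchars++;
		src += n;
		left -= n;
	}
	dst[used] = '\0';
	return nchars;
}
```

Under a UTF-8 locale, a call such as sanitize_name(name, buf, sizeof buf) should pass a name like the one in Alex's report through intact, because each two-byte sequence is classified as one character rather than as two separate bytes. Control characters such as a tab are still replaced with '?'.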