From: Christopher Vance
To: Tim Robbins
Cc: freebsd-i18n@freebsd.org
Date: Fri, 21 Nov 2003 12:24:00 +1100
Subject: Re: /bin/ls incorrectly displays names of files on UTF-8 locales
Message-ID: <20031121012400.GG12532@aurema.com>
In-Reply-To: <20031121011303.GB67377@wombat.robbins.dropbear.id.au>
List-Id: FreeBSD Internationalization Effort

On Fri, Nov 21, 2003 at 12:13:03PM +1100, Tim Robbins wrote:
>ls is trying to avoid writing what it thinks are non-printable characters,
>to avoid screwing up the terminal by writing control characters etc.
>It doesn't understand multibyte characters, though, so the output is
>incorrect. (It doesn't understand characters that take up more than
>one column on the screen, either.) There's already a PR about this problem,
>but I haven't found the time to fix it; it involves scanning the string
>with mbtowc() and checking each character with iswprint().
>
>The other programs work correctly because they do not check for non-printable
>characters.

What character set did Alex have his terminal (program) set to?

If the terminal was set to a character set with high-bit data, ls
should just pump the data out and let the terminal (program) handle
multibyte rendering.

As you've said, column alignment indicates a need for ls to know the
character count. Cursor addressing might solve part of the problem,
but not all.

-- 
Christopher Vance
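
For reference, the fix Tim describes (scan the string with mbtowc() and check each character with iswprint()) might look roughly like the sketch below. This is not the actual ls(1) code: the function name printable_copy, the '?' replacement character, and the buffer-based interface are all assumptions made for illustration. It also assumes the caller has already called setlocale(LC_CTYPE, "") so that decoding follows the user's locale.

```c
#include <stdlib.h>
#include <string.h>
#include <wchar.h>
#include <wctype.h>

/*
 * Hypothetical helper, not taken from ls(1): copy src into dst,
 * replacing each non-printable or undecodable character with '?'.
 * Decoding follows the current LC_CTYPE locale.
 * Returns the number of bytes written, excluding the terminating NUL.
 */
size_t
printable_copy(const char *src, char *dst, size_t dstsize)
{
	size_t left = strlen(src);	/* bytes of input not yet consumed */
	size_t out = 0;			/* bytes written to dst so far */
	wchar_t wc;
	int n;

	if (dstsize == 0)
		return 0;
	(void)mbtowc(NULL, NULL, 0);	/* reset the conversion state */

	while (left > 0) {
		n = mbtowc(&wc, src, left);
		if (n <= 0) {
			/* Invalid or truncated sequence: emit '?', skip one byte. */
			(void)mbtowc(NULL, NULL, 0);
			if (out + 1 >= dstsize)
				break;
			dst[out++] = '?';
			src++;
			left--;
			continue;
		}
		if (iswprint((wint_t)wc)) {
			/* Printable: copy the whole multibyte sequence unchanged. */
			if (out + (size_t)n >= dstsize)
				break;
			memcpy(dst + out, src, (size_t)n);
			out += (size_t)n;
		} else {
			/* Control or other non-printable character. */
			if (out + 1 >= dstsize)
				break;
			dst[out++] = '?';
		}
		src += n;
		left -= (size_t)n;
	}
	dst[out] = '\0';
	return out;
}
```

This only addresses the printability check. Column alignment would additionally need the display width of each character (POSIX wcwidth() is one candidate), which the sketch does not attempt.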