Date: Thu, 29 Jan 2009 06:20:54 +0000
From: Frank Shute
To: Svein Halvor Halvorsen
Cc: questions@freebsd.org
Subject: Re: printf and utf-8
Message-ID: <20090129062054.GA19589@melon.esperance-linux.co.uk>
In-Reply-To: <497E41B8.2030203@lvor.halvorsen.cc>
References: <497E31EE.9010202@lvor.halvorsen.cc> <0B02CEE8-D38A-4D94-B76D-49721BDDACF0@mac.com> <497E41B8.2030203@lvor.halvorsen.cc>

On Tue, Jan 27, 2009 at 12:05:28AM +0100, Svein Halvor Halvorsen wrote:
>
> Chuck Swiger wrote:
> >On Jan 26, 2009, at 1:58 PM, Svein Halvor Halvorsen wrote:
> >>As far as I can see, printf is not calculating strings lengths
> >>correctly when using utf-8 encoding. Either that, or I'm using byte
> >>count, and can't find the character count :-/
> >
> >printf(1) explicitly states that it works with ASCII and ANSI
> >X3.159-1989 (``ANSI C89'') character escapes, and it also notes:
> >
> >     Multibyte characters are not recognized in format strings (this is
> >     only a problem if `%' can appear inside a multibyte character).
> >
> >Some platforms have a printf_l(3) which is locale/xlocale-aware, but
> >there doesn't seem to be a corresponding CLI utility which understands
> >Unicode/UTF8/widechars.
>
> Thanks for your explanation.
>
> Do you have a suggestion to solve the following problem without using
> printf(1):
>
> I have a text file that I want to print in a "box" on a terminal from a
> shell script. Now I've padded the lines with spaces to a certain length
> using printf %-70s and appended the box drawing character. Is there
> another simple way that will work with utf-8?
>

What's your perl like?

http://search.cpan.org/~sadahiro/String-Multibyte-1.05/Multibyte.pm

http://perldoc.perl.org/perlfaq6.html#How-can-I-match-strings-with-multibyte-characters%3f

Looks like they might be interesting.

Regards,

-- 
Frank

Contact info: http://www.shute.org.uk/misc/contact.html
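
For illustration only, here is a minimal sketch of the "pad by characters, not bytes" idea from the quoted question. It is written in Python rather than with the Perl modules linked above, and it is an assumption about one way to do it, not something proposed in the thread. The names display_width and boxed are hypothetical, and the width rule (combining marks take 0 columns, East Asian wide or fullwidth characters take 2, everything else 1) is only an approximation of what a terminal really does.

```python
import unicodedata


def display_width(text):
    """Approximate the number of terminal columns that text occupies.

    Assumption: combining marks take 0 columns, East Asian wide and
    fullwidth characters take 2, and every other character takes 1.
    """
    width = 0
    for ch in text:
        if unicodedata.combining(ch):
            continue  # combining marks are drawn on the previous character
        width += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return width


def boxed(lines, inner_width=70):
    """Return the rows of a box drawn around lines.

    Each line is padded with spaces up to inner_width columns, counted
    by display_width rather than by bytes. Lines wider than inner_width
    are left as they are, so they will overflow the box.
    """
    rows = ["┌" + "─" * inner_width + "┐"]
    for line in lines:
        pad = max(inner_width - display_width(line), 0)
        rows.append("│" + line + " " * pad + "│")
    rows.append("└" + "─" * inner_width + "┘")
    return rows
```

From a shell script, the text file could be read line by line (with the trailing newline stripped) and the rows returned by boxed() printed one per line. The difference from printf %-70s is that a string such as "blåbær" is 8 bytes in UTF-8 but is counted here as 6 columns, so the right-hand border lines up.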