From owner-freebsd-hackers Mon Sep 18 05:13:35 1995
Return-Path: owner-hackers
Received: (from root@localhost) by freefall.freebsd.org (8.6.12/8.6.6) id FAA13714 for hackers-outgoing; Mon, 18 Sep 1995 05:13:35 -0700
Received: from expo.x.org (expo.x.org [198.112.45.11]) by freefall.freebsd.org (8.6.12/8.6.6) with ESMTP id FAA13709 for ; Mon, 18 Sep 1995 05:13:32 -0700
Received: from exalt.x.org by expo.x.org id AA24431; Mon, 18 Sep 95 07:47:51 -0400
Received: from localhost by exalt.x.org id HAA15090; Mon, 18 Sep 1995 07:47:42 -0400
Message-Id: <199509181147.HAA15090@exalt.x.org>
To: Terry Lambert
Cc: hackers@freefall.FreeBSD.org
Subject: Re: Policy on printf format specifiers?
In-Reply-To: Your message of Sun, 17 Sep 1995 13:04:17 EST. <199509172004.NAA06540@phaeton.artisoft.com>
Organization: X Consortium
Date: Mon, 18 Sep 1995 07:47:42 EST
From: "Kaleb S. KEITHLEY"
Sender: owner-hackers@FreeBSD.org
Precedence: bulk

> I'd like to add a format specifier '%S' to the list of format specifiers
> accepted by printf. Well, kernel printf, anyway.
>
> Its purpose would be to print wchar_t strings that are "NULL" terminated;
> the actual output would include the embedded NULL's. This would be true
> 16 bit character output.
>
> I'd also like the wchar_t value to be 16 rather than 32 bits.

That would be a serious mistake. All modern OSes are using 32-bit wchar_t.
Don't take a step backward.

> Other
> than page 0 (Unicode), no other code pages in ISO-10646 have yet been
> allocated.

Er, I don't have my copy of 10646 here at home. As I recall page 0 is
just Latin1. If page 0 is in fact Unicode, which already has encodings
for every written language on Earth, then what would 10646 need any
other pages for? The 2.1.0-SNAP has Japanese EUC and Cyrillic code
pages, which, as I recall, are not on page 0.

> This would affect constant ISO 8859-1 strings using the 'L' qualifier;
> for example:
>
>
> main()
> {
> 	printf( "%S\n", L"Hello World");
> }
>

To print a wide-character string you should convert it to a multi-byte
string with wcstombs and then print it.

Because you're asking for 16-bit wchar_t I presume you have a large number
of strings and are concerned about the amount of space they'll use when
stored in your program file. If that's the case your strings should be
stored in locale-specific message catalogs.

Because the size of wchar_t varies, i.e. 16-bit on some systems, 32-bit on
others, you never store wchar_t strings in a file. You always convert them
to multi-byte strings with wcstombs before writing to a file. Since the
locale the file was created in is not recorded in the file, the burden is
on the user to remember and use the correct locale when rereading the file
and converting it back to a wchar_t string with mbstowcs.

--
Kaleb KEITHLEY
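
A minimal sketch of the wcstombs/mbstowcs round trip described above. The
helper names wide_to_mb, mb_to_wide and print_wide are made up for
illustration, and this assumes a hosted (userland) C library rather than
the kernel printf. It also assumes the caller has already selected the
locale it wants with setlocale; without that call the default "C" locale
applies.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: convert a wide string to a newly allocated
 * multi-byte string in the current locale. Returns NULL on failure;
 * the caller frees the result. */
char *wide_to_mb(const wchar_t *ws)
{
    /* MB_CUR_MAX is the most bytes one character can need in this locale,
     * so this buffer always has room for the terminating '\0'. */
    size_t cap = wcslen(ws) * MB_CUR_MAX + 1;
    char *buf = malloc(cap);
    if (buf == NULL)
        return NULL;
    if (wcstombs(buf, ws, cap) == (size_t)-1) {
        /* Some character has no encoding in the current locale. */
        free(buf);
        return NULL;
    }
    return buf;
}

/* Hypothetical helper: the reverse conversion, for rereading stored text.
 * The current locale must match the one the bytes were written in. */
wchar_t *mb_to_wide(const char *s)
{
    /* A multi-byte string never yields more wide characters than bytes. */
    size_t cap = strlen(s) + 1;
    wchar_t *buf = malloc(cap * sizeof *buf);
    if (buf == NULL)
        return NULL;
    if (mbstowcs(buf, s, cap) == (size_t)-1) {
        /* Invalid multi-byte sequence for the current locale. */
        free(buf);
        return NULL;
    }
    return buf;
}

/* Hypothetical helper: print a wide string with plain %s after conversion.
 * Returns printf's result, or -1 if the conversion failed. */
int print_wide(const wchar_t *ws)
{
    char *mb = wide_to_mb(ws);
    if (mb == NULL)
        return -1;
    int rc = printf("%s\n", mb);
    free(mb);
    return rc;
}
```

With these, the quoted example becomes print_wide(L"Hello World"), and no
new format specifier is needed. The same wide_to_mb output is what you
would write to a file; mb_to_wide recovers the wchar_t string on reread,
provided the same locale is in effect.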