From owner-freebsd-hackers  Mon Sep 18 05:13:35 1995
Return-Path: owner-hackers
Received: (from root@localhost)
          by freefall.freebsd.org (8.6.12/8.6.6) id FAA13714
          for hackers-outgoing; Mon, 18 Sep 1995 05:13:35 -0700
Received: from expo.x.org (expo.x.org [198.112.45.11])
          by freefall.freebsd.org (8.6.12/8.6.6) with ESMTP id FAA13709
          for <hackers@freefall.FreeBSD.org>; Mon, 18 Sep 1995 05:13:32 -0700
Received: from exalt.x.org by expo.x.org id AA24431; Mon, 18 Sep 95 07:47:51 -0400
Received: from localhost by exalt.x.org id HAA15090; Mon, 18 Sep 1995 07:47:42 -0400
Message-Id: <199509181147.HAA15090@exalt.x.org>
To: Terry Lambert <terry@lambert.org>
Cc: hackers@freefall.FreeBSD.org
Subject: Re: Policy on printf format specifiers? 
In-Reply-To: Your message of Sun, 17 Sep 1995 13:04:17 EST.
             <199509172004.NAA06540@phaeton.artisoft.com> 
Organization: X Consortium
Date: Mon, 18 Sep 1995 07:47:42 EST
From: "Kaleb S. KEITHLEY" <kaleb@x.org>
Sender: owner-hackers@FreeBSD.org
Precedence: bulk


> I'd like to add a format specifier '%S' to the list of format specifiers
> accepted by printf.  Well, kernel printf, anyway.
> 
> Its purpose would be to print wchar_t strings that are "NULL" terminated;
> the actual output would include the embedded NULL's.  This would be true
> 16 bit character output.
> 
> I'd also like the wchar_t value to be 16 rather than 32 bits.  

That would be a serious mistake. All modern OSes are using 32-bit wchar_t.
Don't take a step backward.

> Other
> than page 0 (Unicode), no other code pages in ISO-10646 have yet been
> allocated.

Er, I don't have my copy of 10646 here at home. As I recall page 0 is just
Latin1. If page 0 is in fact Unicode, which already has encodings for every 
written language on Earth, then what would 10646 need any other pages for?

The 2.1.0-<mumble>SNAP has Japanese EUC and Cyrillic code pages, which,
as I recall, are not on page 0.

> This would affect constant ISO 8859-1 strings using the 'L' qualifier;
> for example:
> 
> 
> main()
> {
> 	printf( "%S\n", L"Hello World");
> }
> 

To print a wide-character string, convert it to a multibyte string with
wcstombs and then print it. Because you're asking for a 16-bit wchar_t,
I presume you have a large number of strings and are concerned about the
space they'll take up in your program file. If that's the case, your
strings should be stored in locale-specific message catalogs.
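
As a rough sketch of the convert-then-print approach (the helper names
wide_to_mb and print_wide are made up for illustration, and the buffer is
sized from MB_CUR_MAX on the assumption that no character needs more bytes
than that in the current locale):

```c
#include <stdio.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper: convert a wide string to a freshly allocated
 * multibyte string in the current locale. Returns NULL on allocation
 * failure or if a character cannot be represented. Caller frees. */
char *wide_to_mb(const wchar_t *ws)
{
    /* Worst case: every wide character needs MB_CUR_MAX bytes. */
    size_t cap = wcslen(ws) * MB_CUR_MAX + 1;
    char *buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    /* wcstombs returns (size_t)-1 on an unconvertible character. */
    if (wcstombs(buf, ws, cap) == (size_t)-1) {
        free(buf);
        return NULL;
    }
    return buf;
}

/* Hypothetical helper: print a wide string with plain "%s".
 * Returns 0 on success, -1 on failure. */
int print_wide(const wchar_t *ws)
{
    char *mb = wide_to_mb(ws);
    if (mb == NULL)
        return -1;
    int rc = printf("%s\n", mb);
    free(mb);
    return rc < 0 ? -1 : 0;
}
```

A caller would normally run setlocale(LC_CTYPE, "") first and then call
print_wide(L"Hello World"); without that the program stays in the "C"
locale, where characters outside the basic set may fail to convert.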

Because the size of wchar_t differs between systems (16-bit on some,
32-bit on others), you never store wchar_t strings in a file. You always
convert them to multibyte strings with wcstombs before writing to a file.
Since the locale the file was created in is not recorded in the file, the
burden is on the user to remember and use the correct locale when
rereading the file and converting its contents back to a wchar_t string
with mbstowcs.
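
A minimal sketch of that round trip, under some assumptions of mine: the
helper names save_wide and load_wide are invented, each string is stored
as one line, a line fits in 256 bytes, and the same locale is in effect
for both the write and the read:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: write ws to fp as one line of multibyte text.
 * Returns 0 on success, -1 on failure. */
int save_wide(FILE *fp, const wchar_t *ws)
{
    size_t cap = wcslen(ws) * MB_CUR_MAX + 1;   /* worst-case size */
    char *mb = malloc(cap);
    int rc = -1;

    if (mb == NULL)
        return -1;
    if (wcstombs(mb, ws, cap) != (size_t)-1 &&
        fputs(mb, fp) != EOF && fputc('\n', fp) != EOF)
        rc = 0;
    free(mb);
    return rc;
}

/* Hypothetical helper: read one line from fp and convert it back to a
 * freshly allocated wide string. Returns NULL on failure. Caller frees. */
wchar_t *load_wide(FILE *fp)
{
    char line[256];                 /* assumed maximum line length */
    size_t len;
    wchar_t *ws;

    if (fgets(line, sizeof line, fp) == NULL)
        return NULL;
    len = strlen(line);
    if (len > 0 && line[len - 1] == '\n')
        line[--len] = '\0';         /* strip the line terminator */

    /* A multibyte string of len bytes holds at most len characters. */
    ws = malloc((len + 1) * sizeof *ws);
    if (ws == NULL)
        return NULL;
    if (mbstowcs(ws, line, len + 1) == (size_t)-1) {
        free(ws);
        return NULL;
    }
    return ws;
}
```

Nothing in the file says which locale produced it, so load_wide only gives
back the original string if the reader has selected the same locale the
writer used.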

--

Kaleb KEITHLEY