Date:      Mon, 18 Sep 1995 07:47:42 EST
From:      "Kaleb S. KEITHLEY" <kaleb@x.org>
To:        Terry Lambert <terry@lambert.org>
Cc:        hackers@freefall.FreeBSD.org
Subject:   Re: Policy on printf format specifiers? 
Message-ID:  <199509181147.HAA15090@exalt.x.org>
In-Reply-To: Your message of Sun, 17 Sep 1995 13:04:17 EST. <199509172004.NAA06540@phaeton.artisoft.com> 


> I'd like to add a format specifier '%S' to the list of format specifiers
> accepted by printf.  Well, kernel printf, anyway.
> 
> Its purpose would be to print wchar_t strings that are "NULL" terminated;
> the actual output would include the embedded NULL's.  This would be true
> 16 bit character output.
> 
> I'd also like the wchar_t value to be 16 rather than 32 bits.  

That would be a serious mistake. All modern OSes are using 32-bit wchar_t.
Don't take a step backward.

> Other
> than page 0 (Unicode), no other code pages in ISO-10646 have yet been
> allocated.

Er, I don't have my copy of 10646 here at home. As I recall page 0 is just
Latin1. If page 0 is in fact Unicode, which already has encodings for every 
written language on Earth, then what would 10646 need any other pages for?

The 2.1.0-<mumble>SNAP has Japanese EUC and Cyrillic code pages, which,
as I recall, are not on page 0.

> This would affect constant ISO 8859-1 strings using the 'L' qualifier;
> for example:
> 
> 
> main()
> {
> 	printf( "%S\n", L"Hello World");
> }
> 

To print a wide-character string you should convert it to a multi-byte
string with wcstombs and then print that. Because you're asking for a
16-bit wchar_t, I presume you have a large number of strings and are
concerned about the amount of space they'll use when stored in your
program file. If that's the case, your strings should be stored in
locale-specific message catalogs.
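
As a rough illustration of the convert-then-print approach, here is a
minimal sketch. The helper name print_wide is made up for this example;
only wcstombs, wcslen, MB_CUR_MAX and printf are standard. It assumes
the program has already selected its locale (for instance with
setlocale(LC_CTYPE, "")); without that, the conversion runs in the "C"
locale and may reject non-ASCII characters.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <wchar.h>

/*
 * print_wide (hypothetical helper, not an existing libc function):
 * convert a wide-character string to a multi-byte string in the
 * current locale and print it followed by a newline.
 * Returns 0 on success, -1 if allocation or conversion fails.
 */
int print_wide(const wchar_t *ws)
{
    assert(ws != NULL);

    /* Worst case: every wide character needs MB_CUR_MAX bytes,
     * plus one byte for the terminating NUL. */
    size_t cap = wcslen(ws) * MB_CUR_MAX + 1;
    char *mbs = malloc(cap);
    if (mbs == NULL)
        return -1;

    /* wcstombs returns (size_t)-1 if some character cannot be
     * represented in the current locale's encoding. */
    size_t n = wcstombs(mbs, ws, cap);
    if (n == (size_t)-1) {
        free(mbs);
        return -1;
    }

    printf("%s\n", mbs);
    free(mbs);
    return 0;
}
```

A caller would write print_wide(L"Hello World") where the quoted example
used the proposed %S specifier.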

Because the size of wchar_t varies (16-bit on some systems, 32-bit on
others), you should never store wchar_t strings in a file. Always convert
them to multi-byte strings with wcstombs before writing them out. Since
the locale the file was created in is not recorded in the file, the burden
is on the user to remember and use the correct locale when rereading the
file and converting its contents back to a wchar_t string with mbstowcs.
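
The file round trip described above might look something like the
following sketch. The names save_wide and load_wide are invented for
illustration; the standard calls are wcstombs, mbstowcs and the stdio
functions. It assumes both functions run under the same locale, since
nothing in the file says which encoding was used.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <wchar.h>

/* save_wide (hypothetical): write ws to path as a multi-byte string
 * in the current locale. Returns 0 on success, -1 on failure. */
int save_wide(const char *path, const wchar_t *ws)
{
    assert(path != NULL && ws != NULL);

    size_t cap = wcslen(ws) * MB_CUR_MAX + 1;   /* worst-case size */
    char *mbs = malloc(cap);
    if (mbs == NULL)
        return -1;

    size_t n = wcstombs(mbs, ws, cap);
    if (n == (size_t)-1) {                      /* unconvertible character */
        free(mbs);
        return -1;
    }

    FILE *fp = fopen(path, "wb");
    if (fp == NULL) {
        free(mbs);
        return -1;
    }
    size_t written = fwrite(mbs, 1, n, fp);
    int rc = (fclose(fp) == 0 && written == n) ? 0 : -1;
    free(mbs);
    return rc;
}

/* load_wide (hypothetical): read the file back and convert it to a
 * newly allocated wchar_t string, or return NULL on failure. The
 * caller must be in the locale the file was written in, and must
 * free() the result. */
wchar_t *load_wide(const char *path)
{
    assert(path != NULL);

    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return NULL;
    if (fseek(fp, 0, SEEK_END) != 0) {
        fclose(fp);
        return NULL;
    }
    long size = ftell(fp);
    if (size < 0) {
        fclose(fp);
        return NULL;
    }
    rewind(fp);

    char *mbs = malloc((size_t)size + 1);
    if (mbs == NULL) {
        fclose(fp);
        return NULL;
    }
    size_t got = fread(mbs, 1, (size_t)size, fp);
    fclose(fp);
    mbs[got] = '\0';

    /* A multi-byte string of 'got' bytes holds at most 'got'
     * wide characters, so got + 1 elements always suffice. */
    wchar_t *ws = malloc((got + 1) * sizeof *ws);
    if (ws == NULL) {
        free(mbs);
        return NULL;
    }
    size_t n = mbstowcs(ws, mbs, got + 1);
    free(mbs);
    if (n == (size_t)-1) {                      /* invalid byte sequence */
        free(ws);
        return NULL;
    }
    return ws;
}
```

Note that the file on disk holds only bytes; the width of wchar_t on the
writing system never appears in it, which is the point of converting.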

--

Kaleb KEITHLEY


