From owner-freebsd-hackers  Mon Sep 18 10:32:26 1995
Message-Id: <199509181727.KAA09594@netcom10.netcom.com>
To: Poul-Henning Kamp <phk@critter.tfs.com>
cc: Terry Lambert <terry@lambert.org>, hackers@freefall.freebsd.org
Subject: Re: Policy on printf format specifiers? 
In-reply-to: Your message of "Mon, 18 Sep 95 05:31:58 PDT."
             <6585.811427518@critter.tfs.com> 
Date: Mon, 18 Sep 95 10:26:58 -0700
From: Bakul Shah <bakul@netcom.com>
Sender: owner-hackers@FreeBSD.org
Precedence: bulk

> As far as I recall there is still some concern about Sanskrit and 10646
> isn't there ?

Last I looked, Unicode handled Sanskrit and other Indian
languages fine.  [Support for Indian languages is dear to my
heart, so I looked into it back when Unicode-1 was being
worked on -- AFAIK there have been no changes in this area
since then.]

Presumably Terry wants Unicode support in the kernel so that
one can print kernel messages in any language.  While I
agree with his sentiment, IMHO we have a long way to go
before that becomes critical.  First we need a filesystem
that supports Unicode file names, common applications need
to support Unicode input/output, etc.

Hmm....  Support for reading/writing of Unicode filenames
may be required in the kernel.  How else can you deal with
code like

	sprintf(name, "%s.core", p->p_comm);

where p_comm points to a Unicode filename?
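
One way this might be handled, purely as a sketch: assume the
name is kept as 0-terminated 16-bit code units and that the
filesystem accepts UTF-8 byte strings (neither is settled in this
thread).  A plain "%s" would stop at the first zero byte of a
16-bit string, so the name has to be converted before it is
formatted.  ucs2_to_utf8() and core_name() are made-up names, not
existing kernel functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: encode a 0-terminated string of 16-bit code
 * units as UTF-8.  Returns the number of bytes written (excluding the
 * terminating NUL), or -1 if dst is too small.  Surrogates are not
 * handled; each unit is treated as one character.
 */
static int
ucs2_to_utf8(const uint16_t *src, char *dst, size_t dstlen)
{
	size_t n = 0;

	assert(src != NULL && dst != NULL && dstlen > 0);
	for (; *src != 0; src++) {
		uint16_t c = *src;
		size_t need = c < 0x80 ? 1 : c < 0x800 ? 2 : 3;

		/* Leave room for these bytes plus the final NUL. */
		if (n + need >= dstlen)
			return (-1);
		if (need == 1) {
			dst[n++] = (char)c;
		} else if (need == 2) {
			dst[n++] = (char)(0xC0 | (c >> 6));
			dst[n++] = (char)(0x80 | (c & 0x3F));
		} else {
			dst[n++] = (char)(0xE0 | (c >> 12));
			dst[n++] = (char)(0x80 | ((c >> 6) & 0x3F));
			dst[n++] = (char)(0x80 | (c & 0x3F));
		}
	}
	dst[n] = '\0';
	return ((int)n);
}

/*
 * Hypothetical stand-in for the sprintf() call above: build
 * "<comm>.core" from a 16-bit command name.  Returns 0 on success,
 * -1 if the result does not fit.
 */
static int
core_name(const uint16_t *comm, char *name, size_t namelen)
{
	char buf[64];

	if (ucs2_to_utf8(comm, buf, sizeof(buf)) < 0)
		return (-1);
	/* sizeof(".core") counts the terminating NUL. */
	if (strlen(buf) + sizeof(".core") > namelen)
		return (-1);
	sprintf(name, "%s.core", buf);
	return (0);
}
```

The point is only that some conversion step has to sit between
the 16-bit name and the byte-oriented "%s"; whether it belongs in
the kernel or elsewhere is the open question.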

Bruce writes:
> I think wchar_t's were made 32 bits so that they are the same as rune_t's.
> I don't know if this is important.

I too think 16 bits is good enough.  10646 is a 32-bit
standard, but no planes other than the one Unicode occupies
are populated, and Unicode supports all living and many
(most?) dead languages.  Even if planes beyond Unicode are
ever used, no one except scholars of dead languages (a tiny,
tiny percentage of people) will benefit.  Allowing for such
an extension now is IMHO a waste of space.  rune_t can be
made 16 bits, too.

> How are you supposed to print such strings in ANSI C?

If and when true wchar_t support becomes a reality, one
presumes fonts for at least the local language and English
will be supported.  Window systems, with their bazillion
fonts, should have no problem :-).

Printf support for wchar_t (and wchar_t *) should really be
specified by the standards people.  If they haven't, maybe
they should be petitioned.
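
For illustration, here is a sketch of what such support could
look like, assuming a C library whose printf family accepts an
`l` modifier on `%s` (written `%ls`) for wchar_t strings, as C99
specifies.  format_wide() is a made-up name.  The conversion to
bytes follows the current locale, so in the default "C" locale
only ASCII characters can be relied on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <wchar.h>

/*
 * Hypothetical helper: format a wide string into a byte buffer using
 * the %ls conversion.  Returns the number of bytes written (excluding
 * the NUL), or -1 on conversion failure or truncation.
 */
static int
format_wide(char *dst, size_t dstlen, const wchar_t *ws)
{
	int n;

	assert(dst != NULL && ws != NULL);
	n = snprintf(dst, dstlen, "[%ls]", ws);
	if (n < 0 || (size_t)n >= dstlen)
		return (-1);
	return (n);
}
```

A call such as format_wide(buf, sizeof(buf), L"core") leaves the
bytes "[core]" in buf; a wide character the locale cannot
represent makes the conversion fail rather than print garbage.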

--bakul