Date:      Mon, 18 Sep 1995 13:19:02 -0700 (MST)
From:      Terry Lambert <terry@lambert.org>
To:        phk@critter.tfs.com (Poul-Henning Kamp)
Cc:        bde@zeta.org.au, hackers@FreeBSD.ORG, terry@lambert.org
Subject:   Re: Policy on printf format specifiers?
Message-ID:  <199509182019.NAA08435@phaeton.artisoft.com>
In-Reply-To: <6760.811430864@critter.tfs.com> from "Poul-Henning Kamp" at Sep 18, 95 06:27:44 am

> > >I'd like to add a format specifier '%S' to the list of format specifiers
> > >accepted by printf.  Well, kernel printf, anyway.
> > 
> > I don't want wchar_t's in the kernel.
> I also fail to see the need for this, and even if I did see the need, I
> still think we shouldn't have them in the kernel...

The need is Unicode encoding of file names without yanking around the
value of MAXPATHLEN or MAXNAMLEN, which runic encoding of the file name
data would force.

There is no method of determining the runic length required for an
unknown file name before it is entered.

File name entry will occur in process encoding.

By divorcing process and storage encoding, you increase the length of a
non-7-bit-ASCII string unpredictably when transforming it into the
storage encoding format.

How many characters do you let the user enter in the "file name" dialog
before you tell them they've entered too many by visual or auditory feedback?

If your storage encoding is UTF-8, as in Plan 9, then the answer is you
can allow them no more than 51 characters for file names, unless you
provide a prohibitively expensive (in terms of interactive response
time) "check" callback for character entry.
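
As a rough sketch of where a figure like 51 comes from: it is what you
get by dividing a 255-byte name limit by a worst case of 5 bytes per
character. Both numbers are taken from elsewhere in this message, and
the function name below is made up for illustration, not an existing
kernel or libc interface.

```c
/*
 * Hypothetical helper: the number of characters that can always be
 * accepted without inspecting them, given a storage limit in bytes
 * and the worst-case encoded size of a single character.
 */
static int
safe_char_limit(int name_bytes, int worst_bytes_per_char)
{
	/* Integer division rounds down, so the result never overflows. */
	return (name_bytes / worst_bytes_per_char);
}
```

With a limit of 255 bytes, safe_char_limit(255, 5) yields 51, and
safe_char_limit(255, 2) yields 127.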

Even if you implement such an expensive callback (after all, everyone
will be running P6's, right?), the overall length allowed still varies
depending on which set of glyphs is used.

That is, if you use ISO-8859-1 characters, and they are all in the range
0x80-0xff, you get a length limit of 127 characters for your file name,
whereas if they are in the range 0x00-0x7f, you get the full 255.
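
A minimal sketch of the kind of per-name check this implies, assuming
the process encoding is ISO-8859-1 and the storage encoding is UTF-8
(one byte for 0x00-0x7f, two bytes for 0x80-0xff). The function names
are hypothetical and the storage limit is passed in by the caller.

```c
#include <stddef.h>

/*
 * Bytes needed to store n ISO-8859-1 characters as UTF-8.
 * Characters below 0x80 take one byte; 0x80-0xff take two.
 */
static size_t
latin1_utf8_len(const unsigned char *s, size_t n)
{
	size_t bytes = 0;
	size_t i;

	for (i = 0; i < n; i++)
		bytes += (s[i] < 0x80) ? 1 : 2;
	return (bytes);
}

/* Hypothetical check: nonzero if the encoded name fits in limit bytes. */
static int
name_fits(const unsigned char *s, size_t n, size_t limit)
{
	return (latin1_utf8_len(s, n) <= limit);
}
```

With a limit of 255, a name of 255 ASCII characters fits, while a name
made only of characters in 0x80-0xff stops fitting at 128 characters.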

Characters outside the 0x00-0xff range of 8859-1 (for instance, all of
the characters in 8859-2 through 8859-9 not intersecting with 8859-1) take
3-5 8-bit characters to encode, depending on their lexical position.

Say "goodbye, fixed field input", say "goodbye, fixed length record storage",
say "hello, record oriented file systems", say "hello, user interface rewrite
for all internationally sold products".


					Regards,
					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.
