From owner-freebsd-hackers  Wed Oct 18 14:27:43 1995
Return-Path: owner-hackers
Received: (from root@localhost)
          by freefall.freebsd.org (8.6.12/8.6.6) id OAA23281
          for hackers-outgoing; Wed, 18 Oct 1995 14:27:43 -0700
Received: from phaeton.artisoft.com (phaeton.Artisoft.COM [198.17.250.211])
          by freefall.freebsd.org (8.6.12/8.6.6) with ESMTP id OAA23275
          for <hackers@freefall.freebsd.org>; Wed, 18 Oct 1995 14:27:39 -0700
Received: (from terry@localhost) by phaeton.artisoft.com (8.6.11/8.6.9) id OAA00987; Wed, 18 Oct 1995 14:21:51 -0700
From: Terry Lambert <terry@lambert.org>
Message-Id: <199510182121.OAA00987@phaeton.artisoft.com>
Subject: Re: xterm dumps core
To: kaleb@x.org (Kaleb S. KEITHLEY)
Date: Wed, 18 Oct 1995 14:21:51 -0700 (MST)
Cc: ache@astral.msk.su, hackers@freefall.freebsd.org
In-Reply-To: <199510181626.MAA29751@exalt.x.org> from "Kaleb S. KEITHLEY" at Oct 18, 95 12:26:07 pm
X-Mailer: ELM [version 2.4 PL24]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Content-Length: 4347      
Sender: owner-hackers@FreeBSD.org
Precedence: bulk

> The ANSI/POSIX/ISO locale model is inadequate for describing things 
> like I/O in a graphical user interface. One of the deficiencies is the 
> inability to describe a set of fonts to use for rendering text in an
> arbitrary locale. Another deficiency is its failure to address input 
> methods, without which keyboard input in Oriental and Arabic languages 
> would be all but impossible.

With the implementation of Unicode as a character set standard, the
real issue has moved to either:

1)	The deficiency in the Unicode standard in the placement of
	"private use" areas such that there is a *very* strong bias
	against fixed cell rendering technologies, like X, that
	use BLIT copies of prerendered characters at the server
	level.

OR

2)	The deficiency in X string drawing with regard to its choice
	of fixed cell as a rendering technology.

The main issue here is whether a single "Unicode font" (in violation of
the wishes of the authors of the Unicode standard, who happen to have
a large economic interest in Adobe PostScript, so their opinions don't
count for much) is possible or not.

For non-ligatured languages (ie: not Arabic, Hebrew, Tamil, Devanagari,
or English script ["Cursive"]), the answer is "yes, it is possible",
as long as we are talking about internationalization (enabling for
soft localization to a single locale) instead of multinationalization
(ability to simultaneously support multiple glyphs for a single
character encoding, ie: the Han Unifications, etc.).
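
To make the distinction concrete, here is a rough sketch, not taken from
any real font system.  The glyph names are made-up placeholders, and
U+76F4 is used only as a commonly cited example of a unified Han
character that is drawn differently in Japanese and Chinese typography.

```python
# Hypothetical glyph tables: the glyph names are placeholders, not real font data.
GLYPHS = {
    "ja": {0x76F4: "glyph-76F4-ja"},
    "zh": {0x76F4: "glyph-76F4-zh"},
}

def render_single_locale(text, locale):
    """Internationalization: one locale selects the glyph for every character."""
    table = GLYPHS[locale]
    return [table[ord(ch)] for ch in text]

def render_multinational(runs):
    """Multinationalization: every run of text carries its own locale tag."""
    glyphs = []
    for locale, text in runs:
        glyphs.extend(render_single_locale(text, locale))
    return glyphs
```

The first function is all a single "Unicode font" per locale has to
support; the second needs out-of-band locale tags, which the character
encoding alone does not carry.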

For ligatured languages, it's possible to either adopt a
locale-recognized block print font (Hebrew has one), or redefine the
areas where the ligatured fonts lie as "private use" areas (in tacit
violation of the standard), and respecify character encodings and
round-trip tables for those languages.
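
As an illustration of what a round-trip table is (a sketch only, using
the existing ISO 8859-8 Hebrew encoding as a stand-in rather than any
respecified private-use layout), the forward and reverse tables can be
built from Python's standard "iso8859_8" codec.  The function name
build_round_trip is my own invention.

```python
def build_round_trip(codec):
    """Build byte -> character and character -> byte tables for a codec."""
    to_unicode = {}
    for byte in range(256):
        try:
            to_unicode[byte] = bytes([byte]).decode(codec)
        except UnicodeDecodeError:
            # Byte value is unassigned in this character set; skip it.
            continue
    from_unicode = {ch: byte for byte, ch in to_unicode.items()}
    return to_unicode, from_unicode

to_unicode, from_unicode = build_round_trip("iso8859_8")

# The round-trip property: every assigned byte survives a trip through Unicode.
round_trips = all(from_unicode[to_unicode[b]] == b for b in to_unicode)
```

A respecified encoding would need the same property to hold for
whatever code points it assigns in the redefined areas.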


Keyboard input methodology is an interpretational issue, and is only
loosely bound to the fact that X (improperly) assigns keycode values
based on internal knowledge of keycap legends.  This is loosely bound
because of the ability to symbolically rebind these values with single
forward table references.
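
A minimal sketch of what I mean by a single forward table reference.
The keycode numbers and the helper names (rebind, lookup) are
illustrative assumptions and not the actual X server data structures;
only the symbol names follow the X keysym naming style.

```python
# Hypothetical keycode -> symbol table; the numbers are illustrative only.
DEFAULT_KEYMAP = {38: "a", 39: "s", 40: "d"}

def rebind(keymap, overrides):
    """Return a new table with some keycodes bound to different symbols."""
    rebound = dict(keymap)
    rebound.update(overrides)
    return rebound

def lookup(keymap, keycode):
    """One forward table reference: keycode in, symbol out."""
    return keymap.get(keycode)

greek_keymap = rebind(DEFAULT_KEYMAP, {38: "Greek_alpha", 39: "Greek_sigma"})
```

The keycap legend never enters into it; swapping the table changes what
the key means without touching the code that does the lookup.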


The "support for locale-based character set designation" argues on the
basis of a choice of a character set that is a subset of Unicode, or
is an artifact of coding technique (ie: "xtamil").

I believe this to be a largely specious argument.

What the ANSI/POSIX/ISO standards *do* lack is the ability to specify
locale-based input methods for distinct sub-character set based locales
as part of the locale information.

This (and runic encoding at all) is why I believe that XPG/4 is itself
bogus, though it is quite arguable that locale specificity of input
is a problem entirely addressable by hardware alone.

Note that input method *could* be specified by locale specific hardware,
as long as one was not interested in multinationalization and/or various
multilingual applications without a single round-trip standard for use
in conversion to/from Unicode.


> If you make changes like this without considering how it might affect
> the things that have dependencies on them, you pretty much get what you 
> deserve. I'm sure you wouldn't make a gratuitous change like moving 
> printf out of libc would you? 

I agree with this summation.  One must consider the ramifications of
changes that will cause unexpected behavior that is not of a documented
type.

> If you're going to change your locale naming convention then you need 
> to document the change where people can find it and preserve the old 
> names (perhaps with symlinks) long enough that people can find either 
> the changes or the documentation and make the changes necessary in
> their software to accomodate your changes.

I don't think anyone has suggested directly modifying locale specification
to anything other than ISO standards.  The X locale alias mechanism is
indeed an artifact of local extensions (ie: AIX "DOSANSI", HP, etc.)
rather than an artifact of deficiencies in the well-defined naming
conventions for locales which are not vendor private.

On the other hand, I have no problem whatsoever orphaning vendor-private
locale naming mechanisms if it buys an additional level of functionality
at no other cost.


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.