From owner-freebsd-hackers  Mon Oct 16 13:40:40 1995
Return-Path: owner-hackers
Received: (from root@localhost)
          by freefall.freebsd.org (8.6.12/8.6.6) id NAA20158
          for hackers-outgoing; Mon, 16 Oct 1995 13:40:40 -0700
Received: from phaeton.artisoft.com (phaeton.Artisoft.COM [198.17.250.211])
          by freefall.freebsd.org (8.6.12/8.6.6) with ESMTP id NAA20150
          for <hackers@freefall.freebsd.org>; Mon, 16 Oct 1995 13:40:35 -0700
Received: (from terry@localhost) by phaeton.artisoft.com (8.6.11/8.6.9) id NAA25305; Mon, 16 Oct 1995 13:34:10 -0700
From: Terry Lambert <terry@lambert.org>
Message-Id: <199510162034.NAA25305@phaeton.artisoft.com>
Subject: Re: A couple problems in FreeBSD 2.1.0-950922-SNAP
To: joerg_wunsch@uriah.heep.sax.de
Date: Mon, 16 Oct 1995 13:34:10 -0700 (MST)
Cc: kaleb@x.org, hackers@freefall.freebsd.org
In-Reply-To: <199510152303.AAA22305@uriah.heep.sax.de> from "J Wunsch" at Oct 16, 95 00:03:59 am
X-Mailer: ELM [version 2.4 PL24]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Content-Length: 2786      
Sender: owner-hackers@FreeBSD.org
Precedence: bulk

> As Kaleb S. KEITHLEY wrote:
> > 
> > As near as I can tell the SVR4 ls doesn't change its locale, yet still
> > manages to do the right thing, probably because for most SVR4-en the C 
> > locale is full ISO8859-1. This leads me to believe that FreeBSD's ls
> > probably doesn't need to change its locale either if the default chartype
> > table is fully populated.
> 
> So SVR4 would still break on koi8-r, for example.  Either make it
> right, or let it be.

But not on 8859-x.  For Cyrillic alphabets, that's 8859-5.

The problem with KOI-8 is that KOI-8 is a de facto standard, and is not
accepted by international standards bodies.  It became one mostly because
the most popular BBS software in the area picked it up instead of 8859-5.

The problem is not in the blank areas of the locale.

In point of fact, the ANSI standards for terminal control sequences
after ANSI X3.64 reserve the codes in columns 8 and 9 (0x80-0x9F) for
8-bit control codes (the control sequence introducer among them), each
of which is equivalent to an escape character followed by a character
in columns 4 or 5 (0x40-0x5F).  Because of this, KOI-8 as a character
set is not compatible with post-X3.64 ANSI terminal control sequence
standardization.
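
To make the collision concrete, here is a minimal sketch in C of the
8-bit to 7-bit equivalence.  It assumes the usual ECMA-48 style rule
that an 8-bit control code in 0x80-0x9F is the same as ESC followed by
that byte minus 0x40.  The function names are mine, for illustration
only; they are not from any library.

```c
#include <assert.h>
#include <stddef.h>

#define ESC 0x1B

/* Return nonzero if byte lies in the 8-bit control range 0x80..0x9F. */
int is_c1_control(unsigned char byte)
{
    return byte >= 0x80 && byte <= 0x9F;
}

/*
 * Write the 7-bit equivalent of an 8-bit control code into out[0..1]:
 * ESC followed by (byte - 0x40), which lands in 0x40..0x5F.
 * Returns 2 on success, 0 if byte is not in the 8-bit control range.
 */
size_t c1_to_escape(unsigned char byte, unsigned char out[2])
{
    assert(out != NULL);
    if (!is_c1_control(byte))
        return 0;
    out[0] = ESC;
    out[1] = (unsigned char)(byte - 0x40);
    return 2;
}
```

With this, c1_to_escape(0x9B, buf) yields ESC '[', the familiar
control sequence introducer.  Any character set that assigns graphic
characters to bytes in 0x80-0x9F will have those bytes taken as
controls by a terminal that honors the 8-bit forms.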

It is not unreasonable, then, for the code to function correctly in
the 8-bit case for ISO 8859-x, but to need an environment change or
(preferably) a code change in the application before it functions
correctly for KOI-8.

Really, they should be using the 8859 character set instead of KOI-8,
but there is understood to be a large historical investment in the
non-standard KOI-8 representation (unfortunately).

> isctype() is not necessarily related to message catalogs.  The
> extensive (i.e. blatant) use of message catalogs (AIX, IRIX) leads to
> very undesirable results, e.g. SMTP daemons throwing their error
> messages in German. :-(

This is an issue of when the message is translated: a function of the
interface violation of an internationalized system.  Actually, the
error presentation between the daemons should be abstract and token
based, i.e., a number, which your SMTP agent translates into the
correct error.

The problem is that the translation to a human representation should
occur in the locale of the recipient of the human-form message rather
than in the locale of the human who ran the program.
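
A rough sketch in C of what I mean by translating at the receiving
end.  Only the numeric code crosses the wire; the text is looked up
locally.  The catalog, its wording, and the function name are
hypothetical, made up for illustration.  The numbers 550 and 452 are
ordinary SMTP reply codes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One human-readable rendering of a numeric reply code. */
struct reply_text {
    int code;
    const char *lang;
    const char *text;
};

/* Hypothetical local catalog; a real one would live outside the code. */
static const struct reply_text catalog[] = {
    { 550, "en", "Mailbox unavailable" },
    { 550, "de", "Postfach nicht verfuegbar" },
    { 452, "en", "Insufficient system storage" },
    { 452, "de", "Nicht genug Speicherplatz" },
};

/*
 * Look up the text for code in the recipient's language.
 * Falls back to the "en" entry when lang has no entry for the code,
 * and returns NULL when the code is unknown.
 */
const char *reply_text_for(int code, const char *lang)
{
    const char *fallback = NULL;
    size_t i;

    assert(lang != NULL);
    for (i = 0; i < sizeof(catalog) / sizeof(catalog[0]); i++) {
        if (catalog[i].code != code)
            continue;
        if (strcmp(catalog[i].lang, lang) == 0)
            return catalog[i].text;
        if (strcmp(catalog[i].lang, "en") == 0)
            fallback = catalog[i].text;
    }
    return fallback;
}
```

The sender never chooses a language.  A recipient in a German locale
calls reply_text_for(550, "de"), and one in an English locale calls
reply_text_for(550, "en"), both from the same token.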

That is, the error reporting protocol is a half-circuit and the
abstraction is thereby compromised.

The only correct procedure for dealing with this would be to change
the error reporting in the SMTP protocol to encapsulate it as well
as the transactions for which the error is being reported, and then
translate it locally before presentation to the user.


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.