From owner-freebsd-hackers Mon Oct 16 15:38:20 1995
Return-Path: owner-hackers
Received: (from root@localhost)
          by freefall.freebsd.org (8.6.12/8.6.6) id PAA26076
          for hackers-outgoing; Mon, 16 Oct 1995 15:38:20 -0700
Received: from phaeton.artisoft.com (phaeton.Artisoft.COM [198.17.250.211])
          by freefall.freebsd.org (8.6.12/8.6.6) with ESMTP id PAA26069
          for ; Mon, 16 Oct 1995 15:38:15 -0700
Received: (from terry@localhost)
          by phaeton.artisoft.com (8.6.11/8.6.9) id PAA25657;
          Mon, 16 Oct 1995 15:33:04 -0700
From: Terry Lambert
Message-Id: <199510162233.PAA25657@phaeton.artisoft.com>
Subject: Re: A couple problems in FreeBSD 2.1.0-950922-SNAP
To: ache@astral.msk.su (=?KOI8-R?Q?=E1=CE=C4=D2=C5=CA_=FE=C5=D2=CE=CF=D7?=)
Date: Mon, 16 Oct 1995 15:33:03 -0700 (MST)
Cc: kaleb@x.org, hackers@freefall.freebsd.org
In-Reply-To: from "=?KOI8-R?Q?=E1=CE=C4=D2=C5=CA_=FE=C5=D2=CE=CF=D7?=" at Oct 16, 95 07:21:56 am
X-Mailer: ELM [version 2.4 PL24]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Content-Length: 2673
Sender: owner-hackers@FreeBSD.org
Precedence: bulk

> >>A nice suggestion. Too bad it doesn't work. ANSI/POSIX1 say that a
> >>program does the equivalent of setlocale(LC_ALL, "C") on startup. Given
> >>that ls, and I gather everything else, disregard my LANG, LC_ALL, and
> >>LC_CTYPE environment variables, I'm left wondering how it is you think
> >>that using the "proper locale" will help. Are you assuming that I'm
> >>using the undocumented hack of setting the ENABLE_STARTUP_LOCALE
> >>environment variable?
>
> Briefly says, I disagree with default table propogation to 8859-1
> (well, maybe agree with propogation to KOI8-R :-) because:
>
> 1) It violates POSIX default "C" locale description.

But not the ISO/ANSI C description.  I can put in anything for "undefined"
that I damn well please.  Including 8859-1.

> 2) It breaks all >=8bit charsets which names != 8859-1.

This is patently false.
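
As a rough illustration of the classification being argued about here (a
sketch only: the function names are made up for this example and this is
not the actual libc ctype table), the 0x00-0x1f and 0x80-0x9f ranges are
treated as controls and every other byte as printable:

```c
#include <stddef.h>

/*
 * Hypothetical helpers, not the FreeBSD libc implementation.
 * Controls are exactly 0x00-0x1f and 0x80-0x9f, as described in
 * this thread; note that a real table would normally also treat
 * 0x7f (DEL) as a control.
 */
int is_control_byte(unsigned char c)
{
    return c <= 0x1f || (c >= 0x80 && c <= 0x9f);
}

/* Every byte that is not a control is shown as printable. */
int is_printable_byte(unsigned char c)
{
    return !is_control_byte(c);
}

/* Count how many of the 256 byte values are classed as controls. */
size_t count_control_bytes(void)
{
    size_t n = 0;
    int c;

    for (c = 0; c <= 0xff; c++)
        if (is_control_byte((unsigned char)c))
            n++;
    return n;
}
```

Under that assumption 64 of the 256 byte values are controls and the
other 192 pass through as printable, whichever 8859-x font or input
method happens to be interpreting them.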
What results is a predominantly 8-bit clean interface that has 0x00-0x1f
and 0x80-0x9f shown as controls, and everything else shown as printable
characters.

This is valid for all 8859-x display/input systems, since the reused code
points are not transformed by this (8859-x does not encode characters in
those locations).

The only potentially incorrect behaviour is that some blanks are not
interpreted as blanks.  If you want a blank, you shouldn't be using some
wild code point other than 0x20 anyway.  You get what you deserve.

At that point, the resulting behaviour is "undefined".  This is acceptable
according to the ISO interpretation of X3J11 anyway.  You use an undefined
character, you get weird crap on your screen.

The problems you will encounter in this circumstance are all *very*
specific to cases where a single file system is being used by multiple
nationalities of clients.

Since the locale mechanism is an *internationalization* mechanism, not a
*multinationalization* mechanism, this is in fact correct behaviour.  The
difference is that "internationalization" is defined as enabling for
localization to a particular (read as *single*) language.  Thus this
behaviour is acceptable and, in fact, expected.

If you want *multinationalization*, use ISO 10646, code page 0.  That is,
16-bit Unicode.  It won't buy you font selection, but unless you are a
language bigot, that shouldn't matter on file names.  If you care, start
lobbying for allocation of code pages other than 0 in 10646.

Good luck on your lobbying, you'll need it.  I hope you don't mind if I
lobby for RTF encoding for language bigots, since they are in the
minority.  8-).


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.