From owner-freebsd-hackers Tue Oct 17 19:13:19 1995
Return-Path: owner-hackers
Received: (from root@localhost)
    by freefall.freebsd.org (8.6.12/8.6.6) id TAA20469
    for hackers-outgoing; Tue, 17 Oct 1995 19:13:19 -0700
Received: from phaeton.artisoft.com (phaeton.Artisoft.COM [198.17.250.211])
    by freefall.freebsd.org (8.6.12/8.6.6) with ESMTP id TAA20460
    ; Tue, 17 Oct 1995 19:13:08 -0700
Received: (from terry@localhost)
    by phaeton.artisoft.com (8.6.11/8.6.9) id TAA29166;
    Tue, 17 Oct 1995 19:07:29 -0700
From: Terry Lambert
Message-Id: <199510180207.TAA29166@phaeton.artisoft.com>
Subject: Re: Locale stuff: call for conclusion.
To: ache@astral.msk.su (=?KOI8-R?Q?=E1=CE=C4=D2=C5=CA_=FE=C5=D2=CE=CF=D7?=)
Date: Tue, 17 Oct 1995 19:07:29 -0700 (MST)
Cc: terry@lambert.org, core@FreeBSD.ORG, hackers@FreeBSD.ORG,
    phk@critter.tfs.com, wollman@lcs.mit.edu
In-Reply-To: from "=?KOI8-R?Q?=E1=CE=C4=D2=C5=CA_=FE=C5=D2=CE=CF=D7?=" at Oct 18, 95 02:38:56 am
X-Mailer: ELM [version 2.4 PL24]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Content-Length: 4761
Sender: owner-hackers@FreeBSD.ORG
Precedence: bulk

> Well... What is our final decision about XPG4->XPG3 migration?
> Does your words about #define XPG4 means that Garrett
> agrees to #ifdef XPG4 ?

No. He wants localization to take place to XPG/4, if it takes place
at all.

For runic locales under XPG/4, he wants it declared that it "doesn't
work", so the problem will poke its head up the first time someone in
a runic locale tries to run an 8-bit clean piece of code that sets the
locale but is not capable of handling runes.

There are two approaches to deal with this:

1) Call setlocale() in 8-bit clean code and let it break in a runic
   locale.

2) Don't call setlocale() in the code, even if it is 8-bit clean,
   until it is capable of handling runic encoded data.

The consequences are:

1) Breaks runic locales; it is unlikely that someone would put forth
   the effort to do the necessary fixups, since it will require the
   engineer doing the work to do so in a non-native locale (probably
   English) until they get all of the applications running.

2) Breaks currently running code (with the crt0.o hack and with
   setlocale() calls in 8-bit clean but runic incapable programs) in
   the runic and non-runic environments until a full localization.
   It is unlikely that someone would put forth the effort to do the
   necessary fixups, since it will require the engineer doing the
   work to do a full XPG/4 localization of the code, instead of the
   easier but runic incapable XPG/3 and 8-bit clean code changes.

Both these approaches dissuade use until everything is fixed at once;
the first in runic locales only, the second in all locales other than
ISO 8859-1.

A third alternative would be:

3) Leave both XPG/3 and XPG/4 setlocale() calls out of code until it
   is fully internationalized via XPG/4 (like option 2), but make the
   C locale ASCII only.

The consequences are:

3) Restores everything to what it was before your crt0.o hack, but
   all of the world, other than the US, is inconvenienced. This both
   encourages internationalization efforts that will then benefit all
   locales, and encourages the use of Linux instead of BSD by people
   who aren't up to full XPG/4 type internationalization.

What we have now is "almost 1", where attempts to use locales
introduce state because of the crt0.o hack, and cause things (like
xterm) to core their brains out. This is Garrett's approach, assuming
he wants the crt0.o hack removed.

My suggestion is:

4) Use #ifdef'ed code to make a separate XPG/4 library from the same
   code that results in an XPG/3 library integrated with the C
   library.

   Use object module dependencies in the XPG/4 version of the library
   to cause it to be completely dragged in, in place of the libc
   XPG/3 code, if it is linked before the libc which contains the
   XPG/3 code.

   I believe there is a shared library problem that this will cause
   to show up: shared library symbols will override non-shared
   library symbols. This has to be fixed.

   Add XPG/3 localization to all 8-bit clean code. Leave all 8-bit
   unclean code unlocalized.

   Use ISO 8859-1 as the C locale.

   Add a directive to the default .mk files to cause the use of XPG/4
   instead of XPG/3 when the directive is specified.

The consequences are:

4) -The XPG/3 and XPG/4 code is maintained in parallel.
   -Both XPG/3 and XPG/4 localization mechanisms can exist at the
    same time.
   -The linker bug with shared library symbols gets fixed.
   -The internationalization interface is better abstracted.
   -The benefits of the crt0.o hack are not lost for 8-bit clean
    code, but there is incentive to XPG/3 localize and 8-bit clean
    the code where the benefits are not lost.
   -The crt0.o hack will go away.
   -Unclean code will not bogusly call setlocale() and pretend to
    work, only to fail unexpectedly (as the crt0.o hack causes now).
   -The default C locale will be sufficient for most 8859-x use,
    except for those locale-specific features that locale unaware
    code probably wouldn't be using in the first place.
   -The directive will flag code which has been internationalized,
    as well as simplifying the code that has not.

Of course, my suggestion does not address:

o  Someone actually fixing the linker bug (for all I know, it could
   be fixed already).
o  The other actual interface issues for allowing XPG/4 to be
   optioned in, in place of XPG/3.
o  Someone actually adding the directive to the .mk files to make it
   a "toggle switch" that can be thrown.
o  Most of the incentives for the unaffected locales to do
   localization are lost.

I'm not a big fan of negative reinforcement anyway... I think all it
does is promote Linux when we negatively reinforce and they do not.

                                        Terry Lambert
                                        terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.