From owner-freebsd-hackers Mon Oct 16 17:56:22 1995
Return-Path: owner-hackers
Received: (from root@localhost) by freefall.freebsd.org (8.6.12/8.6.6) id RAA01777 for hackers-outgoing; Mon, 16 Oct 1995 17:56:22 -0700
Received: from phaeton.artisoft.com (phaeton.Artisoft.COM [198.17.250.211]) by freefall.freebsd.org (8.6.12/8.6.6) with ESMTP id RAA01772 for ; Mon, 16 Oct 1995 17:56:19 -0700
Received: (from terry@localhost) by phaeton.artisoft.com (8.6.11/8.6.9) id RAA25909; Mon, 16 Oct 1995 17:51:14 -0700
From: Terry Lambert
Message-Id: <199510170051.RAA25909@phaeton.artisoft.com>
Subject: Re: A couple problems in FreeBSD 2.1.0-950922-SNAP
To: ache@astral.msk.su (=?KOI8-R?Q?=E1=CE=C4=D2=C5=CA_=FE=C5=D2=CE=CF=D7?=)
Date: Mon, 16 Oct 1995 17:51:14 -0700 (MST)
Cc: kaleb@x.org, terry@lambert.org, hackers@freefall.freebsd.org, joerg_wunsch@uriah.heep.sax.de
In-Reply-To: from "=?KOI8-R?Q?=E1=CE=C4=D2=C5=CA_=FE=C5=D2=CE=CF=D7?=" at Oct 17, 95 02:49:45 am
X-Mailer: ELM [version 2.4 PL24]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Content-Length: 2947
Sender: owner-hackers@FreeBSD.org
Precedence: bulk

> >> Along the same line, it'd be really convenient if the default chartype
> >> table had its right side populated for ISO8859-1 so that broken tools
> >> could still manage to do the right thing most of the time.
>
> >My major premise in this whole discussion is that the bogus code in
> >crt0.o is a result of trying to correct the C locale deficiencies
> >without actually correcting the C locale.
>
> >It is a kludge on a bug, not a bugfix.
>
> Lets consider, how your proposed hack differs from mine:
> You plan to force non-locale-aware programs to 8859-1,
> it works right for 8859-1 users as exactly matched case,
> so for such users my hack == your hack, no diffs exists.

For one, my "hack" meets the definition of the ISO ratification of
X3J11 and at the same time conforms to ISO 8859-x character set rules.
It works for all ISO8859-x users, not just ISO8859-1. The difference
is that the character code points are set based on columnar location.
This was, in fact, one of the stated design goals of the 8859-x
standards.

> When users are not 8859-1, your hack != my hack, because
> you load so-called 'default udefined table' and I load
> table which match current charset. Lets consider, what is better
> for user, some 'default undefined hack' or code table
> which match his charset exactly?

First of all, since it conforms to the letter of the standards, the
approach I have suggested is not technically a hack. Your suggested
solution is, in fact, a hack.

Second of all, my solution does not load a locale, so nothing is
"loaded"; instead, a pointer is aggregate-initialized to a static "C"
locale table. This is, in fact, the preferred method of dealing with
this issue *without* calling setlocale in crt0.o or as the first thing
in main(), according to the standards documents.

The one real issue is the collating sequence. This is a non-issue
for "7-bit-ASCII-first" sort orders. They will be correct.

It *IS* an issue for "non-internationalized code pretending to be
internationalized". I have absolutely no sympathy for such code; it
should be fixed.

Your approach shows sympathy for this bogus code, making it work at
the expense of breaking other things.

You have drawn the following figure:

	,------.
	|      |
	|   B  |D
	|      |
	`------+--.
	    C  |A |
	       `--'

And expanded the supported area from {A} to {A,B}, at the expense of
{C,D}.

In point of fact, the figure should be drawn as:

	       ,--.
	       |  |
	    B  |D |
	       |  |
	,------+--+
	|   C  |A |
	`------+--'

And {B}, the set of programs pretending to be internationalized but
lacking an explicit setlocale() call, is what should not be supported.

If you need to make code work that isn't internationalized and you
want a hack, call setlocale(,"") in main() of the desired program.
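
The explicit call suggested in the last paragraph might look roughly
like the sketch below. This is an illustration only, not code from the
FreeBSD tree: the helper names adopt_env_locale() and current_locale()
are hypothetical, and LC_ALL is an assumed category, since the message
leaves the category argument unspecified.

```c
#include <locale.h>

/*
 * Hypothetical helper (name not from the original message): adopt the
 * locale named by the environment (LANG, LC_ALL, ...).  LC_ALL is an
 * assumed category; the message does not name one.  If the environment
 * names a locale the system cannot provide, fall back to "C".
 */
static const char *
adopt_env_locale(void)
{
    const char *name = setlocale(LC_ALL, "");

    if (name == NULL)
        name = setlocale(LC_ALL, "C");   /* the "C" locale always exists */
    return name;
}

/* Hypothetical helper: query the locale in effect without changing it. */
static const char *
current_locale(void)
{
    return setlocale(LC_ALL, NULL);
}
```

A program that wants the user's character set would call
adopt_env_locale() as the first statement of main(); a program that
never makes the call stays in the static "C" locale, which is the
behaviour argued for above.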

					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.