From owner-freebsd-hackers Tue Oct 17 11:27:47 1995
Return-Path: owner-hackers
Received: (from root@localhost)
          by freefall.freebsd.org (8.6.12/8.6.6) id LAA02752
          for hackers-outgoing; Tue, 17 Oct 1995 11:27:47 -0700
Received: from phaeton.artisoft.com (phaeton.Artisoft.COM [198.17.250.211])
          by freefall.freebsd.org (8.6.12/8.6.6) with ESMTP id LAA02745
          for ; Tue, 17 Oct 1995 11:27:43 -0700
Received: (from terry@localhost)
          by phaeton.artisoft.com (8.6.11/8.6.9) id LAA27951;
          Tue, 17 Oct 1995 11:21:46 -0700
From: Terry Lambert
Message-Id: <199510171821.LAA27951@phaeton.artisoft.com>
Subject: Re: A couple problems in FreeBSD 2.1.0-950922-SNAP
To: joerg_wunsch@uriah.heep.sax.de
Date: Tue, 17 Oct 1995 11:21:46 -0700 (MST)
Cc: freebsd-hackers@FreeBSD.ORG
In-Reply-To: <199510162245.XAA27289@uriah.heep.sax.de> from "J Wunsch" at Oct 16, 95 11:45:15 pm
X-Mailer: ELM [version 2.4 PL24]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Content-Length: 3497
Sender: owner-hackers@FreeBSD.ORG
Precedence: bulk

> > The problem with KOI-8 is that KOI-8 is a defacto standard, and is not
> > accepted by international standards bodies. Mostly because the most
> > popular BBS software in the area picked it up instead of 8859-9.
>
> The X Consortium finally agreed to accept koi8-r as a valid character
> set/encoding.
>
> :-)

8-(.

> Well, if we would rely on things like ISO, we wouldn't use IP etc.
> and suffer from OSI/X.400 instead...

Or X.500. Wait, that would be a good thing.

I agree that FTAM and all that cruft really, really sucks. I'm only
pointing at ISO so that you Western Europeans (8-)) don't have to live
with a C locale that doesn't include high bits (collation, etc. is your
own problem -- fix the code if you want it).

> > The problem is not in the blank areas of the locale.
> >
> > In point of fact, the ANSI standards for terminal control sequences
> > after ANSI 3.64 leave the codes in columns 0x80 and 0x90 to be used
> > to represent 8 bit command sequence introducers, which are the same
> > as an escape character followed by a character in columns 0x20 or 0x30.
> > Because of this, KOI-8 as a character set is not compatible with post
> > 3.64 ANSI terminal control sequence standardization.
>
> Do you know KOI8-R? It doesn't even touch those areas. This is NOT
> IBM's code page crime. KOI8-R does basically use the same printable
> characters like ISO-8859-*. The most notable difference to the
> ISO-8859-* fonts is that KOI has the upper/lower case reversed for
> some obscure reason.

According to the Taligent published translation tables (yes, I know,
they swear they don't own them) for KOI8<->Unicode round trip
conversion, those code points are allocated but optional. I really
think you are looking at a font that doesn't implement all the code
points in the character set.

> > Really, they should be using the 8859 character set instead of KOI-8,
> > but there is understood to be a large historical investment in the
> > non-standard KOI-8 representation (unfortunately).
>
> You're sounding like the OSI protagonists when they started the German
> educational network project (WiN) here. :-)

I'm unfamiliar with the project, but if we sound alike, then they are
probably right. 8-). Just kidding; OSI protocols rot.

The "historical investment issue" with ISO 2022 and JIS-208/JIS-208,212
is the reason for the Japanese opposition to Unicode. Well, that and
the use of Chinese dictionary sort order for the ideogrammatic
characters in the CJK unification part of the standard, and the fact
that you can't mask out all non-Japanese characters using a bit test.
8-).

The question is whether a "round trip" translation can be done
transparently, and the storage encoding varied for new systems with
little impact. I think the answer is "yes".
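
The round-trip question can be checked mechanically. What follows is a
minimal sketch, not the Taligent tables: it assumes the koi8_r and
iso8859_5 codecs bundled with Python are a fair stand-in, and the helper
name c1_as_escape is hypothetical. It also looks at bytes 0x80-0x9F, the
range 8-bit terminals reserve for C1 controls. One caveat: ECMA-48, as far
as I know, pairs each C1 byte with ESC followed by a byte in columns 4 and
5 (0x40-0x5F), which differs from the columns named in the quoted text.

```python
import unicodedata

# Round trip: every byte value decoded as KOI8-R, then encoded back.
all_bytes = bytes(range(256))
round_trip_ok = all_bytes.decode("koi8_r").encode("koi8_r") == all_bytes

# Bytes 0x80-0x9F are where 8-bit terminals expect C1 control codes.
c1_range = bytes(range(0x80, 0xA0))
koi8_cats = [unicodedata.category(ch) for ch in c1_range.decode("koi8_r")]
iso_cats = [unicodedata.category(ch) for ch in c1_range.decode("iso8859_5")]

# KOI8-R allocates printable characters there; ISO 8859-5 keeps controls.
koi8_uses_c1_for_graphics = all(cat != "Cc" for cat in koi8_cats)
iso_keeps_c1_as_controls = all(cat == "Cc" for cat in iso_cats)


def c1_as_escape(byte: int) -> bytes:
    """Return the 7-bit ESC form of an 8-bit C1 control (hypothetical helper)."""
    if not 0x80 <= byte <= 0x9F:
        raise ValueError("not a C1 control byte")
    return bytes([0x1B, byte - 0x40])


# Varying the storage encoding: Cyrillic text goes KOI8-R -> Unicode ->
# ISO 8859-5 and back again without loss.
sample = "Привет, мир"
stored_koi8 = sample.encode("koi8_r")
stored_iso = stored_koi8.decode("koi8_r").encode("iso8859_5")
restored = stored_iso.decode("iso8859_5").encode("koi8_r")
```

With these codecs the letters survive a change of storage encoding, but
the graphics that KOI8-R places in 0x80-0x9F have no ISO 8859-5
equivalent, so only the trip through Unicode is lossless for the full
set.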

I'd be happy to ditch ASCII ordering for this as well, if it were
necessary. ASCII happens to have columnar separation for case
conversion that is probably wrong, actually, and the "(", "[", "{",
"<", and "`" characters ought to have their corresponding pairings a
shift width apart. Maybe "\" and "/" and "-" and "+" should as well.
Though this might play hell with bit test based command sequence
recognition for 3.64 terminals.

The point is that no one has proposed a better standard to which ASCII
isn't already conformant, for better or for worse.

					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.
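
For reference, the ASCII layout remark in the post can be illustrated
with a short sketch. It assumes that "a shift width" means the 0x20 bit
separating upper and lower case, and that "`" is meant to pair with "'";
both are readings of the post, not something it states. The names
CASE_BIT, toggle_case and distances are hypothetical.

```python
CASE_BIT = 0x20  # upper and lower case letters sit two chart columns apart


def toggle_case(ch: str) -> str:
    """Flip the case of an ASCII letter with a single bit operation."""
    if ch.isascii() and ch.isalpha():
        return chr(ord(ch) ^ CASE_BIT)
    return ch  # non-letters are left alone


# Distance between each opening character and its assumed partner.
pairs = [("(", ")"), ("[", "]"), ("{", "}"), ("<", ">"), ("`", "'")]
distances = {a + b: ord(b) - ord(a) for a, b in pairs}

# None of the pairs are CASE_BIT apart, so no single-bit test relates them.
none_a_shift_apart = all(abs(d) != CASE_BIT for d in distances.values())
```

Letters convert with one XOR, while the bracket pairs sit 1 or 2 code
points apart and the quote pair further still, which is the irregularity
the post complains about.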