From owner-freebsd-hackers Wed Apr 22 17:44:07 1998
Return-Path:
Received: (from majordom@localhost) by hub.freebsd.org (8.8.8/8.8.8) id RAA20356 for freebsd-hackers-outgoing; Wed, 22 Apr 1998 17:44:07 -0700 (PDT) (envelope-from owner-freebsd-hackers@FreeBSD.ORG)
Received: from smtp01.primenet.com (smtp01.primenet.com [206.165.6.131]) by hub.freebsd.org (8.8.8/8.8.8) with ESMTP id AAA20258 for ; Thu, 23 Apr 1998 00:43:35 GMT (envelope-from tlambert@usr02.primenet.com)
Received: (from daemon@localhost) by smtp01.primenet.com (8.8.8/8.8.8) id RAA16457; Wed, 22 Apr 1998 17:43:31 -0700 (MST)
Received: from usr02.primenet.com(206.165.6.202) via SMTP by smtp01.primenet.com, id smtpd016425; Wed Apr 22 17:43:25 1998
Received: (from tlambert@localhost) by usr02.primenet.com (8.8.5/8.8.5) id RAA15289; Wed, 22 Apr 1998 17:43:24 -0700 (MST)
From: Terry Lambert
Message-Id: <199804230043.RAA15289@usr02.primenet.com>
Subject: Re: Euro key ?
To: nate@mt.sri.com (Nate Williams)
Date: Thu, 23 Apr 1998 00:43:24 +0000 (GMT)
Cc: freebsd-hackers@FreeBSD.ORG
In-Reply-To: <199804222054.OAA05690@mt.sri.com> from "Nate Williams" at Apr 22, 98 02:54:08 pm
X-Mailer: ELM [version 2.4 PL25]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-hackers@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

> > FreeBSD will have to switch to Unicode sooner or later anyway,
> > I think.
>
> I disagree.  We're trying to use Unicode in a commercial application,
> and we've come to the conclusion that Unicode is *NO* better than
> shifted wide-char support.

??????????????

Multibyte encodings instead of raw wchar_t mean:

o	I can't know how many characters will fit in an N character
	field: it's not sizeof(field)/sizeof(wchar_t)

o	I can't do fixed field input because field lengths are no
	longer fixed.

o	I can't do fixed field storage (all my COBOL programs quit
	working, and no one wrote any replacements for me).

o	I can't use sizeof(file)/sizeof(struct) to get a record count

o	I can't know ahead of time whether or not I have enough disk
	space to store the document I created in memory (oops! --
	better "store" it on the printer!).

o	Input buffer overrun is harder to prevent.

o	I have to translate between storage encoding and program
	internal (wchar_t) encoding.  Consider "cat a b | more".

o	I can't "attribute" a file system into a round trip character
	set at mount time: for example, a legacy file system that
	has not-7-bit data on it already... like ISO 8859-X or KOI-8
	or KOI-8U, or ISO 2022 encoded JIS-208 and JIS-212, which
	are not already multibyte encoded in UTF-8 (or -- gack! --
	UTF-7).

o	Because I can't do that, all my CDROM's are now useless
	unless I twiddle my locale in-and-out, in-and-out.

o	I can't NFS mount a legacy system not in 7 bit US ASCII,
	even if I make a "magic" layer that applies only to NFS.

o	I can't round-trip between 8-bit ??? encoding and 16-bit
	Unicode encoding (ISO 10646 code page 0) and 32-bit ISO 10646
	(of which only code page 0 is likely to be defined for the
	next 10 years, since others only exist as a nod to the
	language bigots) automatically, using page multiplication
	(where an on disk 4k page becomes 2 or 4, respectively, in
	core VM pages).

o	I can't use text data in mmap()'ed files without calling
	translation functions.

o	I can't support VFAT32 Unicode names directly.

o	I can't support LDAP and other ASN.1 encoded raw Unicode
	byte streams directly.

o	I can't support NTFS long file names directly.

o	I can't support NetWare client services directly.

o	I can't support CIFS client services directly.

o	I can't know if the new directory entry will take more room
	than the previous directory entry (in the FS directory block).

Basically, the only reason for them is so that 7 bit ASCII users
(read: English speakers) don't have to modify their legacy code or
US ASCII-centric data so that it will keep working.
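
To make the first few points in the list concrete, here is a minimal C sketch, not anything from a real code base. The names FIELD_CHARS, struct record, FIELD_CAPACITY, utf8_bytes_for, utf8_bytes_needed and record_count are all hypothetical, and the sketch assumes each wchar_t holds one whole code point (true where wchar_t is 32 bits; not true for characters outside the 16-bit range where wchar_t is 16 bits). It shows that a wchar_t field has a capacity you can compute with sizeof, while the multibyte (here UTF-8) size of the same text depends on what the text is.

```c
#include <stddef.h>
#include <wchar.h>

#define FIELD_CHARS 8   /* hypothetical field width, in characters */

/* Hypothetical fixed-width record: each field is FIELD_CHARS wide chars. */
struct record {
    wchar_t name[FIELD_CHARS];
    wchar_t city[FIELD_CHARS];
};

/* Character capacity of a wchar_t field: a compile-time constant. */
#define FIELD_CAPACITY(f) (sizeof(f) / sizeof(wchar_t))

/* Bytes one code point occupies in UTF-8 (code points up to U+10FFFF). */
static size_t utf8_bytes_for(unsigned long cp)
{
    if (cp < 0x80UL)    return 1;
    if (cp < 0x800UL)   return 2;
    if (cp < 0x10000UL) return 3;
    return 4;
}

/*
 * Bytes a NUL-terminated wide string needs when stored as UTF-8,
 * terminator excluded.  Assumes one code point per wchar_t.
 */
static size_t utf8_bytes_needed(const wchar_t *s)
{
    size_t n = 0;

    for (; *s != L'\0'; s++)
        n += utf8_bytes_for((unsigned long)*s);
    return n;
}

/*
 * Record count for a file of file_bytes bytes.  The division is only
 * meaningful because every record has the same size.
 */
static size_t record_count(size_t file_bytes)
{
    return file_bytes / sizeof(struct record);
}
```

With these definitions, FIELD_CAPACITY(r.name) is 8 for any struct record r, whatever the field contains. By contrast, utf8_bytes_needed() gives 4 for the four ASCII characters of L"Euro" but 3 for the single Euro sign (U+20AC), so the number of characters that fit in an N byte multibyte field cannot be known without looking at the data.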

ASCII bigotry: what a stupid excuse for all those limitations and
all that extra processing overhead.


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.