From owner-freebsd-hackers Mon Apr 3 21:38:26 2000
Delivered-To: freebsd-hackers@freebsd.org
Received: from mail.bfm.org (mail.bfm.org [216.127.218.26]) by hub.freebsd.org (Postfix) with ESMTP id 7123437B7AA for ; Mon, 3 Apr 2000 21:38:15 -0700 (PDT) (envelope-from adam@whizkidtech.net)
Received: from WhizKid (r31.bfm.org [216.127.220.127]) by mail.bfm.org (Post.Office MTA v3.5.3 release 223 ID# 0-52399U2500L250S0V35) with SMTP id org; Mon, 3 Apr 2000 23:38:53 -0500
Message-Id: <3.0.6.32.20000403233641.008e6590@mail85.pair.com>
X-Sender: whizkid@mail85.pair.com
X-Mailer: QUALCOMM Windows Eudora Light Version 3.0.6 (32)
Date: Mon, 03 Apr 2000 23:36:41 -0500
To: Alex Belits
From: "G. Adam Stanislav"
Subject: Re: Unicode on FreeBSD
Cc: MikeM , freebsd-hackers@FreeBSD.ORG
In-Reply-To:
References: <3.0.6.32.20000403221617.008e2500@mail85.pair.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Sender: owner-freebsd-hackers@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

At 20:59 03-04-2000 -0700, Alex Belits wrote:
> I feel perfectly fine with "multilingual" documents that contain English
>and Russian text without Unicode.

Those are bilingual, not multilingual. I once had to create a document in
English, Slovak, and Sanskrit (using the Devanagari alphabet). There is only
one standard that makes it possible: Unicode. Too bad UTF-8 did not exist
at the time, and I had to use graphics.

>> Everyone who wants to
>> follow a single international standard as opposed to a slew of mutually
>> exclusive local standards. Anyone who thinks globally.
> "Globally" in this case means following self-proclaimed unificators from
>Unicode Consortium.

I don't know what you mean by "unificators." Why self-proclaimed? Those
were people with a need for which they found a solution. The Unicode
Consortium has no power to force Unicode on anyone. It just happens that it
was widely accepted. You're free to create your own system, or ignore it
altogether.
But just because you see no need for Unicode does not mean you should
be upset when people are willing to work on Unicode support in FreeBSD.

>> Anyone who has anything to do with the Internet must deal with UTF-8:
>> "Protocols MUST be able to use the UTF-8 charset, which consists of the ISO
>> 10646 coded character set combined with the UTF-8 character encoding
>> scheme, as defined in [10646] Annex R (published in Amendment 2), for all
>> text."
> This is not approved by ANYONE but a bunch of "unificators". It never
>was widely discussed, and affected people never had a chance to give any
>input. This is the same kind of "standard documents" that ITU issues by
>dozens.

Affected in what way? Many ways of encoding Unicode were proposed,
developed, and used. Most of them are history by now. UTF-8 is the best way
to encode Unicode to this day. Don't like it? Design a better one.

>> >-- I am Russian.
>>
>> So?
>
> So I don't want UTF-8 to be forced on me.

Who's forcing it on you?

> Charset definitions in MIME
>headers exist for a reason. If we want to make something usable we can
>create a format that can encapsulate existing charsets instead of banning
>them altogether and replacing with "unified" stuff where cut(1) and
>dd(1) can produce the output that will be declared "illegal" to be
>processed as text because it can not be a valid UTF-8 sequence.

You are worried about nothing. No one in this discussion has proposed
making anything other than Unicode and UTF-8 "illegal." Supporting Unicode
does not mean stopping support for everything else.

> One of the most basic strengths of Unix is the ease with which text can
>be manipulated, and how "non-text" data can be processed using the same
>tools without any complex "this is text and this is not"
>application-specific procedures.

Nothing complex about it. UTF-8 uses a very simple algorithm, which makes
it easy to distinguish text from non-text.
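
The check can be sketched in a few lines. This is an illustration only, not
code from this thread: the function name is_valid_utf8 is made up for the
example, Python is used for brevity, and the limits follow the later
four-byte definition of UTF-8 (RFC 3629) rather than the longer form that
was current when this message was written. The lead byte of each character
announces how many continuation bytes (10xxxxxx) must follow, so arbitrary
binary data, text in an 8-bit charset such as KOI8-R, or a character cut in
half by a byte-oriented tool will almost always break the pattern.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if data is a well-formed UTF-8 byte sequence.

    Hypothetical helper for illustration; limits assume RFC 3629.
    """
    i, n = 0, len(data)
    while i < n:
        lead = data[i]
        if lead < 0x80:                # 0xxxxxxx: plain ASCII, one byte
            i += 1
            continue
        if 0xC2 <= lead <= 0xDF:       # 110xxxxx: one continuation byte
            need, min_cp, cp = 1, 0x80, lead & 0x1F
        elif 0xE0 <= lead <= 0xEF:     # 1110xxxx: two continuation bytes
            need, min_cp, cp = 2, 0x800, lead & 0x0F
        elif 0xF0 <= lead <= 0xF4:     # 11110xxx: three continuation bytes
            need, min_cp, cp = 3, 0x10000, lead & 0x07
        else:
            # Stray continuation byte, or a lead byte that can only
            # start an overlong or out-of-range sequence.
            return False
        if i + need >= n:              # sequence cut off at end of input
            return False
        for b in data[i + 1:i + 1 + need]:
            if (b & 0xC0) != 0x80:     # every continuation is 10xxxxxx
                return False
            cp = (cp << 6) | (b & 0x3F)
        if cp < min_cp:                # overlong encoding
            return False
        if cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
            return False               # beyond Unicode, or a surrogate
        i += need + 1
    return True


russian = "Привет".encode("utf-8")
print(is_valid_utf8(russian))                     # well-formed text
print(is_valid_utf8(russian[:3]))                 # cut in mid-character
print(is_valid_utf8("Привет".encode("koi8_r")))   # KOI8-R bytes
```

A tool that cares can run such a test and fall back to treating its input as
plain bytes when it fails; a tool that does not care can keep copying bytes
as it always has.
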

>UTF-8 turns "text" into something that
>gives us a dilemma -- to redesign everything to treat "text" as the stream
>of UTF-8 encoded Unicode (and make it impossible to combine text and
>"non-text" without a lot of pain), or to leave tools as they are and deal
>with "invalid" output from perfectly valid operations.

You don't have to treat everything as the stream of UTF-8 encoded Unicode.
Again, supporting Unicode does not mean EVERYTHING must be Unicode. That
would not make sense, at least not now. It may in the future. Unicode is
here to stay.

>In
>Windows/Office/... that lives and feeds on complex and unparceable formats
>this problem couldn't appear or even thought of -- "text" doesn't exist as
>text at all, and the less stuff will look as something that can be usable
>outside of strict "object" environment, the better (they now don't even
>encode it in UTF-8, and use bare 16-bit Unicode). In Unixlike system it's
>a violation of some very basic rules.

What does Windows have to do with Unicode? Windows support for Unicode
sucks royally. Except for NT, Windows' Unicode support is virtually
non-existent. When did it stop Unix programmers from doing something
Microsoft cannot handle? Unix already handles Unicode better than anything
under Windows. For example, Lynx handles Unicode quite well, and it does it
on text-only displays that have no way of supporting a multitude of fonts.

Cheers,
Adam
-----------------------------------------------------------
"I think, therefore I am." - Seventeenth Century Philosophy
"I publish what I think, therefore I have." - Twenty-First Century Action
Details at http://www.OnlinePublisher.net/

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message