From owner-freebsd-hackers Thu Jun 11 21:09:21 1998
Return-Path:
Received: (from majordom@localhost)
	by hub.freebsd.org (8.8.8/8.8.8) id VAA23932
	for freebsd-hackers-outgoing; Thu, 11 Jun 1998 21:09:21 -0700 (PDT)
	(envelope-from owner-freebsd-hackers@FreeBSD.ORG)
Received: from coconut.itojun.org (root@coconut.itojun.org [210.160.95.97])
	by hub.freebsd.org (8.8.8/8.8.8) with ESMTP id VAA23882
	for ; Thu, 11 Jun 1998 21:08:47 -0700 (PDT)
	(envelope-from itojun@itojun.org)
Received: from localhost (itojun@localhost.itojun.org [127.0.0.1])
	by coconut.itojun.org (8.8.8+3.0Wbeta12/3.6W) with ESMTP id NAA02758;
	Fri, 12 Jun 1998 13:07:07 +0900 (JST)
To: Terry Lambert
cc: hackers@FreeBSD.ORG
In-reply-to: tlambert's message of Fri, 12 Jun 1998 03:09:02 GMT.
	<199806120309.UAA11238@usr09.primenet.com>
X-Template-Reply-To: itojun@itojun.org
X-Template-Return-Receipt-To: itojun@itojun.org
X-PGP-Fingerprint: F8 24 B4 2C 8C 98 57 FD 90 5F B4 60 79 54 16 E2
Subject: Re: internationalization
From: Jun-ichiro itojun Itoh
Date: Fri, 12 Jun 1998 13:07:07 +0900
Message-ID: <2754.897624427@coconut.itojun.org>
Sender: owner-freebsd-hackers@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

>Not assuming that character bitwidth is proportional to storage encoding
>is a bad idea.  It destroys useful information and a number of simplifying
>assumptions which bear on computational complexity.  Being in a multibyte
>locale, this may not be obvious to you, since using a multibyte locale
>destroys the same information.

>The fact that Americans are not in a multibyte locale and can make
>these simplifying assumptions is one of the competitive advantages of
>the American and European software industries.

>I would prefer that the Japanese enjoy the American and European
>competitive advantage, rather than the Americans and Europeans being
>forced to suffer the Japanese disadvantage.

Just by supporting Unicode, you can't assume that every wchar_t will
be rendered into a fixed-width font.
You'll have to use scrwidth() for your curses programming, anyway :-)

>> - there are too many programs that assume 8th bit of "char" is
>>   available as flag bit
>> - and more
>Most newer software is 8-bit clean.  The obvious offenders were SMTP
>and termcap, both of which have since been corrected.  I think this
>assumption is not widespread.

How about regex?  Anyway, we have to work toward "wchar_t ready"
termcap/regex/curses/whatever.

>> I agree that the man-months are eaten for Kanji processing in Japanese
>> software industry, but I certainly do not agree that Japanese should
>> have been moved to Kana-only world.  What would you think if you were
>> told to move to 6-letter (yea, A to F) world just to fit letters
>> and digits into 4bits?
>The point is not a reduction in an alphabetic symbol space, as in
>your A-F example.

>A switch from Kanji to Kana would not damage the ability to represent
>any Japanese words; it's a switch from an ideogrammatic to an
>alphabetic representation.

bzzzz, you are wrong.  We Japanese can't live without Kanji.
Kanji is not an extra character set.  Kanji is a mandatory character
set for us, just like G-Z for you.
Believe me, I speak and write Japanese every day :-)

The number of Kanji used on a daily basis varies from person to person.
However, it is clear that the current JIS X0208 character set (occupies
a 94x94 space, has some 6000 or 7000 Kanji letters) is not enough for
daily use, as it does not even cover the letters used in some people's
names.
We learn several thousand Kanji letters in elementary school.

>With an ideogrammatic language, the child can not even guess at the
>correct word.  I believe it is common to use Katakana dictionaries to
>look up Kanji for children in primary language education in Japan...

The above is correct.  Kana has a very tight relationship to its sound,
and that is very nice.

>my copy of "Peach boy Momo" has Katakana superscript for new Kanji
>symbols I wasn't previously exposed to.  8-).

Oops, you are misunderstanding something...  I believe you mean
a book by Osamu Hashimoto.  The superscript in that book is not the
standard sound for those Kanji letters.  The author adds some "taste"
to his text by letting people read Kanji letters in a non-standard
manner.  It is just like writing "Gimme" for "give me".

itojun

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message
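
A rough illustration of the 8th-bit point quoted above, as a minimal
Python sketch.  It assumes EUC-JP as the encoding for JIS X0208 (the
thread does not name an encoding), where a position in the 94x94 table
is stored as two bytes with 0xA0 added to the row and cell numbers.
The helper names eucjp_bytes and strip_flag_bit are hypothetical,
chosen only for this example.

```python
# Sketch only: the helper names are hypothetical, and EUC-JP is an
# assumed encoding; neither comes from the thread itself.

ROWS = CELLS = 94  # JIS X0208 is laid out as a 94x94 table


def eucjp_bytes(row: int, cell: int) -> bytes:
    """Encode a JIS X0208 (row, cell) position, each in 1..94, as EUC-JP."""
    if not (1 <= row <= ROWS and 1 <= cell <= CELLS):
        raise ValueError("row and cell must be in 1..94")
    # EUC-JP adds 0xA0 to each index, so both bytes have the 8th bit set.
    return bytes([0xA0 + row, 0xA0 + cell])


def strip_flag_bit(data: bytes) -> bytes:
    """Mimic a program that treats the 8th bit of every char as a flag."""
    return bytes(b & 0x7F for b in data)


if __name__ == "__main__":
    kanji = eucjp_bytes(20, 33)
    print(ROWS * CELLS)                    # positions available in the table
    print(kanji.hex())                     # the two EUC-JP bytes
    print(ascii(kanji.decode("euc_jp")))   # the character, shown as an escape
    print(strip_flag_bit(kanji))           # what survives after masking
```

Masking the 8th bit turns the two bytes into plain ASCII, so a program
that borrows that bit as a flag silently corrupts every Kanji that
passes through it.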