FreeBSD Mail Archives

Date:      Fri, 12 Jun 1998 13:07:07 +0900
From:      Jun-ichiro itojun Itoh <itojun@iijlab.net>
To:        Terry Lambert <tlambert@primenet.com>
Cc:        hackers@FreeBSD.ORG
Subject:   Re: internationalization 
Message-ID:  <2754.897624427@coconut.itojun.org>
In-Reply-To: tlambert's message of Fri, 12 Jun 1998 03:09:02 GMT. <199806120309.UAA11238@usr09.primenet.com>


>Not assuming that character bitwidth is proportional to storage encoding
>is a bad idea.  It destroys useful information and a number of simplifying
>assumptions which bear on computational complexity.  Being in a multibyte
>locale, this may not be obvious to you, since using a multibyte locale
>destroys the same information.
>The fact that Americans are not in a multibyte locale and can make
>these simplifying assumptions is one of the competitive advantages of
>the American and European software industries.
>I would prefer that the Japanese enjoy the American and European
>competitive advantage, rather than the Americans and Europeans being
>forced to suffer the Japanese disadvantage.

	Even with Unicode support, you can't assume that every wchar_t will
	be rendered at a fixed width.  You'll have to use scrwidth()
	for your curses programming anyway :-)
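
	A minimal sketch of the width problem, in Python for brevity.
	It assumes that characters classed as East Asian Wide or Fullwidth
	take two terminal columns and everything else takes one.  The
	helper name display_columns() is hypothetical, and it is not the
	scrwidth() mentioned above:

```python
import unicodedata


def display_columns(text):
    """Estimate terminal columns needed for text (assumption: W/F = 2 columns)."""
    columns = 0
    for ch in text:
        if unicodedata.combining(ch):
            # Combining marks attach to the previous character.
            continue
        if unicodedata.east_asian_width(ch) in ("W", "F"):
            columns += 2
        else:
            columns += 1
    return columns


ascii_text = "kanji"
cjk_text = "\u6f22\u5b57"  # the word "kanji" written with two Kanji

# Character count and column count agree for ASCII but not for Kanji.
print(len(ascii_text), display_columns(ascii_text))
print(len(cjk_text), display_columns(cjk_text))
```

	The character count and the column count differ for the Kanji
	string, so a fixed-size wchar_t does not give you fixed-width
	screen layout.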

>> 	- there are too many programs that assume the 8th bit of "char" is
>> 	  available as a flag bit
>> 	- and more
>Most newer software is 8-bit clean.  The obvious offenders were SMTP
>and termcap, both of which have since been corrected.  I think this
>assumption is not widespread.

	How about regex?
	Anyway, we have to work toward "wchar_t ready" termcap/regex/curses/
	whatever.
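
	To illustrate why "8-bit clean" is not the same as "wchar_t ready",
	here is a small sketch in Python.  It assumes the text is stored as
	EUC-JP (the message does not name an encoding) and compares a
	byte-oriented match with a character-oriented one:

```python
import re

kanji = "\u6f22"                   # a single Kanji character
encoded = kanji.encode("euc_jp")   # assumption: EUC-JP storage, two bytes

# A byte-oriented matcher sees two units, so "exactly one of anything" fails.
byte_match = re.fullmatch(rb".", encoded, re.DOTALL)

# A character-oriented matcher sees one unit, so the same pattern succeeds.
char_match = re.fullmatch(r".", kanji, re.DOTALL)

# A program that uses the 8th bit as a flag destroys the character entirely:
# both bytes collapse into unrelated ASCII.
stripped = bytes(b & 0x7F for b in encoded)

print(len(encoded), byte_match is None, char_match is not None, stripped)
```

	An 8-bit clean regex passes the bytes through untouched but still
	counts bytes, not characters.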

>> 	I agree that man-months are eaten by Kanji processing in the Japanese
>> 	software industry, but I certainly do not agree that the Japanese
>> 	should have moved to a Kana-only world.  How would you feel if you
>> 	were told to move to a 6-letter (yes, A to F) world just to fit
>> 	letters and digits into 4 bits?
>The point is not a reduction in an alphabetic symbol space, as in
>your A-F example.
>A switch from Kanji to Kana would not damage the ability to represent
>any Japanese words; it's a switch from an ideogrammatic to an
>alphabetic representation.

	bzzzz, you are wrong.  We Japanese can't live without Kanji.
	Kanji is not an extra character set.  Kanji is a mandatory
	character set for us, just like G-Z for you.  Believe me,
	I speak and write Japanese every day :-)

	The number of Kanji used on a daily basis varies from person to person.
	However, it is clear that the current JIS X0208 character set
	(which occupies a 94x94 space and has some 6000 or 7000 Kanji letters)
	is not enough for daily use, as it does not even cover the letters
	used in some people's names.
	We learn several thousand Kanji letters in elementary school.
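
	For scale, a rough sketch of the 94x94 space in Python.  The
	row/cell-to-bytes mapping shown is the EUC-JP one (0xA0 added to
	each), which is an assumption here since the message does not name
	an encoding; the function names are hypothetical:

```python
ROWS = CELLS = 94
GRID_POSITIONS = ROWS * CELLS  # upper bound on characters in the 94x94 space


def kuten_to_euc_jp(row, cell):
    """Map a 1-based (row, cell) position to its two EUC-JP bytes."""
    if not (1 <= row <= ROWS and 1 <= cell <= CELLS):
        raise ValueError("row and cell must each be in 1..94")
    return bytes((0xA0 + row, 0xA0 + cell))


def euc_jp_to_kuten(two_bytes):
    """Inverse mapping: two EUC-JP bytes back to a (row, cell) position."""
    high, low = two_bytes
    return high - 0xA0, low - 0xA0


print(GRID_POSITIONS)
print(kuten_to_euc_jp(20, 33) == "\u6f22".encode("euc_jp"))
```

	The grid has no room to grow: every position outside it needs
	another character set or another encoding scheme.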

>With an ideogrammatic language, the child can not even guess at the
>correct word.  I believe it is common to use Katakana dictionaries to
>look up Kanji for children in primary language education in Japan...

	The above is correct.  Kana has a very tight relationship to its
	sound, which is very nice.

>my copy of "Peach boy Momo" has Katakana superscript for new Kanji
>symbols I wasn't previously exposed to.  8-).

	Oops, you are misunderstanding something...  I believe
	you mean a book by Osamu Hashimoto.  The superscript in that book
	is not the standard reading for those Kanji letters.  The author
	adds some "taste" to his text by having people read Kanji letters
	in a non-standard manner.  It is just like writing "Gimme" for
	"give me".

itojun

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message

Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?2754.897624427>
