Date: Sat, 19 May 2001 07:38:21 +0900 (JST)
From: Noriyuki Soda <soda@sra.co.jp>
To: ache@nagual.pp.ru, i18n@freebsd.org, audit@freebsd.org
Cc: bsd-locale@hauN.org
Subject: Re: CFR: ISO_* -> ISO-* locale renaming
Message-ID: <200105182238.HAA29872@srapc342.sra.co.jp>
In-Reply-To: <20010519050946U.tshiozak@din.or.jp>
References: <20010518203702.B79058@nagual.pp.ru> <20010519050946U.tshiozak@din.or.jp>

Andrey A. Chernov <ache@nagual.pp.ru> wrote:
> In the spirit of GNU locale (which use IANA charsets too) I plan to rename
> our ISO_* locales to ISO-* ones, because ISO-* is preferred name according
> to http://www.iana.org/assignments/character-sets and we have only one
> locale name, it should be preferred. GNU locale use preferred MIME names
> in the first place too.

I find this highly questionable. I still think that using the X11
codeset names is better than using the IANA registry, because of the
following problems, which I put as six questions.

1. As I already wrote, Solaris, Tru64 and IRIX use "ISO8859-1", and
   X11's primary name for the Latin-1 codeset is also "ISO8859-1".
   I prefer a name that is compatible with Solaris, Tru64, IRIX and
   the X Window System over a name that is compatible only with Linux.

   Note that Linux also supports "ISO8859-1" as the codeset suffix of
   a locale name. So if we use "ISO8859-1", we remain compatible with
   Linux as well as with Solaris, Tru64 and IRIX. If we use
   "ISO-8859-1", we are compatible only with Linux, and we become
   incompatible with Solaris, Tru64 and IRIX.

   Why do you think it is better to be compatible only with Linux?
   To me, "ISO8859-1" is clearly the better choice.

2. What codeset name will you use for codesets that are available on
   the X Window System but are not defined in the IANA registry?
   (Yes, such a codeset already exists among the locales supported
   by X11.)

   If we follow the convention of the X Window System, this problem
   never arises. Note that nl_langinfo(CODESET) in glibc-2 returns a
   *WRONG* result for such a locale.

3. The IANA registry (MIME charset names) is case insensitive.
   Will you support a case-insensitive codeset suffix in locale names?

   Yes, the codeset suffix in glibc is case insensitive, although the
   language part and the territory part of a locale name are case
   sensitive. That is, ja_JP.EUC-JP, ja_JP.eUc-jP and ja_JP.EuC-Jp are
   all correct locale names on glibc, although ja_jp.EUC-JP is
   incorrect.

   If we use the IANA registry for codeset names, we should support a
   case-insensitive codeset suffix as above. Will you really support
   this?

4. The IANA registry (MIME charset names) has many name variants for
   one codeset. For example, "Extended_UNIX_Code_Packed_Format_for_Japanese"
   and "csEUCPkdFmtJapanese" name the same codeset as "EUC-JP".
   Will you support all of the variants in locale names?

   Yes, glibc supports all of the variants. For example, the following
   are all valid locale names in glibc:

       "ja_JP.Extended_UNIX_Code_Packed_Format_for_Japanese"
       "ja_JP.csEUCPkdFmtJapanese"
       "ja_JP.EUC-JP"
       "ja_JP.eucJP"

   (Note that because MIME charset names are case insensitive, names
   that differ only in upper case versus lower case are also valid;
   e.g. "ja_JP.eXTENDED_unix_cODE_pACKED_fORMAT_FOR_jAPANESE" is
   valid, too.)

   Will you really support this?

5. Why do you think that it is better *NOT* to follow the OpenGroup
   standard? At least "eucJP" and "SJIS" seem to be OpenGroup
   standard names, as I already said, and those names are not
   compatible with the IANA registry.

6. Do you really think that the following name should be usable as a
   locale name?

       "ja_JP.Extended_UNIX_Code_Packed_Format_for_Japanese"

   (I don't think so.)

--
soda

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-i18n" in the body of the message
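
To make the cost raised in questions 3 and 4 concrete, here is a minimal
sketch of what a locale-name matcher would have to do if the codeset suffix
followed IANA rules: compare the suffix case-insensitively and map every
registered alias to one name, while keeping the language_TERRITORY part
case sensitive. This is not glibc's or FreeBSD's actual code. The alias
table, the choice of "eucJP" and "ISO8859-1" as canonical names, and the
function names canonical_codeset and same_locale are all assumptions made
for illustration; the table lists only spellings quoted in the message.

```python
# Hypothetical alias table: keys are lower-cased spellings, values are the
# canonical name this sketch happens to pick. Only names mentioned in the
# message are listed; a real table would be far larger.
_ALIASES = {
    "eucjp": "eucJP",
    "euc-jp": "eucJP",
    "cseucpkdfmtjapanese": "eucJP",
    "extended_unix_code_packed_format_for_japanese": "eucJP",
    "iso8859-1": "ISO8859-1",
    "iso-8859-1": "ISO8859-1",
}


def canonical_codeset(codeset):
    """Return the canonical codeset name, or None if the spelling is unknown."""
    # Lower-casing the key is what makes the lookup case insensitive.
    return _ALIASES.get(codeset.lower())


def same_locale(a, b):
    """Return True if two locale names refer to the same locale.

    The language_TERRITORY part is compared case-sensitively; the codeset
    suffix is compared through the alias table, ignoring case.
    """
    base_a, _, codeset_a = a.partition(".")
    base_b, _, codeset_b = b.partition(".")
    if base_a != base_b:
        return False
    canon_a = canonical_codeset(codeset_a)
    return canon_a is not None and canon_a == canonical_codeset(codeset_b)
```

With this table, same_locale("ja_JP.EUC-JP", "ja_JP.csEUCPkdFmtJapanese")
holds, while same_locale("ja_jp.EUC-JP", "ja_JP.EUC-JP") does not, because
only the codeset suffix is folded. Every alias needs its own table entry,
which is the maintenance burden question 4 points at.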