From: Motoyuki Konno
Date: Sun, 27 Jun 1999 23:17:19 +0900
To: "Jordan K. Hubbard"
Cc: Motoyuki Konno, Nik Clayton, Jun Kuriyama, doc@FreeBSD.ORG,
    freebsd-translate@ngo.org.uk, jdp@FreeBSD.ORG
Subject: Re: Resolution: FDP reorganisation
Message-Id: <199906271417.XAA06581@rei.snipe.rim.or.jp>
References: <67622.930333696@zippy.cdrom.com>

Hi,

"Jordan K. Hubbard" wrote:
> OK, so the Japanese folks have some sort of auto-conversion.  That
> takes care of strictly the Japanese language, but what about the
> Chinese folks or the others that Nik pointed out?  It seemed to me
> that he was looking for a much wider convention here, not just a
> solution to the ja problem.

If you want to know more about this, please read Ken Lunde's book
"CJKV Information Processing", from O'Reilly.

# CJKV means Chinese, Japanese, Korean & Vietnamese.

--------------------

For General:

  ISO-2022:
    ISO-2022 is a '7-bit encoding method': no byte of the encoded
    text has its 8th bit set.
    So, ISO-2022 encoding is very useful for e-mail and netnews.

  EUC:
    EUC is short for 'Extended UNIX Code'.


Japanese
--------

  character set   : JIS X 0208
  encoding system : JIS, SJIS, EUC-JP

  o JIS    : also known as 'ISO-2022-JP'; used for e-mail and netnews.
             ISO-2022-JP is defined in RFC 1468.
  o SJIS   : short for 'Shift JIS'.  DOS/Windows computers and
             Macintosh use SJIS as internal code.
  o EUC-JP : most UNIX computers use EUC-JP as internal code.

  Conversion between JIS, SJIS and EUC-JP is very easy, because all
  three encode the same character set.


Korean
------

  character set   : KS X 1001
  encoding system : ISO-2022-KR, EUC-KR

  o ISO-2022-KR : defined in RFC 1557.  Similar to ISO-2022-JP for
                  Japanese.
  o EUC-KR      : similar to EUC-JP for Japanese.

  I have heard that many Korean people use EUC-KR for e-mail, not
  ISO-2022-KR.


Chinese Taiwan
--------------

  character set   : CNS 11643 (traditional Chinese characters),
                    also known as 'Big5' (*1).
  encoding system : ISO-2022-CN (*2), EUC-TW, Big5

  o ISO-2022-CN : defined in RFC 1922.
  o EUC-TW      : similar to EUC-JP for Japanese.
  o Big5        : Big5 encoding supports more characters than EUC-TW.

  Ken Lunde says 'It seems a bit silly to compare Big Five and EUC-TW
  encodings because they are so different from one another' in his
  'CJKV' book.


Chinese Mainland
----------------

  character set   : GB 2312 (simplified Chinese characters)
  encoding system : ISO-2022-CN (*2), EUC-CN, GBK

  o ISO-2022-CN : see the section 'Chinese Taiwan'.
  o EUC-CN      : similar to EUC-JP for Japanese.
  o GBK         : Windows computers use GBK as internal code.
                  EUC-CN is a subset of GBK.


*1: To be exact, CNS 11643 is a corrected and supplemented version
    of 'Big5'.
*2: ISO-2022-CN supports both the CNS (Taiwan) and GB (Chinese
    Mainland) character sets.
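To make the points above concrete, here is a minimal sketch in Python. It assumes CPython's standard codecs (`iso2022_jp`, `shift_jis`, `euc_jp`, `gb2312`, `gbk`). The sample strings and the helper names `is_seven_bit` and `convert` are made up for illustration. The conversion goes through Unicode, which is simply the convenient route in Python and not necessarily how any particular conversion tool works.

```python
# Sketch only: codec names are CPython's standard-library spellings.
SAMPLE_JA = "日本語"  # illustrative sample text (an assumption)
SAMPLE_ZH = "中文"    # illustrative sample text (an assumption)


def is_seven_bit(data: bytes) -> bool:
    """Return True if no byte has its 8th (high) bit set."""
    return all(b < 0x80 for b in data)


def convert(data: bytes, src: str, dst: str) -> bytes:
    """Re-encode bytes from encoding `src` to encoding `dst` via Unicode."""
    return data.decode(src).encode(dst)


# The same JIS X 0208 characters in the three Japanese encodings.
jis = SAMPLE_JA.encode("iso2022_jp")  # 'JIS' in the terms used above
sjis = SAMPLE_JA.encode("shift_jis")
euc = SAMPLE_JA.encode("euc_jp")

# Clearing the high bit of each EUC-JP byte gives the JIS X 0208 code
# bytes that ISO-2022-JP carries between its escape sequences.
euc_high_bits_cleared = bytes(b & 0x7F for b in euc)

if __name__ == "__main__":
    for name, data in (("JIS", jis), ("SJIS", sjis), ("EUC-JP", euc)):
        print(f"{name:7} 7-bit={is_seven_bit(data)} bytes={data!r}")

    # Conversion between the three is lossless for this text.
    print(convert(sjis, "shift_jis", "euc_jp") == euc)
    print(convert(euc, "euc_jp", "iso2022_jp") == jis)
    print(euc_high_bits_cleared in jis)

    # GB 2312 text encoded as EUC-CN is byte-for-byte valid GBK.
    print(SAMPLE_ZH.encode("gb2312") == SAMPLE_ZH.encode("gbk"))
```

Only the ISO-2022-JP form passes the 7-bit check, which is why it suits mail and news transports that are not 8-bit clean. The SJIS and EUC-JP forms both use bytes with the high bit set.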
-- 
------------------------------------------------------------------------
Motoyuki Konno            mkonno@res.yamanashi-med.ac.jp  (Univ)
                          motoyuki@snipe.rim.or.jp        (Home)
                          motoyuki@FreeBSD.ORG  (FreeBSD Project)
Yamanashi Medical University
                          http://www.freebsd.org/~motoyuki/ (WWW)