Date: Wed, 18 Sep 1996 02:50:41 -0700 (PDT)
From: asami@freebsd.org (Satoshi Asami)
To: doc@freebsd.org
Subject: more on Japanese handbook
Message-ID: <199609180950.CAA15480@silvia.HIP.Berkeley.EDU>
John,

What's the current status of the merging effort? Please let us know your thoughts on the directory structure.

By the way, there is one more thing you may want to consider re the handbook encoding. The Japanese encoding used in the handbook sources (EUC-JP) is good in that many tools allow the Japanese part of it to pass through untouched, but Netscape (and maybe others) has an annoying tendency to misjudge the language code of some files and think it's Shift-JIS (the brain-damaged code nobody likes, but since NEC decided to use it for their once-popular PC98 series, it's not dying anytime soon).

Also, there is no language information in EUC, so if someone reads the pages using Netscape with the language set to Chinese, it will show something totally incoherent (when it should have ignored it).

The optimal solution (at least viewed from the Japanese side of us) is to convert the file into JIS just before it's written to whatever output files sgmlfmt is creating (or is it instant now? :). This is really quite simple, since all we need to do here is scan for bytes with the eighth bit set and convert each such byte, along with the following bytes that have the eighth bit set, from something like

   1[B1] 1[B2] ... 1[B2N-1] 1[B2N]

to

   Esc '$' 'B' 0[B1] 0[B2] ... 0[B2N-1] 0[B2N] Esc '(' 'B'

(1 means the eighth bit is set, 0 means it is cleared -- [BX] is the lower 7 bits of the X-th byte)

The nkf program (ports/japanese/nkf) is one such filter, but that's gross overkill, as it needs to deal with all three forms of input and output (plus mime and...). Since we only need EUC->JIS conversion, it can be done with a 10-line (or so) C program.

What do you think?

By the way, I'm not sure what the user should set ${LANG} to when there might be both EUC and JIS on the system. Would it suffice to just say "ja_JP"? (I'm asking this mostly to Mr. Hanai, I guess.)

Satoshi
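
The EUC->JIS rule described in the message can be sketched in C roughly as follows. This is only an illustration under stated assumptions, not the program the author had in mind: the function name euc_to_jis and its buffer-based interface are hypothetical, and the input is assumed to contain only ASCII and two-byte JIS X 0208 characters. The EUC-JP single-shift bytes 0x8E and 0x8F (half-width katakana and JIS X 0212) are not treated specially here, so a real filter would probably need extra handling for them.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of sgmlfmt or nkf).
 *
 * Applies the rule from the message: every run of bytes with the eighth
 * bit set is wrapped in ESC $ B ... ESC ( B, and each byte in the run has
 * its eighth bit cleared. Bytes without the eighth bit pass through as-is.
 *
 * Assumption: input is ASCII plus two-byte JIS X 0208 characters only.
 * `out` must have room for 4 * n + 4 bytes. Returns the output length
 * (a terminating NUL is also written but not counted).
 */
size_t euc_to_jis(const unsigned char *in, size_t n, unsigned char *out)
{
    size_t i, o = 0;
    int kanji = 0;                      /* nonzero while inside a run */

    for (i = 0; i < n; i++) {
        int high = (in[i] & 0x80) != 0;

        if (high && !kanji) {           /* entering a run: ESC $ B */
            memcpy(out + o, "\033$B", 3);
            o += 3;
        } else if (!high && kanji) {    /* leaving a run: ESC ( B */
            memcpy(out + o, "\033(B", 3);
            o += 3;
        }
        kanji = high;
        out[o++] = in[i] & 0x7f;        /* keep only the lower 7 bits */
    }
    if (kanji) {                        /* close a run that ends the input */
        memcpy(out + o, "\033(B", 3);
        o += 3;
    }
    out[o] = '\0';
    return o;
}
```

A stand-alone filter along the lines suggested in the message could read its whole input into a buffer, call this function once, and write the result to standard output. Converting the input in one piece sidesteps the case where a run of eighth-bit bytes is split across two reads.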