From owner-freebsd-doc Tue Sep 3 20:37:05 1996
Return-Path: owner-doc
Received: (from root@localhost) by freefall.freebsd.org (8.7.5/8.7.3) id UAA18539 for doc-outgoing; Tue, 3 Sep 1996 20:37:05 -0700 (PDT)
Received: from tokyonet-entrance.astec.co.jp (tokyonet-entrance.astec.co.jp [202.239.16.2]) by freefall.freebsd.org (8.7.5/8.7.3) with SMTP id UAA18534; Tue, 3 Sep 1996 20:36:58 -0700 (PDT)
Received: from amont.astec.co.jp (amont.astec.co.jp [172.20.10.1]) by tokyonet-entrance.astec.co.jp (8.6.12+2.5Wb7/3.4Wbeta5-astecMX2.3) with ESMTP id MAA24443; Wed, 4 Sep 1996 12:36:50 +0900
Received: from adjanta.astec.co.jp (adjanta [172.20.12.5]) by amont.astec.co.jp (8.6.9+2.4W/3.4Wbeta5-astecNoMX2.3) with SMTP id MAA12290; Wed, 4 Sep 1996 12:36:50 +0900
Received: by adjanta.astec.co.jp (4.1/astec-1.2) id AA07720; Wed, 4 Sep 96 12:36:49 JST
Message-Id: <9609040336.AA07720@adjanta.astec.co.jp>
To: asami@freebsd.org
Cc: jfieber@indiana.edu, jkh@time.cdrom.com, doc@freebsd.org
Subject: Re: Warning: SGML doc changes
In-Reply-To: Your message of "Tue, 3 Sep 1996 20:00:53 -0700 (PDT)"
References: <199609040300.UAA10130@silvia.HIP.Berkeley.EDU>
X-Mailer: Mew version 1.05+ on Emacs 19.28.1, Mule 2.3
Mime-Version: 1.0
Content-Type: Text/Plain; charset=us-ascii
Date: Wed, 04 Sep 1996 12:36:48 +0900
From: Hanai Hiroyuki
Sender: owner-doc@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

I've also checked John's Web page for the Japanese version of the
Handbook, and I think there are no problems with the new tools.

> I'm afraid this could be quite confusing, 'cause Chinese and Korean
> can also be encoded in EUC and there is nothing in there to
> distinguish. The only way to mix multiple multi-byte languages is to

Yes, I'm afraid of that too.

> use a stateful encoding (JIS for Japanese), but then we'll have a much
> larger task of fixing tools to handle these.

Yes, that would be too hard.

Another point is the SGML declaration.
When we write documents (SGML instances) in EUC, the BASESET and
DESCSET sections for the EUC part of the SGML declaration should look
like this:

    BASESET "ISO Registration Number 87//CHARSET
             JIS X 0208 Japanese Character Set//ESC 2/6 4/0 ESC 2/4 2/9 4/2"
    DESCSET 128 127 128
            255   1 UNUSED

In practice, SGML parsers such as sgmls and nsgmls can handle Japanese
characters correctly without the above part in the SGML declaration,
as long as 8-bit codes are not forbidden. So this is not an actual
problem but a political problem ;->

-----
H.Hanai hanai@jp.freebsd.org
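
P.S. As a rough illustration of the two points above, here is a small
Python sketch. It is not part of the SGML tools; the sample text and
the helper names (fits_descset, decode_under) are made up for this
example. It checks that the 8-bit bytes of EUC-JP text stay inside the
range 128..254 described by "DESCSET 128 127 128", and it decodes the
same bytes with several EUC-style codecs to show that nothing in the
byte stream itself says which language is meant.

```python
# Illustrative sketch only; the sample text and helper names are assumptions.

SAMPLE = "\u65e5\u672c\u8a9e"  # arbitrary Japanese sample text ("nihongo")

# "DESCSET 128 127 128" describes 127 code positions starting at 128,
# i.e. 128..254; "255 1 UNUSED" leaves position 255 unassigned.
DESCSET_START = 128
DESCSET_COUNT = 127
DESCRIBED = range(DESCSET_START, DESCSET_START + DESCSET_COUNT)


def fits_descset(data: bytes) -> bool:
    """Return True if every byte with the high bit set lies in 128..254."""
    return all(b in DESCRIBED for b in data if b >= 0x80)


def decode_under(data: bytes, codecs=("euc_jp", "euc_kr", "gb2312")) -> dict:
    """Decode the same bytes under several EUC-style codecs.

    A codec that rejects the bytes is recorded as None.
    """
    results = {}
    for name in codecs:
        try:
            results[name] = data.decode(name)
        except UnicodeDecodeError:
            results[name] = None
    return results


euc_bytes = SAMPLE.encode("euc_jp")
readings = decode_under(euc_bytes)

if __name__ == "__main__":
    print(euc_bytes.hex())
    print(fits_descset(euc_bytes))
    for name, text in readings.items():
        print(name, repr(text))
```

Only the euc_jp reading gives back the original text. The other codecs
may accept the same bytes but yield unrelated characters, which is the
ambiguity discussed above.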