From owner-freebsd-doc Sat Feb 23 16:29:51 2002
Date: Sun, 24 Feb 2002 09:09:26 +0900 (JST)
Message-Id: <20020224.090926.85420473.hrs@eos.ocn.ne.jp>
To: sziszi@bsd.hu
Cc: freebsd-doc@FreeBSD.ORG
Subject: Re: Entities in translations
From: Hiroki Sato
In-Reply-To: <20020223115416.GA1152@fonix.adamsfamily.xx>
References: <20020223115416.GA1152@fonix.adamsfamily.xx>

Szilveszter Adam wrote
  in <20020223115416.GA1152@fonix.adamsfamily.xx>:

sziszi> So now, please advise. Which method should I follow? Should I stick to
sziszi> entities for non-ASCII characters? But then how do I make them display
sziszi> in the HTML rendering in their expanded form instead of the entity
sziszi> itself? Or should I just follow the lead of the non Latin-1 teams and
sziszi> start inputting these characters as-is? What is, for example the Greek
sziszi> Doc Project doing about this?

For localized documents, I do not think it is always necessary to stick to entities for non-ASCII characters.
Almost all Japanese characters are non-ASCII, but we cannot use entities for them because there are too many.

The advantage of entities applies mainly to the original (English) documents. Each language used by the translation teams defines its own character codes for non-ASCII characters, so if non-ASCII characters are included as-is, we may misunderstand which character is meant. In addition, translators often work with localized tools (e.g. editors, web browsers), and such tools often cannot properly handle non-ASCII characters from other languages.

As you pointed out, this is not a problem for translated documents. I think you can use non-ASCII characters as-is to write your language, but I also think that where entities can be used, we should use them, because entities leave no ambiguity about which character is meant in non-English documents. The Japanese team uses entities for Latin characters, even though the Japanese character set has its own set of Latin characters.

Any other comments or suggestions?

--
| Hiroki Sato
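
The ambiguity described above (the same byte meaning different characters under different language-specific encodings, while an entity stays unambiguous) can be sketched in Python. This is only an illustration under stated assumptions: the choice of ISO-8859-2 for Hungarian text and ISO-8859-1 for the reader's tools is assumed, the helper name to_entities is hypothetical, and the entity name odblac is taken from HTML5's named character references, which may not match the entity sets a given documentation toolchain defines.

```python
import html

# Assumption: a Hungarian source file stored as ISO-8859-2 (Latin-2).
# This single byte is the letter "o with double acute" in that encoding.
raw = b"\xf5"

# The same byte decodes to different characters depending on the
# encoding the reader's tools assume.
as_latin2 = raw.decode("iso-8859-2")  # U+0151, o with double acute
as_latin1 = raw.decode("iso-8859-1")  # U+00F5, o with tilde

# An entity names the character itself, independent of file encoding.
from_entity = html.unescape("&odblac;")


def to_entities(text: str) -> str:
    """Replace each non-ASCII character with a numeric character reference.

    Hypothetical helper for illustration; ASCII characters pass through.
    """
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


if __name__ == "__main__":
    print(as_latin2 == as_latin1)      # False: same byte, different characters
    print(from_entity == as_latin2)    # True: the entity is unambiguous
    print(to_entities(as_latin2))      # ASCII-only form of the character
```

Numeric references such as the ones produced by to_entities survive any 8-bit tool chain, at the cost of readability in the source file, which is the trade-off discussed in this thread.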