Date: Sat, 16 Dec 2000 03:06:04 -0600
From: "Michael C . Wu" <keichii@iteration.net>
To: doc@freebsd.org, i18n@freebsd.org
Subject: Docbook and CJK languages
Message-ID: <20001216030604.B46336@peorth.iteration.net>

While working on some freebsd-taiwan DocBook documents, we discovered
that DocBook/SGML does not handle 2-byte characters correctly.
For example, I have this line of text ("AA" and "BB" each stand for
one 2-byte character):
<PARA> AABBAABBAABBAABB </PARA>
When I compile this with the output format set to plain text, the
correct behavior would be to cut it into two lines like this:
AABBAABBAABB\n
AABB\n
However, sometimes the output comes out looking like:
AABBAABBAABBA\n
ABB\n
(Note the AA character split between the end of the first line and
the start of the second.)
This leaves the whole document broken and unreadable, since all
subsequent encoding/decoding is off by one byte. The problem can
also recur several times in the same document.
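
To make the failure concrete, here is a minimal Python sketch. It is not
taken from the DocBook toolchain. It assumes the text is Big5-encoded
(the message above does not name the encoding), uses the string "中文"
repeated four times as a stand-in for AABBAABBAABBAABB, and picks a
13-byte line width to match the broken output. The function names
wrap_bytes_naive, wrap_chars and decodes_cleanly are made up for
illustration.

```python
from typing import List

ENCODING = "big5"  # assumption: the original message does not name an encoding


def wrap_bytes_naive(data: bytes, width: int) -> List[bytes]:
    """Cut at fixed byte offsets, ignoring character boundaries."""
    return [data[i:i + width] for i in range(0, len(data), width)]


def wrap_chars(text: str, width: int, encoding: str = ENCODING) -> List[bytes]:
    """Break only between characters; each line is at most `width` bytes."""
    lines: List[bytes] = []
    current = b""
    for ch in text:
        encoded = ch.encode(encoding)
        # Start a new line if this whole character would not fit.
        if current and len(current) + len(encoded) > width:
            lines.append(current)
            current = b""
        current += encoded
    if current:
        lines.append(current)
    return lines


def decodes_cleanly(line: bytes, encoding: str = ENCODING) -> bool:
    """Return True if the line is a valid byte sequence on its own."""
    try:
        line.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


text = "中文" * 4               # 8 characters, 16 bytes in Big5
raw = text.encode(ENCODING)

naive = wrap_bytes_naive(raw, 13)  # splits the 7th character in half
safe = wrap_chars(text, 13)        # 12 bytes, then 4 bytes
```

The naive split puts the first byte of a character at the end of line
one and its second byte at the start of line two, which is the pattern
shown in the broken output above. The character-aware split breaks one
character earlier, so every line decodes on its own.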
Is there any way to fix this? Is there an SGML tag that I can
specify? Or is this a missing feature of DocBook?
--
+------------------------------------------------------------------+
| keichii@peorth.iteration.net | keichii@bsdconspiracy.net |
| http://peorth.iteration.net/~keichii | Yes, BSD is a conspiracy. |
+------------------------------------------------------------------+
