Date: Sat, 16 Dec 2000 03:06:04 -0600
From: "Michael C . Wu" <keichii@iteration.net>
To: doc@freebsd.org, i18n@freebsd.org
Subject: Docbook and CJK languages
Message-ID: <20001216030604.B46336@peorth.iteration.net>
While working on some freebsd-taiwan DocBook documents, we discovered that
DocBook/SGML does not handle 2-byte characters correctly.

For example, I have this line of text ("AA" and "BB" each stand for one
2-byte character):

<PARA>
AABBAABBAABBAABB
</PARA>

When I compile this with text output, the correct way to break it into two
lines would be:

AABBAABBAABB\n
AABB\n

However, sometimes the output looks like this:

AABBAABBAABBA\n
ABB\n

(Note the AA character split in two at the end of the first line.)

This leaves the whole document broken and unreadable, since all subsequent
decoding is off by one byte. The problem can also recur several times
within a single document.

Is there any way to fix this? Is there an SGML tag that I can specify?
Or is this a missing feature of DocBook?

-- 
+------------------------------------------------------------------+
| keichii@peorth.iteration.net         | keichii@bsdconspiracy.net |
| http://peorth.iteration.net/~keichii | Yes, BSD is a conspiracy. |
+------------------------------------------------------------------+
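The difference between the two outputs described above can be sketched in a few lines of Python. This is only an illustration of the failure mode, not the code the DocBook toolchain actually runs. It assumes a Big5-style encoding in which any byte >= 0x80 starts a 2-byte character, and the function names and the 13-byte width are hypothetical, chosen to match the example.

```python
from typing import List


def wrap_bytes_naive(data: bytes, width: int) -> List[bytes]:
    """Cut every `width` bytes, ignoring character boundaries."""
    return [data[i:i + width] for i in range(0, len(data), width)]


def wrap_dbcs(data: bytes, width: int) -> List[bytes]:
    """Cut lines of at most `width` bytes without splitting a character.

    Assumption: a byte >= 0x80 is the lead byte of a 2-byte character,
    and every other byte is a single-byte (ASCII) character.
    """
    lines = []
    line = bytearray()
    i = 0
    while i < len(data):
        step = 2 if data[i] >= 0x80 else 1
        char = data[i:i + step]
        # Start a new line if this whole character would not fit.
        if line and len(line) + len(char) > width:
            lines.append(bytes(line))
            line = bytearray()
        line += char
        i += step
    if line:
        lines.append(bytes(line))
    return lines


# Eight 2-byte characters, i.e. AABBAABBAABBAABB from the example.
data = ("中文" * 4).encode("big5")

naive = wrap_bytes_naive(data, 13)   # first line ends mid-character
aware = wrap_dbcs(data, 13)          # first line stops one byte early

print([len(line) for line in naive])
print([len(line) for line in aware])
print([line.decode("big5") for line in aware])
```

The byte-wise version fills each line to exactly 13 bytes and so cuts through the seventh character. The character-aware version gives up one byte of width to keep every character intact.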