Date: Thu, 13 Jul 2006 10:51:04 -0400 (EDT)
From: "J.R. Oldroyd" <fbsd@opal.com>
To: FreeBSD-gnats-submit@FreeBSD.org
Subject: misc/100212: UTF-8 zero-width character patch
Message-ID: <200607131451.k6DEp4Gq093701@linwhf.opal.com>
Resent-Message-ID: <200607131500.k6DF0UAb084838@freefall.freebsd.org>
>Number:         100212
>Category:       misc
>Synopsis:       UTF-8 zero-width character patch
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          open
>Quarter:
>Keywords:
>Date-Required:
>Class:          change-request
>Submitter-Id:   current-users
>Arrival-Date:   Thu Jul 13 15:00:30 GMT 2006
>Closed-Date:
>Last-Modified:
>Originator:     J.R. Oldroyd
>Release:        FreeBSD 6.1-STABLE i386
>Organization:
>Environment:
System: FreeBSD linwhf.opal.com 6.1-STABLE FreeBSD 6.1-STABLE #1: Thu May 18 16:03:24 EDT 2006 xxx@linwhf.opal.com:/usr/obj/usr/src/sys/LINWHF i386

>Description:
This patch makes the so-called zero-width, non-spacing, or overstriking
characters of the UTF-8 encoding exactly that. At the present time, these
characters are coded with a width of 1 which is wrong. They should have a
width of 0.

>How-To-Repeat:
Save this file:
	http://opal.com/freebsd/unicode/utf8demo.txt

On an xterm, cat the file and examine the "Combining characters" and the
"Thai (UCS Level 2)" sections. Without the patch, the non-spacing
characters do not overstrike the previous character. With the patch,
they do.

This patch has been posted to -current and downloaded and reviewed many
times following that posting:
	http://lists.freebsd.org/pipermail/freebsd-current/2006-June/064218.html

>Fix:
--- /usr/src/share/mklocale/UTF-8.src.orig	Sat Mar 27 03:14:14 2004
+++ /usr/src/share/mklocale/UTF-8.src	Mon Jun 26 23:15:34 2006
@@ -487,9 +487,9 @@
  * U+0300 - U+036F : Combining Diacritical Marks
  */
 
-GRAPH 0x0300 - 0x034f 0x0360 - 0x036f
-PRINT 0x0300 - 0x034f 0x0360 - 0x036f
-SWIDTH1 0x0300 - 0x034f 0x0360 - 0x036f
+GRAPH 0x0300 - 0x036f
+PRINT 0x0300 - 0x036f
+SWIDTH0 0x0300 - 0x036f
 
 MAPUPPER < 0x0345 0x0399 >
 
@@ -593,7 +593,8 @@
 UPPER 0x04e2 0x04e4 0x04e6 0x04e8 0x04ea 0x04ec 0x04ee
 UPPER 0x04f0 0x04f2 0x04f4 0x04f8
 PRINT 0x0400 - 0x0486 0x0488 - 0x04ce 0x04d0 - 0x04f5 0x04f8 0x04f9
-SWIDTH1 0x0400 - 0x0486 0x0488 - 0x04ce 0x04d0 - 0x04f5 0x04f8 0x04f9
+SWIDTH1 0x0400 - 0x0482 0x048a - 0x04ce 0x04d0 - 0x04f5 0x04f8 0x04f9
+SWIDTH0 0x0483 - 0x0486 0x0488 - 0x0489
 
 MAPUPPER < 0x0430 - 0x044f : 0x0410 >
 MAPUPPER < 0x0450 - 0x045f : 0x0400 >
@@ -1016,7 +1017,8 @@
 GRAPH 0x0e01 - 0x0e3a 0x0e3f - 0x0e5b
 PUNCT 0x0e3f 0x0e4f 0x0e5a 0x0e5b
 PRINT 0x0e01 - 0x0e3a 0x0e3f - 0x0e5b
-SWIDTH1 0x0e01 - 0x0e3a 0x0e3f - 0x0e5b
+SWIDTH0 0x0e31 0x0e34 - 0x0e3a 0x0e47 - 0x0e4e
+SWIDTH1 0x0e01 - 0x0e30 0x0e32 - 0x0e33 0x0e3f - 0x0e46 0x0e4f - 0x0e5b
 
 
 /*
@@ -1647,9 +1649,9 @@
  * U+20D0 - U+20FF : Combining Diacritical Marks for Symbols
  */
 
-GRAPH 0x20d0 - 0x20ea
-PRINT 0x20d0 - 0x20ea
-SWIDTH1 0x20d0 - 0x20ea
+GRAPH 0x20d0 - 0x20ff
+PRINT 0x20d0 - 0x20ff
+SWIDTH0 0x20d0 - 0x20ff
 
 
 /*
@@ -1927,7 +1929,8 @@
 PUNCT 0x309b 0x309c
 PRINT 0x3041 - 0x3096 0x3099 - 0x309f
 PHONOGRAM 0x3041 - 0x3096 0x309f
-SWIDTH2 0x3041 - 0x3096 0x3099 - 0x309f
+SWIDTH2 0x3041 - 0x3096 0x309b - 0x309f
+SWIDTH0 0x3099 - 0x309a
 
 
 /*
@@ -2149,9 +2152,9 @@
  * U+FE20 - U+FE2F : Combining Half Marks
  */
 
-GRAPH 0xfe20 - 0xfe23
-PRINT 0xfe20 - 0xfe23
-SWIDTH1 0xfe20 - 0xfe23
+GRAPH 0xfe20 - 0xfe2f
+PRINT 0xfe20 - 0xfe2f
+SWIDTH0 0xfe20 - 0xfe2f
 
 
 /*
@@ -2272,7 +2275,8 @@
 PUNCT 0x1d100 - 0x1d126 0x1d12a - 0x1d164 0x1d16a - 0x1d16c
 PUNCT 0x1d183 0x1d184 0x1d18c - 0x1d1a9 0x1d1ae - 0x1d1dd
 PRINT 0x1d100 - 0x1d126 0x1d12a - 0x1d172 0x1d17b - 0x1d1dd
-SWIDTH1 0x1d100 - 0x1d126 0x1d12a - 0x1d172 0x1d17b - 0x1d1dd
+SWIDTH1 0x1d100 - 0x1d126 0x1d12a - 0x1d164 0x1d16a - 0x1d172 0x1d183 0x1d184 0x1d18c - 0x1d1a9 0x1d1ae - 0x1d1dd
+SWIDTH0 0x1d165 - 0x1d169 0x1d17b - 0x1d182 0x1d185 - 0x1d18b 0x1d1aa - 0x1d1ad
 
 
 /*
>Release-Note:
>Audit-Trail:
>Unformatted:
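
The effect of the SWIDTH0 entries can be illustrated without rebuilding
the locale. The following is a minimal Python sketch and is not part of
the submitted patch. It copies the SWIDTH0 ranges from the diff above into
a table and counts display columns for a string. The names
ZERO_WIDTH_RANGES, char_width and display_width are hypothetical, and every
character outside the table is assumed to occupy one column, which ignores
the SWIDTH2 (double-width) entries in UTF-8.src.

```python
# Illustrative sketch only; not part of the submitted patch.
# Inclusive code point ranges that the patch assigns to SWIDTH0.
ZERO_WIDTH_RANGES = [
    (0x0300, 0x036F),    # Combining Diacritical Marks
    (0x0483, 0x0486),    # Cyrillic block
    (0x0488, 0x0489),
    (0x0E31, 0x0E31),    # Thai block
    (0x0E34, 0x0E3A),
    (0x0E47, 0x0E4E),
    (0x20D0, 0x20FF),    # Combining Diacritical Marks for Symbols
    (0x3099, 0x309A),    # Hiragana block
    (0xFE20, 0xFE2F),    # Combining Half Marks
    (0x1D165, 0x1D169),  # Musical Symbols block
    (0x1D17B, 0x1D182),
    (0x1D185, 0x1D18B),
    (0x1D1AA, 0x1D1AD),
]


def char_width(cp):
    """Return 0 for code points listed as SWIDTH0 in the patch, else 1.

    Assumption: everything outside the table is one column wide,
    so SWIDTH2 characters are not modelled here.
    """
    for lo, hi in ZERO_WIDTH_RANGES:
        if lo <= cp <= hi:
            return 0
    return 1


def display_width(text):
    """Sum the per-character column widths of a string."""
    return sum(char_width(ord(ch)) for ch in text)


if __name__ == "__main__":
    samples = {
        "e + U+0301": "e\u0301",
        "U+0E01 + U+0E34": "\u0e01\u0e34",
        "ASCII": "abc",
    }
    for label, text in samples.items():
        # Print the code point count next to the computed column count.
        print(label, len(text), display_width(text))
```

Under the pre-patch table the marks in the first two samples were listed
as SWIDTH1, so each of those strings would have been counted as two columns
wide rather than one.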