Date:      Mon, 28 May 2007 00:41:42 +0200 (CEST)
From:      Wolfgang Zenker <wolfgang@lyxys.ka.sub.org>
To:        freebsd-i18n@freebsd.org
Subject:   Why no non-latin TODIGIT mappings in UTF-8.src ?
Message-ID:  <200705272241.l4RMfg07051300@juno.lyxys.ka.sub.org>

Hello all,

I'm a bit surprised that there are no TODIGIT mappings for non-Latin scripts
in src/share/mklocale/UTF-8. Is there a technical reason why this would
be a bad idea, or is it simply that no one has gotten around to defining
the mappings yet?

Looking at am_ET.UTF-8.src, the mappings are defined by taking the UTF-8
encoding of each digit sign in its respective script and mapping it to
its numeric value.
So, e.g., for Arabic the TODIGIT mappings would be:

/* Arabic-Indic digits 0 - 9 */
TODIGIT         <0xd9a0 - 0xd9a9 : 0>

/* Extended Arabic-Indic digits 0 - 9 */
TODIGIT         <0xdbb0 - 0xdbb9 : 0>
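
The byte values in these two ranges can be cross-checked by encoding the
Unicode code points directly. What follows is only a minimal Python sketch
and not part of mklocale: the helper names utf8_packed and todigit_line are
made up for illustration, and the code points U+0660..U+0669 and
U+06F0..U+06F9 are assumed from the Unicode charts, not from any FreeBSD
source file.

```python
def utf8_packed(codepoint: int) -> int:
    """Return the UTF-8 bytes of a code point packed into one integer,
    in the style used by the locale source (e.g. 0xd9a0)."""
    return int.from_bytes(chr(codepoint).encode("utf-8"), "big")


def todigit_line(first: int, last: int, value: int = 0) -> str:
    """Format a TODIGIT range line for the code points first..last,
    where the first code point maps to the given numeric value."""
    return (f"TODIGIT         <{utf8_packed(first):#x} - "
            f"{utf8_packed(last):#x} : {value}>")


# Arabic-Indic digits 0 - 9 (assumed U+0660..U+0669)
print(todigit_line(0x0660, 0x0669))
# Extended Arabic-Indic digits 0 - 9 (assumed U+06F0..U+06F9)
print(todigit_line(0x06F0, 0x06F9))
```

The same helper should work for other scripts whose digits 0 - 9 occupy
ten consecutive code points.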

By the way, the TODIGIT mapping in am_ET.UTF-8.src appears to be off by one:
the Ethiopic digit 1 is 0x1369 in UCS-2, which encodes as 0xe18da9 in UTF-8,
whereas am_ET.UTF-8.src has 0xe18da8.
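
The encoding arithmetic behind this off-by-one observation can be spelled
out by hand. The sketch below (Python, with a hypothetical helper name
encode_3byte) packs a code point from the three-byte UTF-8 range into a
single integer. It only checks the byte values and makes no claim about
how mklocale itself interprets the file.

```python
def encode_3byte(cp: int) -> int:
    """Encode a code point in U+0800..U+FFFF as three UTF-8 bytes,
    packed into one integer (lead byte in the highest position)."""
    if not 0x0800 <= cp <= 0xFFFF:
        raise ValueError("code point is outside the three-byte UTF-8 range")
    b1 = 0xE0 | (cp >> 12)           # 1110xxxx: top 4 bits
    b2 = 0x80 | ((cp >> 6) & 0x3F)   # 10xxxxxx: middle 6 bits
    b3 = 0x80 | (cp & 0x3F)          # 10xxxxxx: low 6 bits
    return (b1 << 16) | (b2 << 8) | b3


ETHIOPIC_DIGIT_ONE = 0x1369

print(hex(encode_3byte(ETHIOPIC_DIGIT_ONE)))      # 0xe18da9
print(hex(encode_3byte(ETHIOPIC_DIGIT_ONE - 1)))  # 0xe18da8
```

So 0xe18da8 is the encoding of the code point immediately before the
Ethiopic digit 1, not of the digit itself.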

Wolfgang