Date: Thu, 11 Oct 2018 18:30:13 +0000 (UTC)
From: Yuri Pankov <yuripv@FreeBSD.org>
To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject: svn commit: r339313 - in head: share/ctypedef tools/tools/locale tools/tools/locale/etc
Message-ID: <201810111830.w9BIUDw9040890@repo.freebsd.org>
Author: yuripv
Date: Thu Oct 11 18:30:12 2018
New Revision: 339313
URL: https://svnweb.freebsd.org/changeset/base/339313

Log:
  Restore some of the ctype definitions reported in the PR from pre-CLDR data,
  namely 0xE000-0xF8FF private use area, and 0xFF00-0xFFF half- and fullwidth
  punctuation.

  While here, update tools/tools/locale/README based on my experience rebuilding
  the locale data.

  PR:		225692
  Reviewed by:	bapt, cem (previous version)
  Approved by:	re (gjb), kib (mentor)
  Differential Revision:	https://reviews.freebsd.org/D17471

Modified:
  head/share/ctypedef/en_US.UTF-8.src
  head/tools/tools/locale/README
  head/tools/tools/locale/etc/common.UTF-8.src
  head/tools/tools/locale/etc/manual-input.UTF-8

Modified: head/share/ctypedef/en_US.UTF-8.src
==============================================================================
--- head/share/ctypedef/en_US.UTF-8.src	Thu Oct 11 18:27:19 2018	(r339312)
+++ head/share/ctypedef/en_US.UTF-8.src	Thu Oct 11 18:30:12 2018	(r339313)
@@ -6241,6 +6241,12 @@ graph	<MEETEI_MAYEK_LETTER_KOK>;...;<MEETEI_MAYEK_APUN
 digit	<MEETEI_MAYEK_DIGIT_ZERO>;...;<MEETEI_MAYEK_DIGIT_NINE>
 
 **********************************************************************
+* 0xE000 - 0xF8FF Private Use Area (from pre-CLDR data)
+**********************************************************************
+
+graph	<PRIVATE_USE_AREA-E000>;...;<PRIVATE_USE_AREA-F8FF>
+
+**********************************************************************
 * 0xFB50 - 0xFDFF Arabic Presentation Forms (differential)
 **********************************************************************
 
@@ -6277,6 +6283,17 @@ punct	<SMALL_COMMA>;...;<SMALL_COMMERCIAL_AT>
 **********************************************************************
 
 blank	<ZERO_WIDTH_NO-BREAK_SPACE>
+
+**********************************************************************
+* 0xFF00 - 0xFFFF Half- and Fullwidth Punctuation (from pre-CLDR data)
+**********************************************************************
+
+punct	<FULLWIDTH_EXCLAMATION_MARK>;...;<FULLWIDTH_SOLIDUS>;/
+	<FULLWIDTH_COLON>;...;<FULLWIDTH_COMMERCIAL_AT>;/
+	<FULLWIDTH_LEFT_SQUARE_BRACKET>;...;<FULLWIDTH_GRAVE_ACCENT>;/
+	<FULLWIDTH_LEFT_CURLY_BRACKET>;...;<HALFWIDTH_KATAKANA_MIDDLE_DOT>;/
+	<FULLWIDTH_CENT_SIGN>;...;<FULLWIDTH_WON_SIGN>;/
+	<HALFWIDTH_FORMS_LIGHT_VERTICAL>;...;<HALFWIDTH_WHITE_CIRCLE>
 
 **********************************************************************
 * 0x10300 - 0x1032F Old Italic

Modified: head/tools/tools/locale/README
==============================================================================
--- head/tools/tools/locale/README	Thu Oct 11 18:27:19 2018	(r339312)
+++ head/tools/tools/locale/README	Thu Oct 11 18:30:12 2018	(r339313)
@@ -2,23 +2,37 @@
 
 To generate the locales:
 
-Tools needed: java, perl, devel/p5-Tie-IxHash, converters/p5-Text-Iconv and
-textproc/p5-XML-Parser
+Tools needed:
+	java (openjdk >= 8)
+	perl
+	converters/p5-Text-Iconv
+	devel/p5-Tie-IxHash
+	textproc/p5-XML-Parser
 
-fetch cldr data from: http://cldr.unicode.org
-extract in a directory ~/unicode/cldr/v30.0.3 for example
-fetch unidata from http://www.unicode.org/Public/zipped/ (latest version)
-extract in a directory ~/unicode/UNIDATA/9.0.0 for example
+Fetch CLDR data from: http://unicode.org/Public/cldr/. You need all of the
+core.zip, keyboards.zip, and tools.zip.
 
-Note that the prebuilt cldr tools are not working on freebsd, it needs to
-be rebuilt:
-cd $CLDRDIR/tools/java
-ant build
+Extract:
+	mkdir -p ~/unicode/cldr/v33.0
+	cd ~/unicode/cldr/v33.0
+	unzip ~/core.zip ~/keyboards.zip ~/tools.zip
 
-either modify tools/tools/locales/etc/unicode.conf or export variables:
-CLDRDIR="~/unicode/cldr/v30.0.3"
-UNIDATADIR="~/unicode/UNIDATA/9.0.0"
+Fetch unidata (UCD.zip) from http://www.unicode.org/Public/zipped/latest.
 
-run:
-make POSIX
-make install
+Extract:
+	mkdir -p ~/unicode/UNIDATA/11.0.0
+	cd ~/unicode/UNIDATA/11.0.0
+	unzip ~/UCD.zip
+
+Either modify tools/tools/locales/etc/unicode.conf or export variables:
+	CLDRDIR=~/unicode/cldr/v33.0; export CLDRDIR
+	UNIDATADIR=~/unicode/UNIDATA/9.0.0; export UNIDATADIR
+
+Build the CLDR tools:
+	cd $CLDRDIR/tools/java
+	ant jar
+
+Run:
+	make POSIX
+	make
+	make install

Modified: head/tools/tools/locale/etc/common.UTF-8.src
==============================================================================
--- head/tools/tools/locale/etc/common.UTF-8.src	Thu Oct 11 18:27:19 2018	(r339312)
+++ head/tools/tools/locale/etc/common.UTF-8.src	Thu Oct 11 18:30:12 2018	(r339313)
@@ -6241,6 +6241,12 @@ graph	<MEETEI_MAYEK_LETTER_KOK>;...;<MEETEI_MAYEK_APUN
 digit	<MEETEI_MAYEK_DIGIT_ZERO>;...;<MEETEI_MAYEK_DIGIT_NINE>
 
 **********************************************************************
+* 0xE000 - 0xF8FF Private Use Area (from pre-CLDR data)
+**********************************************************************
+
+graph	<PRIVATE_USE_AREA-E000>;...;<PRIVATE_USE_AREA-F8FF>
+
+**********************************************************************
 * 0xFB50 - 0xFDFF Arabic Presentation Forms (differential)
 **********************************************************************
 
@@ -6277,6 +6283,17 @@ punct	<SMALL_COMMA>;...;<SMALL_COMMERCIAL_AT>
 **********************************************************************
 
 blank	<ZERO_WIDTH_NO-BREAK_SPACE>
+
+**********************************************************************
+* 0xFF00 - 0xFFFF Half- and Fullwidth Punctuation (from pre-CLDR data)
+**********************************************************************
+
+punct	<FULLWIDTH_EXCLAMATION_MARK>;...;<FULLWIDTH_SOLIDUS>;/
+	<FULLWIDTH_COLON>;...;<FULLWIDTH_COMMERCIAL_AT>;/
+	<FULLWIDTH_LEFT_SQUARE_BRACKET>;...;<FULLWIDTH_GRAVE_ACCENT>;/
+	<FULLWIDTH_LEFT_CURLY_BRACKET>;...;<HALFWIDTH_KATAKANA_MIDDLE_DOT>;/
+	<FULLWIDTH_CENT_SIGN>;...;<FULLWIDTH_WON_SIGN>;/
+	<HALFWIDTH_FORMS_LIGHT_VERTICAL>;...;<HALFWIDTH_WHITE_CIRCLE>
 
 **********************************************************************
 * 0x10300 - 0x1032F Old Italic

Modified: head/tools/tools/locale/etc/manual-input.UTF-8
==============================================================================
--- head/tools/tools/locale/etc/manual-input.UTF-8	Thu Oct 11 18:27:19 2018	(r339312)
+++ head/tools/tools/locale/etc/manual-input.UTF-8	Thu Oct 11 18:30:12 2018	(r339313)
@@ -877,6 +877,12 @@ graph	<MEETEI_MAYEK_LETTER_KOK>;...;<MEETEI_MAYEK_APUN
 digit	<MEETEI_MAYEK_DIGIT_ZERO>;...;<MEETEI_MAYEK_DIGIT_NINE>
 
 **********************************************************************
+* 0xE000 - 0xF8FF Private Use Area (from pre-CLDR data)
+**********************************************************************
+
+graph	<PRIVATE_USE_AREA-E000>;...;<PRIVATE_USE_AREA-F8FF>
+
+**********************************************************************
 * 0xFB50 - 0xFDFF Arabic Presentation Forms (differential)
 **********************************************************************
 
@@ -913,6 +919,17 @@ punct	<SMALL_COMMA>;...;<SMALL_COMMERCIAL_AT>
 **********************************************************************
 
 blank	<ZERO_WIDTH_NO-BREAK_SPACE>
+
+**********************************************************************
+* 0xFF00 - 0xFFFF Half- and Fullwidth Punctuation (from pre-CLDR data)
+**********************************************************************
+
+punct	<FULLWIDTH_EXCLAMATION_MARK>;...;<FULLWIDTH_SOLIDUS>;/
+	<FULLWIDTH_COLON>;...;<FULLWIDTH_COMMERCIAL_AT>;/
+	<FULLWIDTH_LEFT_SQUARE_BRACKET>;...;<FULLWIDTH_GRAVE_ACCENT>;/
+	<FULLWIDTH_LEFT_CURLY_BRACKET>;...;<HALFWIDTH_KATAKANA_MIDDLE_DOT>;/
+	<FULLWIDTH_CENT_SIGN>;...;<FULLWIDTH_WON_SIGN>;/
+	<HALFWIDTH_FORMS_LIGHT_VERTICAL>;...;<HALFWIDTH_WHITE_CIRCLE>
 
 **********************************************************************
 * 0x10300 - 0x1032F Old Italic
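
For readers who want to see which code points the restored punct ranges
cover, here is a rough sketch in Python. It is not part of the commit and
is not how localedef or the tools/tools/locale scripts resolve symbolic
names. It assumes that each symbolic name in the punct definition maps to
a Unicode character name by replacing underscores with spaces, which holds
for the twelve endpoints listed here. The names PUNCT_RANGES, codepoint and
resolve are made up for illustration. The <PRIVATE_USE_AREA-XXXX> symbols
are left out because private use code points have no Unicode character
names to look up.

```python
import unicodedata

# Range endpoints copied from the punct definition in the diff above.
PUNCT_RANGES = [
    ("FULLWIDTH_EXCLAMATION_MARK", "FULLWIDTH_SOLIDUS"),
    ("FULLWIDTH_COLON", "FULLWIDTH_COMMERCIAL_AT"),
    ("FULLWIDTH_LEFT_SQUARE_BRACKET", "FULLWIDTH_GRAVE_ACCENT"),
    ("FULLWIDTH_LEFT_CURLY_BRACKET", "HALFWIDTH_KATAKANA_MIDDLE_DOT"),
    ("FULLWIDTH_CENT_SIGN", "FULLWIDTH_WON_SIGN"),
    ("HALFWIDTH_FORMS_LIGHT_VERTICAL", "HALFWIDTH_WHITE_CIRCLE"),
]


def codepoint(symbol):
    """Look up a symbolic name as a Unicode character name.

    Assumption: underscores in the symbol stand for spaces in the
    Unicode name. Raises KeyError if no such character name exists.
    """
    return ord(unicodedata.lookup(symbol.replace("_", " ")))


def resolve(ranges):
    """Turn (first, last) symbolic-name pairs into (first, last) code points."""
    return [(codepoint(first), codepoint(last)) for first, last in ranges]


if __name__ == "__main__":
    for lo, hi in resolve(PUNCT_RANGES):
        print(f"U+{lo:04X}..U+{hi:04X} ({hi - lo + 1} code points)")
```

All six resolved ranges fall inside the 0xFF00 - 0xFFFF block named in the
section header of the diff.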