From owner-svn-src-user@FreeBSD.ORG Wed Sep 2 09:53:32 2009
Return-Path: <owner-svn-src-user@FreeBSD.ORG>
Delivered-To: svn-src-user@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id D4199106568B;
	Wed, 2 Sep 2009 09:53:32 +0000 (UTC)
	(envelope-from edwin@FreeBSD.org)
Received: from svn.freebsd.org (svn.freebsd.org [IPv6:2001:4f8:fff6::2c])
	by mx1.freebsd.org (Postfix) with ESMTP id C44FF8FC08;
	Wed, 2 Sep 2009 09:53:32 +0000 (UTC)
Received: from svn.freebsd.org (localhost [127.0.0.1])
	by svn.freebsd.org (8.14.3/8.14.3) with ESMTP id n829rWp5088299;
	Wed, 2 Sep 2009 09:53:32 GMT (envelope-from edwin@svn.freebsd.org)
Received: (from edwin@localhost)
	by svn.freebsd.org (8.14.3/8.14.3/Submit) id n829rWBb088297;
	Wed, 2 Sep 2009 09:53:32 GMT (envelope-from edwin@svn.freebsd.org)
Message-Id: <200909020953.n829rWBb088297@svn.freebsd.org>
From: Edwin Groothuis <edwin@FreeBSD.org>
Date: Wed, 2 Sep 2009 09:53:32 +0000 (UTC)
To: src-committers@freebsd.org, svn-src-user@freebsd.org
X-SVN-Group: user
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Cc:
Subject: svn commit: r196757 - user/edwin/locale/cldr/tools
X-BeenThere: svn-src-user@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "SVN commit messages for the experimental " user" src tree"
	<svn-src-user.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/svn-src-user>,
	<mailto:svn-src-user-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/svn-src-user>
List-Post: <mailto:svn-src-user@freebsd.org>
List-Help: <mailto:svn-src-user-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/svn-src-user>,
	<mailto:svn-src-user-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 02 Sep 2009 09:53:32 -0000

Author: edwin
Date: Wed Sep 2 09:53:32 2009
New Revision: 196757
URL: http://svn.freebsd.org/changeset/base/196757

Log:
  Make sure only highascii and A-Za-z0-9 gets translated.

Modified:
  user/edwin/locale/cldr/tools/UTF82encoding.pl

Modified: user/edwin/locale/cldr/tools/UTF82encoding.pl
==============================================================================
--- user/edwin/locale/cldr/tools/UTF82encoding.pl	Wed Sep 2 09:52:26 2009	(r196756)
+++ user/edwin/locale/cldr/tools/UTF82encoding.pl	Wed Sep 2 09:53:32 2009	(r196757)
@@ -3,6 +3,11 @@
 use strict;
 use Data::Dumper;
 
+if ($#ARGV != 1) {
+	print "Usage: $0 <cldr dir> <input file>\n";
+	exit;
+}
+
 open(FIN, "$ARGV[0]/posix/UTF-8.cm");
 my @lines = <FIN>;
 chomp(@lines);
@@ -18,11 +23,10 @@ foreach my $line (@lines) {
 	next if ($#a != 1);
 
 	$a[1] =~ s/\\x//g;
-	$cm{$a[1]} = $a[0];
+	$a[0] =~ s/_/ /g;
+	$cm{$a[1]} = $a[0] if (!defined $cm{$a[1]});
 }
 
-print Dumper($cm{"4D"}), "\n";
-
 open(FIN, $ARGV[1]);
 @lines = <FIN>;
 chomp(@lines);
@@ -37,6 +41,16 @@ foreach my $line (@lines) {
 	my @l = split(//, $line);
 	for (my $i = 0; $i <= $#l; $i++) {
 		my $hex = sprintf("%X", ord($l[$i]));
+
+		if (( $l[$i] gt "\x20")
+		    && ($l[$i] lt "a" || $l[$i] gt "z")
+		    && ($l[$i] lt "A" || $l[$i] gt "Z")
+		    && ($l[$i] lt "0" || $l[$i] gt "9")
+		    && ($l[$i] lt "\x80")) {
+			print $l[$i];
+			next;
+		}
+
 		if (defined $cm{$hex}) {
 			print $cm{$hex};
 			next;
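
The pass-through test added in the last hunk can be tried in isolation. The following is only a sketch: the helper name passes_through and the sample characters are hypothetical and not part of the commit, and it assumes the input is handled as bytes, as it would be when the file is opened without an encoding layer. It shows which characters would be printed unchanged and which would fall through to the %cm lookup.

```perl
use strict;
use warnings;

# Hypothetical helper (not in the commit): returns 1 when the character
# would be printed unchanged by the new check, i.e. it is 7-bit ASCII
# above the space character and is not in a-z, A-Z or 0-9.
sub passes_through {
    my ($c) = @_;
    return (($c gt "\x20")
        && ($c lt "a" || $c gt "z")
        && ($c lt "A" || $c gt "Z")
        && ($c lt "0" || $c gt "9")
        && ($c lt "\x80")) ? 1 : 0;
}

# Sample characters chosen for illustration only.
for my $c (";", "/", "M", "7", " ", "\xE9") {
    printf "0x%02X %s\n", ord($c),
        passes_through($c) ? "copied as-is" : "looked up in the charmap";
}
```

Because the first comparison is strictly greater than "\x20", the space character and the control characters below it are not copied as-is; like letters, digits and bytes at or above 0x80, they continue on to the lookup.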