From owner-svn-src-user@FreeBSD.ORG  Wed Sep  2 09:53:32 2009
Return-Path: <owner-svn-src-user@FreeBSD.ORG>
Delivered-To: svn-src-user@freebsd.org
Message-Id: <200909020953.n829rWBb088297@svn.freebsd.org>
From: Edwin Groothuis <edwin@FreeBSD.org>
Date: Wed, 2 Sep 2009 09:53:32 +0000 (UTC)
To: src-committers@freebsd.org, svn-src-user@freebsd.org
X-SVN-Group: user
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Cc: 
Subject: svn commit: r196757 - user/edwin/locale/cldr/tools
X-BeenThere: svn-src-user@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "SVN commit messages for the experimental \"user\"
	src tree" <svn-src-user.freebsd.org>
X-List-Received-Date: Wed, 02 Sep 2009 09:53:32 -0000

Author: edwin
Date: Wed Sep  2 09:53:32 2009
New Revision: 196757
URL: http://svn.freebsd.org/changeset/base/196757

Log:
  Make sure only high-ASCII characters and A-Za-z0-9 get translated.

Modified:
  user/edwin/locale/cldr/tools/UTF82encoding.pl

Modified: user/edwin/locale/cldr/tools/UTF82encoding.pl
==============================================================================
--- user/edwin/locale/cldr/tools/UTF82encoding.pl	Wed Sep  2 09:52:26 2009	(r196756)
+++ user/edwin/locale/cldr/tools/UTF82encoding.pl	Wed Sep  2 09:53:32 2009	(r196757)
@@ -3,6 +3,11 @@
 use strict;
 use Data::Dumper;
 
+if ($#ARGV != 1) {
+	print "Usage: $0 <cldr dir> <input file>\n";
+	exit;
+}
+
 open(FIN, "$ARGV[0]/posix/UTF-8.cm");
 my @lines = <FIN>;
 chomp(@lines);
@@ -18,11 +23,10 @@ foreach my $line (@lines) {
 	next if ($#a != 1);
 
 	$a[1] =~ s/\\x//g;
-	$cm{$a[1]} = $a[0];
+	$a[0] =~ s/_/ /g;
+	$cm{$a[1]} = $a[0] if (!defined $cm{$a[1]});
 }
 
-print Dumper($cm{"4D"}), "\n";
-
 open(FIN, $ARGV[1]);
 @lines = <FIN>;
 chomp(@lines);
@@ -37,6 +41,16 @@ foreach my $line (@lines) {
 	my @l = split(//, $line);
 	for (my $i = 0; $i <= $#l; $i++) {
 		my $hex = sprintf("%X", ord($l[$i]));
+
+		if ((		      $l[$i] gt "\x20")
+		 && ($l[$i] lt "a" || $l[$i] gt "z")
+		 && ($l[$i] lt "A" || $l[$i] gt "Z")
+		 && ($l[$i] lt "0" || $l[$i] gt "9")
+		 && ($l[$i] lt "\x80")) {
+			print $l[$i];
+			next;
+		}
+
 		if (defined $cm{$hex}) {
 			print $cm{$hex};
 			next;
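
The pass-through test added in the last hunk can also be read as a
character-class check: anything in \x21-\x7F that is not an ASCII letter
or digit is printed as-is, and everything else falls through to the %cm
lookup. A minimal sketch of that reading, assuming single-byte input as in
the script; passes_through() is a hypothetical helper name and is not part
of UTF82encoding.pl.

```perl
use strict;
use warnings;

# Hypothetical helper (not in UTF82encoding.pl): returns 1 when a single
# character would be printed unchanged by the new check, i.e. it lies in
# \x21-\x7F and is not an ASCII letter or digit; returns 0 otherwise.
sub passes_through {
    my ($c) = @_;
    return ($c =~ /\A[\x21-\x7F]\z/ && $c !~ /\A[A-Za-z0-9]\z/) ? 1 : 0;
}

# Show which sample characters are printed literally and which are
# handed to the character-map lookup.
for my $c ("-", ".", " ", "M", "7", "\xE9") {
    printf "%02X %s\n", ord($c), passes_through($c) ? "literal" : "lookup";
}
```

Under this reading, punctuation such as "-" and "." is copied through,
while the space, letters, digits, and bytes at or above \x80 still go to
the lookup, which seems to match the intent stated in the log message.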