From: Edwin Groothuis
Date: Wed, 29 Jul 2009 21:54:34 +0000 (UTC)
To: src-committers@freebsd.org, svn-src-user@freebsd.org
Subject: svn commit: r195954 - user/edwin/locale/tools
Message-Id: <200907292154.n6TLsYnT083423@svn.freebsd.org>

Author: edwin
Date: Wed Jul 29 21:54:34 2009
New Revision: 195954
URL: http://svn.freebsd.org/changeset/base/195954

Log:
  Add small tool to convert UTF-8 encoded strings back into CLDR "markup"
  language.

Added:
  user/edwin/locale/tools/UTF82encoding.pl   (contents, props changed)

Added: user/edwin/locale/tools/UTF82encoding.pl
==============================================================================
--- /dev/null	00:00:00 1970	(empty, because file is newly added)
+++ user/edwin/locale/tools/UTF82encoding.pl	Wed Jul 29 21:54:34 2009	(r195954)
@@ -0,0 +1,64 @@
+#!/usr/bin/perl -w
+
+use strict;
+use Data::Dumper;
+
+open(FIN, "$ARGV[0]/posix/UTF-8.cm");
+my @lines = <FIN>;
+chomp(@lines);
+close(FIN);
+
+my %cm = ();
+foreach my $line (@lines) {
+	next if ($line =~ /^#/);
+	next if ($line eq "");
+	next if ($line !~ /^;
+chomp(@lines);
+close(FIN);
+
+foreach my $line (@lines) {
+	if ($line =~ /^#/) {
+		print "$line\n";
+		next;
+	}
+
+	my @l = split(//, $line);
+	for (my $i = 0; $i <= $#l; $i++) {
+		my $hex = sprintf("%X", ord($l[$i]));
+		if (defined $cm{$hex}) {
+			print $cm{$hex};
+			next;
+		}
+
+		$hex = sprintf("%X%X", ord($l[$i]), ord($l[$i + 1]));
+		if (defined $cm{$hex}) {
+			$i += 1;
+			print $cm{$hex};
+			next;
+		}
+
+		$hex = sprintf("%X%X%X",
+		    ord($l[$i]), ord($l[$i + 1]), ord($l[$i + 2 ]));
+		if (defined $cm{$hex}) {
+			$i += 2;
+			print $cm{$hex};
+			next;
+		}
+
+		print "\n--$hex--\n";
+	}
+	print "\n";
+
+}
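
The lookup loop in the diff above tries a one-byte, then a two-byte, then a three-byte hex key against the charmap table. The following is a minimal standalone sketch of that idea, not the committed code. It makes several assumptions: the table entries are hypothetical and hard-coded rather than read from UTF-8.cm, the function name encode_line is invented for illustration, the hex keys are zero-padded with %02X (the diff uses %X), and the loop stops at the end of the input instead of reading past it.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical table: uppercase hex of a UTF-8 byte sequence => symbolic name.
# These entries are illustrative only and are not read from UTF-8.cm.
my %cm = (
    '41'     => '<LATIN_CAPITAL_LETTER_A>',
    'C3A9'   => '<LATIN_SMALL_LETTER_E_WITH_ACUTE>',
    'E282AC' => '<EURO_SIGN>',
);

# Translate one line of raw bytes, trying 1-, 2-, then 3-byte sequences
# at each position and emitting the symbolic name of the first match.
sub encode_line {
    my ($line) = @_;
    my @bytes = unpack('C*', $line);
    my $out   = '';
    my $i     = 0;

    BYTE: while ($i < @bytes) {
        for my $len (1 .. 3) {
            last if $i + $len > @bytes;    # never read past the end
            my $hex = join '',
                map { sprintf '%02X', $_ } @bytes[$i .. $i + $len - 1];
            if (defined $cm{$hex}) {
                $out .= $cm{$hex};
                $i += $len;
                next BYTE;
            }
        }
        # No match: flag the byte, in the spirit of the "--$hex--" marker
        # printed by the script in the diff.
        $out .= sprintf '--%02X--', $bytes[$i];
        $i++;
    }
    return $out;
}

# "A", then the UTF-8 bytes for e-acute and the euro sign.
print encode_line("A\xC3\xA9\xE2\x82\xAC"), "\n";
```

With the three hypothetical entries above, the final line prints the three symbolic names back to back. A byte with no table entry, such as 0xFF, comes out as --FF-- instead of being silently dropped.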