From: Edwin Groothuis
Date: Sun, 4 Oct 2009 21:07:19 +0000 (UTC)
To: src-committers@freebsd.org, svn-src-user@freebsd.org
Subject: svn commit: r197754 - in user/edwin/locale: . usr.bin/unicode2utf8

Author: edwin
Date: Sun Oct 4 21:07:19 2009
New Revision: 197754
URL: http://svn.freebsd.org/changeset/base/197754

Log:
  Add man-page for the unicode2utf8 utility.

  Fix wording of the examples, the ru_RU word was Saturday, not Sunday.

Added:
  user/edwin/locale/usr.bin/unicode2utf8/unicode2utf8.1
Modified:
  user/edwin/locale/README.locale

Modified: user/edwin/locale/README.locale
==============================================================================
--- user/edwin/locale/README.locale Sun Oct 4 19:44:41 2009 (r197753)
+++ user/edwin/locale/README.locale Sun Oct 4 21:07:19 2009 (r197754)
@@ -46,18 +46,19 @@ Gotchas
 Examples
 --------

-The word for the last day of the week in the en_US language - country
-code would be in Unicode format:
-
-
+The word for the second last day of the week in the en_US language
+- country code would be in Unicode format:
+
+

 Converted into UTF-8 this will be:

-  Sunday
+  Saturday

 Converted into ISO-8859 this will be:

-  Sunday
+  Saturday

-The word for the last day of the week in the ru_RU language -
-country code would be in Unicode format:
+The word for the second last day of the week in the ru_RU language
+- country code would be in Unicode format:

Added: user/edwin/locale/usr.bin/unicode2utf8/unicode2utf8.1
==============================================================================
--- /dev/null 00:00:00 1970 (empty, because file is newly added)
+++ user/edwin/locale/usr.bin/unicode2utf8/unicode2utf8.1 Sun Oct 4 21:07:19 2009 (r197754)
@@ -0,0 +1,91 @@
+.\" Copyright (c) 2009 Edwin Groothuis
+.\" All rights reserved.
+.\"
+.\" Redistribution and use in source and binary forms, with or without
+.\" modification, are permitted provided that the following conditions
+.\" are met:
+.\" 1. Redistributions of source code must retain the above copyright
+.\"    notice, this list of conditions and the following disclaimer.
+.\" 2. Redistributions in binary form must reproduce the above copyright
+.\"    notice, this list of conditions and the following disclaimer in the
+.\"    documentation and/or other materials provided with the distribution.
+.\"
+.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
+.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
+.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+.\" SUCH DAMAGE.
+.\"
+.\" $FreeBSD$
+.\"
+.Dd October 4, 2009
+.Dt unicode2utf8 1
+.Os
+.Sh NAME
+.Nm unicode2utf8
+.Nd converts a file with Unicode name definitions into UTF-8 character
+definitions.
+.Sh SYNOPSIS
+.Nm
+.Fl -cldr Ar directory
+.Fl -input Ar filename
+.Fl -output Ar filename
+.Sh DESCRIPTION
+The
+.Nm
+utility replaces the Unicode encoded strings in the
+contents of the specified input file with the corresponding UTF-8
+character definitions.
+.Pp
+Lines starting with a # are copied as-is.
+.Pp
+The Unicode encoded strings are specified between a '<' and a '>'
+sign.
+They are looked up against the keys in the conversion table specified
+in the file
+.Pa posix/UTF-8.cm
+in the directory given with
+.Fl -cldr ,
+and the matching value is written out.
+.Pp
+Other characters are copied as-is.
+.Sh OPTIONS
+.Bl -tag -width indent
+.It Fl -cldr Ar directory
+The directory where the file
+.Pa posix/UTF-8.cm
+resides.
+By default this should point to
+.Pa /usr/share/misc ,
+but for maintainers of the FreeBSD locale database this could point
+to their own extracted copy of the CLDR database.
+.It Fl -input Ar filename
+The source file with the Unicode encoded strings.
+.It Fl -output Ar filename
+The destination file with the Unicode encoded strings replaced with
+their UTF-8 equivalents.
+.El
+.Sh EXIT STATUS
+The
+.Nm
+utility exits 0 on success, and >0 if an error occurs.
+.Sh SEE ALSO
+.Xr iconv 1 ,
+.Xr bsdiconv 1
+.Bl -tag -width indent
+.It http://cldr.unicode.org/
+Website of the Common Locale Data Repository,
+the maintainers of the file
+.Pa /usr/share/misc/UTF-8.cm
+.El
+.Sh AUTHORS
+The
+.Nm
+utility and this manual page were written by
+.An Edwin Groothuis Aq edwin@FreeBSD.org .
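
The conversion rules in the DESCRIPTION section of the man page above can
be sketched roughly as follows. This is an illustration only, not the
committed utility: the commit does not show its source, so Python is used
purely for readability. The names load_table and convert_line are
hypothetical, and the table format (a key in angle brackets followed by
/xHH byte escapes, as in a POSIX charmap) is an assumption about
posix/UTF-8.cm rather than something the commit states. The man page also
does not say what happens when a key has no match; the sketch simply
raises KeyError.

```python
import re

# Assumed charmap-style line: "<KEY>  /xHH/xHH...  optional comment".
CHARMAP_LINE = re.compile(r"<([^>]+)>\s+((?:/x[0-9A-Fa-f]{2})+)")
# A Unicode encoded string in the input: anything between '<' and '>'.
KEY = re.compile(r"<([^>]+)>")


def load_table(lines):
    """Build a {key: UTF-8 text} table from charmap-style lines.

    Lines that do not match the assumed format are skipped.
    """
    table = {}
    for line in lines:
        m = CHARMAP_LINE.match(line)
        if m:
            hex_bytes = re.findall(r"/x([0-9A-Fa-f]{2})", m.group(2))
            raw = bytes(int(h, 16) for h in hex_bytes)
            table[m.group(1)] = raw.decode("utf-8")
    return table


def convert_line(line, table):
    """Apply the three rules from the DESCRIPTION section to one line."""
    # Rule 1: lines starting with '#' are copied as-is.
    if line.startswith("#"):
        return line
    # Rule 2: each <KEY> is replaced by its value from the table.
    # Rule 3: all other characters pass through untouched.
    # An unknown key raises KeyError (an assumption, see above).
    return KEY.sub(lambda m: table[m.group(1)], line)
```

A caller would read the table once from the file under the --cldr
directory, then pass every line of the --input file through convert_line
and write the result to the --output file.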