Date: Sun, 4 Oct 2009 21:07:19 +0000 (UTC)
From: Edwin Groothuis <edwin@FreeBSD.org>
To: src-committers@freebsd.org, svn-src-user@freebsd.org
Subject: svn commit: r197754 - in user/edwin/locale: . usr.bin/unicode2utf8
Message-ID: <200910042107.n94L7JOc085345@svn.freebsd.org>
Author: edwin
Date: Sun Oct 4 21:07:19 2009
New Revision: 197754
URL: http://svn.freebsd.org/changeset/base/197754

Log:
  Add man-page for the unicode2utf8 utility.

  Fix wording of the examples, the ru_RU word was Saturday, not Sunday.

Added:
  user/edwin/locale/usr.bin/unicode2utf8/unicode2utf8.1
Modified:
  user/edwin/locale/README.locale

Modified: user/edwin/locale/README.locale
==============================================================================
--- user/edwin/locale/README.locale	Sun Oct 4 19:44:41 2009	(r197753)
+++ user/edwin/locale/README.locale	Sun Oct 4 21:07:19 2009	(r197754)
@@ -46,18 +46,19 @@ Gotchas
 
 Examples
 --------
-The word for the last day of the week in the en_US language - country
-code would be in Unicode format:
-	<LATIN CAPITAL LETTER S><LATIN SMALL LETTER U>
-	<LATIN SMALL LETTER N><LATIN SMALL LETTER D>
+The word for the second last day of the week in the en_US language
+- country code would be in Unicode format:
+	<LATIN CAPITAL LETTER S><LATIN SMALL LETTER A>
+	<LATIN SMALL LETTER T><LATIN SMALL LETTER U>
+	<LATIN SMALL LETTER R><LATIN SMALL LETTER D>
 	<LATIN SMALL LETTER A><LATIN SMALL LETTER Y>
 Converted into UTF-8 this will be:
-	Sunday
+	Saturday
 Converted into ISO-8859 this will be:
-	Sunday
+	Saturday
 
-The word for the last day of the week in the ru_RU language -
-country code would be in Unicode format:
+The word for the second last day of the week in the ru_RU language
+- country code would be in Unicode format:
 	<CYRILLIC SMALL LETTER ES><CYRILLIC SMALL LETTER U>
 	<CYRILLIC SMALL LETTER BE><CYRILLIC SMALL LETTER BE>
 	<CYRILLIC SMALL LETTER O><CYRILLIC SMALL LETTER TE>

Added: user/edwin/locale/usr.bin/unicode2utf8/unicode2utf8.1
==============================================================================
--- /dev/null	00:00:00 1970	(empty, because file is newly added)
+++ user/edwin/locale/usr.bin/unicode2utf8/unicode2utf8.1	Sun Oct 4 21:07:19 2009	(r197754)
@@ -0,0 +1,91 @@
+.\" Copyright (c) 2009 Edwin Groothuis <edwin@FreeBSD.org>
+.\" All rights reserved.
+.\"
+.\" Redistribution and use in source and binary forms, with or without
+.\" modification, are permitted provided that the following conditions
+.\" are met:
+.\" 1. Redistributions of source code must retain the above copyright
+.\"    notice, this list of conditions and the following disclaimer.
+.\" 2. Redistributions in binary form must reproduce the above copyright
+.\"    notice, this list of conditions and the following disclaimer in the
+.\"    documentation and/or other materials provided with the distribution.
+.\"
+.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
+.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
+.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+.\" SUCH DAMAGE.
+.\"
+.\" $FreeBSD$
+.\"
+.Dd October 4, 2009
+.Dt unicode2utf8 1
+.Os
+.Sh NAME
+.Nm unicode2utf8
+.Nd converts a file with Unicode name definitions into UTF-8 character
+definitions.
+.Sh SYNOPSIS
+.Nm
+.Fl -cldr Ar directory
+.Fl -input Ar filename
+.Fl -output Ar filename
+.Sh DESCRIPTION
+The
+.Nm
+utility is made to convert the Unicode encoded strings in the
+contents of the specified input file with the corresponding UTF-8
+character definitions.
+.Pp
+Lines starting with a # are copied as-is.
+.Pp
+The Unicode encoded strings are specified between a '<' and a '>'
+sign.
+They are looked up against the keys in the conversion table specified
+in the file
+.Pa posix/UTF-8.cm
+in the with
+.Fl -cldr
+defined directory and the matching value is written out.
+.Pp
+Other characters are copied as-is.
+.Sh OPTIONS
+.Bl -tag -width indent
+.It Fl -cldr Ar directory
+The directory where the file
+.Pa posix/UTF-8.cm
+resides.
+By default this should point to
+.Pa /usr/share/misc ,
+but for maintainers of the FreeBSD locale database this could point
+to their own extracted copy of the CLDR database.
+.It Fl -input Ar filename
+The source file with the Unicode encoded strings.
+.It Fl -output Ar filename
+The destination file with the Unicode encoded strings replaced with
+their UTF-8 equivalents.
+.El
+.Sh EXIT STATUS
+The
+.Nm
+utility exits 0 on success, and >0 if an error occurs.
+.Sh SEE ALSO
+.Xr iconv 1 ,
+.Xr bsdiconv 1
+.Bl -tag -width indent
+.It http://cldr.unicode.org/
+Website of the Common Locale Database Repository,
+the maintainers of the file
+.Pa /usr/share/misc/UTF-8.cm
+.El
+.Sh AUTHORS
+The
+.Nm
+utility and this manual page were written by
+.An Edwin Groothuis Aq edwin@FreeBSD.org .
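
For illustration only, the conversion rules described in the manual page above (lines starting with # copied as-is, <NAME> tokens looked up and replaced, everything else copied) could be sketched roughly as follows. This is not the committed utility. It assumes Python's built-in Unicode name database as a stand-in for the posix/UTF-8.cm table, and the names NAME_TOKEN, convert_line and convert are hypothetical.

```python
import re
import unicodedata

# Matches one <UNICODE CHARACTER NAME> token (hypothetical helper name).
NAME_TOKEN = re.compile(r"<([^<>]+)>")


def convert_line(line: str) -> str:
    """Replace every <NAME> token in one line with the character it names."""
    # Comment lines are passed through untouched, as the manual page describes.
    if line.startswith("#"):
        return line
    # Assumption: names are resolved through Python's Unicode database, not
    # through the posix/UTF-8.cm table that the real utility reads.
    # unicodedata.lookup raises KeyError for an unknown name.
    return NAME_TOKEN.sub(lambda m: unicodedata.lookup(m.group(1)), line)


def convert(text: str) -> bytes:
    """Convert a whole document and encode the result as UTF-8."""
    lines = text.splitlines(keepends=True)
    return "".join(convert_line(line) for line in lines).encode("utf-8")


if __name__ == "__main__":
    sample = (
        "# <LATIN CAPITAL LETTER S> stays as-is inside a comment line\n"
        "<LATIN CAPITAL LETTER S><LATIN SMALL LETTER A>"
        "<LATIN SMALL LETTER T><LATIN SMALL LETTER U>"
        "<LATIN SMALL LETTER R><LATIN SMALL LETTER D>"
        "<LATIN SMALL LETTER A><LATIN SMALL LETTER Y>\n"
    )
    print(convert(sample).decode("utf-8"), end="")
```

Run on the en_US example from README.locale, the second line of the sample comes out as Saturday, while the comment line is left unchanged.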