From owner-freebsd-i18n@FreeBSD.ORG Mon Sep 7 11:07:00 2009
Return-Path:
Delivered-To: freebsd-i18n@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 9D783106568F for ; Mon, 7 Sep 2009 11:07:00 +0000 (UTC) (envelope-from owner-bugmaster@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 8B9C08FC15 for ; Mon, 7 Sep 2009 11:07:00 +0000 (UTC)
Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.3/8.14.3) with ESMTP id n87B706m010244 for ; Mon, 7 Sep 2009 11:07:00 GMT (envelope-from owner-bugmaster@FreeBSD.org)
Received: (from gnats@localhost) by freefall.freebsd.org (8.14.3/8.14.3/Submit) id n87B6xpr010240 for freebsd-i18n@FreeBSD.org; Mon, 7 Sep 2009 11:06:59 GMT (envelope-from owner-bugmaster@FreeBSD.org)
Date: Mon, 7 Sep 2009 11:06:59 GMT
Message-Id: <200909071106.n87B6xpr010240@freefall.freebsd.org>
X-Authentication-Warning: freefall.freebsd.org: gnats set sender to owner-bugmaster@FreeBSD.org using -f
From: FreeBSD bugmaster
To: freebsd-i18n@FreeBSD.org
Cc:
Subject: Current problem reports assigned to freebsd-i18n@FreeBSD.org
X-BeenThere: freebsd-i18n@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: FreeBSD Internationalization Effort
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 07 Sep 2009 11:07:00 -0000

Note: to view an individual PR, use:
  http://www.freebsd.org/cgi/query-pr.cgi?pr=(number).

The following is a listing of current problems submitted by FreeBSD users.
These represent problem reports covering all versions including
experimental development code and obsolete releases.

S Tracker      Resp.      Description
--------------------------------------------------------------------------------
o conf/137870  i18n       [locale] en_DK needed
f conf/109367  i18n       [locale] UTF8 encoded locales and problem collating ac
f conf/91106   i18n       [locale] date definitions in pl_PL locale are wrong

3 problems total.

From owner-freebsd-i18n@FreeBSD.ORG Tue Sep 8 05:35:02 2009
Return-Path:
Delivered-To: freebsd-i18n@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 174931065679 for ; Tue, 8 Sep 2009 05:35:02 +0000 (UTC) (envelope-from edwin@mavetju.org)
Received: from k7.mavetju.org (ppp113-58.static.internode.on.net [150.101.113.58]) by mx1.freebsd.org (Postfix) with ESMTP id DEF288FC0C for ; Tue, 8 Sep 2009 05:35:00 +0000 (UTC)
Received: by k7.mavetju.org (Postfix, from userid 1001) id B48A345147; Tue, 8 Sep 2009 15:34:51 +1000 (EST)
Date: Tue, 8 Sep 2009 15:34:51 +1000
From: Edwin Groothuis
To: freebsd-i18n@freebsd.org
Message-ID: <20090908053451.GA4568@mavetju.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.4.2.3i
Subject: WIP - share/{monet,msg,numeric,time}def
X-BeenThere: freebsd-i18n@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: FreeBSD Internationalization Effort
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Tue, 08 Sep 2009 05:35:02 -0000

In the last couple of months I've spent some time with the data in
the share/{monet,msg,numeric,time}def directories and the data from
the CLDR (Common Locale Data Repository) project.

The biggest issues with the way the current data in the *def
directories is maintained are that it is partly high-ascii (especially
for the non-US-ASCII and non-ISO8859-{1,2,15} character maps) and
partly unsynchronized between the different character maps for the
same locale.

The first approach was to see if I could transform the data from
the CLDR project into the format the FreeBSD project wants to have
it in. It taught me a lot about the data stored in the CLDR project,
but also that it isn't compatible enough to do this automatically.

The second approach, still in progress, is going much better: instead
of storing the high-ascii data and multiple character map translations
in the SCM, we have one file per locale with a proper definition
of the words and syntax used, which gets converted into UTF-8 and
which then gets transformed to the required character maps.

For example, the file share/msgdef/nl_NL.unicode:

    # yesexpr
    ^[].*
    # noexpr
    ^[].*
    # EOF

gets converted into nl_NL.UTF-8:

    # yesexpr
    ^[jJyY].*
    # noexpr
    ^[nN].*
    # EOF

and gets transformed into its ISO8859-1 and ISO8859-15 equivalents.
Since this is low-ascii it is a boring example, but the idea is
there.

What are the current show-stoppers?

- The conversion between .unicode and .UTF-8 is done via a Perl
  script and the CLDR database, which means that it won't be in the
  base system for now. So we need both the .unicode source files
  and the .UTF-8 files in the SCM.

- There is no iconv in the base operating system yet. Gabor@ is in
  the process of porting citrus-iconv from NetBSD, but it isn't
  available yet. So we also need the converted character maps in the
  SCM for now. I have access to his iconv and will feed the current
  issues back to him.

Until these two show-stoppers are resolved, we will have a lot more
data in the SCM than we have right now. The first one should not be
difficult, the second one is with somebody who understands it :-)

So the advantages, when everything is ready:

- Human-readable source files with Unicode-style encoding.

- All locales with the different character maps are generated from
  one source and thus up-to-date with each other.

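The second stage of the pipeline described above (UTF-8 file to the
legacy character maps) can be sketched roughly as follows. This is
only an illustration in Python, not the Perl script or the iconv port
mentioned in this mail; the function name transcode_locale and the
CHARMAPS table are made up for the example, and it assumes that a
locale whose text cannot be represented in a character map should
simply be skipped for that map.

```python
# Hypothetical sketch: generate the legacy-charmap variants of a locale
# definition from its single UTF-8 source. Not the actual tooling.

# Assumed mapping from FreeBSD-style charmap names to Python codec names.
CHARMAPS = {
    "ISO8859-1": "iso8859-1",
    "ISO8859-15": "iso8859-15",
}


def transcode_locale(utf8_bytes, charmaps=CHARMAPS):
    """Return {charmap name: encoded bytes, or None if not representable}."""
    text = utf8_bytes.decode("utf-8")
    results = {}
    for name, codec in charmaps.items():
        try:
            results[name] = text.encode(codec)
        except UnicodeEncodeError:
            # At least one character does not exist in this character map.
            results[name] = None
    return results


# The nl_NL.UTF-8 content quoted in this mail.
nl_utf8 = "# yesexpr\n^[jJyY].*\n# noexpr\n^[nN].*\n# EOF\n".encode("utf-8")
results = transcode_locale(nl_utf8)
```

Because the nl_NL example is pure low-ascii, every generated variant is
byte-identical to the UTF-8 input; the step only becomes interesting
for locales with characters outside US-ASCII.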
Once this part is working properly (and to other people's
satisfaction) we can update the contents with information from
third-party sources like the CLDR. But that is still a long time
away for now.

Edwin

svn://svn.freebsd.org/base/user/edwin/locale

-- 
Edwin Groothuis		Website: http://www.mavetju.org/
edwin@mavetju.org	Weblog:  http://www.mavetju.org/weblog/