From: Baptiste Daroussin
Date: Wed, 14 Oct 2015 00:23:06 +0200
To: current@FreeBSD.org
Subject: [CFT] Unicode collation string and reworked locale definitions
Message-ID: <20151013222306.GE55137@ivaldir.etoilebsd.net>

Hi all,

I have been working for a while on bringing in Unicode string collation
support by merging code from Illumos (by Garrett D'Amore, who kindly made sure
his work was released under a BSD license) and DragonFly (by John Marino), plus
some ancient work done on FreeBSD by edwin@ that was never merged.

The result is available in the projects/collation branch.

The results of this work are:

- Locales are now generated with the new localedef(1) tool from CLDR POSIX
  files
- The generated files are now identified as "BSD 1.0" format
- Only "BSD 1.0" locale files are now read; any other version will be set to
  "C"
- The localedef(1) tool has been imported from Illumos and modified to use
  tree(3) instead of the CDDL avl(3)
- A set of tools created by edwin@ and extended by marino@ for DragonFly has
  been added to be able to generate locales
- Given that our regex(3) does not support multibyte yet (actually, it does
  not even support some single-byte codesets), it has been forced to always
  use locale C
- colldef(1) and mklocale(1) have been removed
- The implementation of the numeric BSD extension for ctypes has been finished
- A bunch of new locales have been added: some Arabic locales, Hebrew locales,
  some regional locales, etc.
- A bunch of ISO-8859-1 locales have been made simple aliases of ISO-8859-15
  where it makes sense
- Short versions of locales have been added
- @euro aliases have been added on the locales where that makes sense

Please test the branch and report issues.

Note that, yes, this means the COLLATION_FIX patch on glib2 will no longer be
necessary, and yes, the ICU patch on postgresql will no longer be necessary
either.

Best regards,
Bapt