From: Dmitry Selyutin
Reply-To: ghostmansd@gmail.com
Date: Tue, 12 Aug 2014 21:47:37 +0400
Subject: Report #7: Unicode support
To: soc-status@freebsd.org, Pedro Giffuni, David Chisnall
Hello everyone!

Here is the latest news about the Unicode support project[0]. You can always check my repository[1].

Over the past few days I've been working on integrating my changes into the tree. libc now supports a UNICODE flag. If it is defined, the entire libc is compiled with -D_UNICODE_SOURCE, which enables the Unicode Collation Algorithm as well as Unicode normalization and canonicalization through the hidden __ucsnorm() and __ucscanon() functions.

The Collation Database Library (libcolldb) has moved into contrib/, though it has its own Makefile inside the lib/ directory. It provides the colldb script, which is used to transform Unicode collation files into the Collation Database format.

There are still some things to be done. First, I need to create a Makefile that installs contrib/colldb/colldb into /usr/bin (and probably also copies it into the Python package directory, since the script, if imported, provides bindings to libcolldb). This Makefile must also use colldb to create a new database from share/colldb/root.src and install it as /usr/share/colldb/root.db. I'm not sure how to handle such things with BSD make, so I think I'll need your help, Pedro! ;-) For now it can be done manually.

The other thing is more extensive testing using files from the Unicode CLDR repository. I've never used the FreeBSD testing system, but hopefully it won't be harder than implementing Unicode collation.

[0] https://wiki.freebsd.org/SummerOfCode2014/Unicode
[1] https://socsvn.freebsd.org/socsvn/soc2014/ghostmansd

--
With best regards,
Dmitry Selyutin
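
For readers unfamiliar with why normalization matters for comparison and collation, here is a minimal sketch of the general idea. It uses Python's standard unicodedata module only; it does not call libcolldb, __ucsnorm() or __ucscanon(), whose interfaces are not shown in this report. The helper name canonically_equal is hypothetical and exists just for this illustration.

```python
import unicodedata


def canonically_equal(a: str, b: str) -> bool:
    """Hypothetical helper: compare two strings after NFD normalization."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


composed = "\u00e9"      # "é" as a single precomposed code point
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT

# Raw code point comparison treats the two spellings as different strings.
print(composed == decomposed)

# After normalization they compare equal (canonical equivalence).
print(canonically_equal(composed, decomposed))

# Plain code point ordering puts "éclair" after "zebra", which is one
# reason locale-aware ordering needs a collation algorithm on top of
# normalization.
print(sorted(["zebra", "\u00e9clair", "apple"]))
```

The last line shows only the problem that collation addresses; producing the linguistically expected order would require collation data such as that stored in root.db.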