From: Oliver Pinter
To: ghostmansd@gmail.com
Cc: soc-status@freebsd.org, Pedro Giffuni
Date: Mon, 21 Jul 2014 14:24:52 +0200
Subject: Re: Report #4: Unicode support
List-Id: Summer of Code Status Reports and Discussion

On 7/21/14, Dmitry Selyutin wrote:
> Hello everyone,
>
> here comes my report on progress during these two weeks. Pedro, David,
> excuse me for the duplication, please: I should have just included you
> in this letter instead of sending you two letters. I've just
> realized that I'd forgotten to write the report. :-(
>
> I've been intensively testing my normalization implementation and
> discovered that it was working incorrectly. Moreover, its code seemed
> to be completely cryptic, so I've rewritten it from scratch. Now
> it seems to work correctly (at least it passes the Unicode tests). The
> things that I've completely ignored are canonicalization and combining
> character classes. I've decided to publish it in a git repo and
> integrate it into head later, since it's a real pain to recompile the
> entire system every several hours after changes in the source code
> (especially if the changes are not large).

Dmitry, take a look at this build script:
http://svnweb.freebsd.org/socsvn/soc2014/op/tools/build_kernel_64bit_dirty.csh?revision=271052&view=co

It defines the NO_CLEAN make variable (-DNO_CLEAN), so only the files
you modified are rebuilt. This speeds up the build.

>
> I've also thought about your message where you expressed doubts about
> the project structure. We'll have a `uniext.h' header, which is
> included if the UNICODE_ADDENDA macro is defined. This header defines
> the following functions: strcanon, strcanon_l, wcscanon, strnorm,
> strnorm_l, wcsnorm, wcclass. The last one was written as a helper
> function which is used inside wcscanon and wcsnorm, but I thought that
> it may also be useful as a standalone function.
>
> I've rewritten the algorithms: everything is now performed using
> binary search and hashes instead of the previous linear search, so
> it's really fast (e.g. decomposition runs 10 to 12 times faster than
> Python's decomposition algorithm).
> I've also tested it on wide strings, and it works as expected (at
> least!). So this part seems to be finished. The last thing to do is
> to put everything in the right place in the FreeBSD source tree.
>
> Here is my testing repo: https://github.com/ghostmansd/uniext. Just
> use `git clone https://github.com/ghostmansd/uniext'.
> P.S. You need to use gmake if you want to use my Makefile (I don't
> know BSD Makefile syntax well). However, all you need is to add the
> `-Iinclude' flag to CFLAGS, compile everything in `src', compile
> `main.c', and link it all together.
>
> --
> With best regards,
> Dmitry Selyutin
> _______________________________________________
> soc-status@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/soc-status
> To unsubscribe, send any mail to "soc-status-unsubscribe@freebsd.org"
>
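
For readers following the thread, the binary-search lookup mentioned in
the quoted report might look roughly like the sketch below. It is only an
illustration and is not taken from the uniext sources: the struct layout,
the five-entry table excerpt, and the names `decomp_entry', `decomp_table',
`decomp_cmp' and `decomp_lookup' are all assumptions, and the hash-based
part of the report is not shown.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical table entry: a code point and its canonical decomposition. */
struct decomp_entry {
	uint32_t cp;      /* code point being decomposed */
	uint32_t seq[2];  /* decomposition; second slot is 0 if unused */
};

/*
 * Illustrative excerpt only (a real table would be generated from
 * UnicodeData.txt). Entries must stay sorted by cp for bsearch().
 */
static const struct decomp_entry decomp_table[] = {
	{ 0x00C0, { 0x0041, 0x0300 } }, /* A with grave = A + combining grave */
	{ 0x00C9, { 0x0045, 0x0301 } }, /* E with acute = E + combining acute */
	{ 0x00E9, { 0x0065, 0x0301 } }, /* e with acute = e + combining acute */
	{ 0x00F1, { 0x006E, 0x0303 } }, /* n with tilde = n + combining tilde */
	{ 0x212B, { 0x00C5, 0x0000 } }, /* ANGSTROM SIGN = A with ring above */
};

/* Compare a search key (a code point) against one table entry. */
static int
decomp_cmp(const void *key, const void *elem)
{
	uint32_t cp = *(const uint32_t *)key;
	const struct decomp_entry *e = elem;

	return ((cp > e->cp) - (cp < e->cp));
}

/*
 * Return the table entry for cp, or NULL if cp has no decomposition
 * in the table. Runs in O(log n) instead of a linear scan.
 */
const struct decomp_entry *
decomp_lookup(uint32_t cp)
{
	return (bsearch(&cp, decomp_table,
	    sizeof(decomp_table) / sizeof(decomp_table[0]),
	    sizeof(decomp_table[0]), decomp_cmp));
}
```

With this sketch, decomp_lookup(0x00E9) returns the entry whose sequence
is 0x0065 0x0301, while decomp_lookup(0x0041) returns NULL because plain
"A" has no decomposition. The lookup is only correct while the table
stays sorted by code point.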