Date:      Mon, 21 Jul 2014 14:24:52 +0200
From:      Oliver Pinter <oliver.pntr@gmail.com>
To:        ghostmansd@gmail.com
Cc:        soc-status@freebsd.org, Pedro Giffuni <pfg@freebsd.org>
Subject:   Re: Report #4: Unicode support
Message-ID:  <CAPjTQNETf2Gp_rcjUoqL8oGyqW0CxCY9HwXTyd=4v9NwOwRrqg@mail.gmail.com>
In-Reply-To: <CAMqzjetbe7x-mWYjVL5OPu39pv4xG4Zmt%2Bj8Hyi1cvPRxmWVSw@mail.gmail.com>
References:  <CAMqzjetbe7x-mWYjVL5OPu39pv4xG4Zmt%2Bj8Hyi1cvPRxmWVSw@mail.gmail.com>

On 7/21/14, Dmitry Selyutin <ghostman.sd@gmail.com> wrote:
> Hello everyone,
>
> here comes my report on progress during these two weeks. Pedro, David,
> excuse me for duplication, please: I should have just included you
> into this letter instead of sending you two letters. I've just
> realized that I've forgotten to write the report. :-(
>
> I've been intensively testing my normalization implementation and
> discovered that it was working incorrectly. Moreover, its code seems
> to be completely cryptic, so I've rewritten it from scratch. Now
> it seems to work correctly (at least it passes the Unicode tests). The
> things that I've completely ignored are canonicalization and combining
> character classes. I've decided to publish it in a git repo and
> integrate it into head later, since it's a real pain to recompile the
> entire system every several hours after changes in source code
> (especially if changes are not large).

Dmitry, take a look at this build script:
http://svnweb.freebsd.org/socsvn/soc2014/op/tools/build_kernel_64bit_dirty.csh?revision=271052&view=co

It defines NO_CLEAN for make (-DNO_CLEAN), so only the files you
modified are rebuilt. This speeds up the build.
>
> I've also thought about your message where you expressed doubts about
> the project structure. We'll have a `uniext.h' header, which is included
> if the UNICODE_ADDENDA macro is defined. This header declares the
> following functions: strcanon, strcanon_l, wcscanon, strnorm, strnorm_l,
> wcsnorm, wcclass. The last one was written as a helper function which
> is used inside wcscanon and wcsnorm, but I thought that it may also be
> useful as a standalone function.
>
> I've rewritten the algorithms: now everything is performed using binary
> search and hashes (before, the search was linear), so it's really
> fast: e.g. decomposition runs 10 to 12 times faster than Python's
> decomposition algorithm. I've also tested it on wide strings, and it
> works as expected (at last!). So this part seems to be finished. The
> last thing to do is to put everything in the right place in the
> FreeBSD source tree.
>
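
For readers following the thread, the move from a linear scan to binary
search described above can be sketched roughly as follows. This is only
an illustration under assumptions: `struct decomp_entry`, `decomp_table`
and `decomp_lookup` are hypothetical names that are not taken from the
uniext repository, and the five-entry table stands in for data that
would normally be generated from UnicodeData.txt.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table entry: a code point and its canonical decomposition. */
struct decomp_entry {
    uint32_t code;      /* precomposed code point */
    uint32_t parts[2];  /* code points it decomposes into */
};

/*
 * Tiny illustrative table. It MUST stay sorted by `code`, otherwise
 * the binary search below gives wrong answers.
 */
static const struct decomp_entry decomp_table[] = {
    { 0x00C0, { 0x0041, 0x0300 } },  /* A + combining grave */
    { 0x00C9, { 0x0045, 0x0301 } },  /* E + combining acute */
    { 0x00E9, { 0x0065, 0x0301 } },  /* e + combining acute */
    { 0x00F1, { 0x006E, 0x0303 } },  /* n + combining tilde */
    { 0x0439, { 0x0438, 0x0306 } },  /* Cyrillic i + combining breve */
};

/*
 * Binary search over the half-open range [lo, hi).
 * Returns the matching entry, or NULL if `code` has no decomposition
 * in the table. Cost is O(log n) instead of O(n) for a linear scan.
 */
const struct decomp_entry *
decomp_lookup(uint32_t code)
{
    size_t lo = 0;
    size_t hi = sizeof(decomp_table) / sizeof(decomp_table[0]);

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;  /* avoids overflow of lo + hi */

        if (decomp_table[mid].code == code)
            return &decomp_table[mid];
        if (decomp_table[mid].code < code)
            lo = mid + 1;
        else
            hi = mid;
    }
    return NULL;
}
```

A caller would test the result for NULL and copy the code point through
unchanged when no entry is found. A hash table, which the message also
mentions, is a possible alternative for the same lookup.
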
> Here is my testing repo: https://github.com/ghostmansd/uniext. Just
> use `git clone https://github.com/ghostmansd/uniext'.
> P.S. You need to use gmake if you want to use my Makefile (I don't
> know BSD Makefile syntax well). However, all you need to do is add the
> `-Iinclude' flag to CFLAGS, compile everything in `src', compile
> `main.c' and link it all together.
>
> --
> With best regards,
> Dmitry Selyutin
> _______________________________________________
> soc-status@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/soc-status
> To unsubscribe, send any mail to "soc-status-unsubscribe@freebsd.org"
>


