Date: Mon, 21 Jul 2014 14:24:52 +0200
From: Oliver Pinter <oliver.pntr@gmail.com>
To: ghostmansd@gmail.com
Cc: soc-status@freebsd.org, Pedro Giffuni <pfg@freebsd.org>
Subject: Re: Report #4: Unicode support
Message-ID: <CAPjTQNETf2Gp_rcjUoqL8oGyqW0CxCY9HwXTyd=4v9NwOwRrqg@mail.gmail.com>
In-Reply-To: <CAMqzjetbe7x-mWYjVL5OPu39pv4xG4Zmt%2Bj8Hyi1cvPRxmWVSw@mail.gmail.com>
References: <CAMqzjetbe7x-mWYjVL5OPu39pv4xG4Zmt%2Bj8Hyi1cvPRxmWVSw@mail.gmail.com>

On 7/21/14, Dmitry Selyutin <ghostman.sd@gmail.com> wrote:
> Hello everyone,
>
> here comes my report on progress during these two weeks. Pedro, David,
> excuse me for the duplication, please: I should have just included you
> in this letter instead of sending you two letters. I've just
> realized that I've forgotten to write the report. :-(
>
> I've been intensively testing my normalization implementation and
> discovered that it was working incorrectly. Moreover, its code seemed
> to be completely cryptic, so I've rewritten it from scratch. Now
> it seems to work correctly (at least it passes the Unicode tests). The
> things that I've completely ignored are canonicalization and combining
> character classes. I've decided to publish it in a git repo and
> integrate it into head later, since it's a real pain to recompile the
> entire system every several hours after changes in the source code
> (especially if the changes are not large).

Dmitry, take a look at this build script:
http://svnweb.freebsd.org/socsvn/soc2014/op/tools/build_kernel_64bit_dirty.csh?revision=271052&view=co

It defines the NO_CLEAN make variable (-DNO_CLEAN), so only the files
you modified are rebuilt. This speeds up the build.

>
> I've also thought about your message where you raised doubts about the
> project structure. We'll have a `uniext.h' header, which is included if
> the UNICODE_ADDENDA macro is defined. This header declares the following
> functions: strcanon, strcanon_l, wcscanon, strnorm, strnorm_l,
> wcsnorm, wcclass. The last one was written as a helper function which
> is used inside wcscanon and wcsnorm, but I thought that it may also be
> useful as a standalone function.
>
> I've rewritten the algorithms: now everything is performed using binary
> search and hashes (before, the search was linear), so it's really
> fast: for decomposition, for example, it works 10 to 12 times faster
> than Python's decomposition algorithm. I've also tested it on wide
> strings, and it works as expected (at last!).
> So this part seems to be finished. The last thing to do is to put
> everything in the right place in the FreeBSD source tree.
>
> Here is my testing repo: https://github.com/ghostmansd/uniext. Just
> use `git clone https://github.com/ghostmansd/uniext'.
> P.S. You need to use gmake if you want to use my Makefile (I don't
> know BSD Makefile syntax well). However, all you need is to add the
> `-Iinclude' flag to CFLAGS, compile everything in `src', compile
> `main.c' and link it all together.
>
> --
> With best regards,
> Dmitry Selyutin
> _______________________________________________
> soc-status@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/soc-status
> To unsubscribe, send any mail to "soc-status-unsubscribe@freebsd.org"
>
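
For readers following the thread, here is a minimal sketch of the kind of
binary-search lookup the quoted report describes for decomposition. It is
not taken from the uniext repository: the names decomp_entry, decomp_table
and decomp_lookup are hypothetical, and the table is a five-entry excerpt
written by hand, whereas a real table would presumably be generated from
the Unicode Character Database.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table entry: a composed code point and its canonical
 * decomposition (two code points are enough for this excerpt). */
struct decomp_entry {
    uint32_t code;
    uint32_t parts[2];
};

/* Illustrative excerpt only; it MUST stay sorted by `code`,
 * because the lookup below relies on that ordering. */
static const struct decomp_entry decomp_table[] = {
    { 0x00C0, { 0x0041, 0x0300 } }, /* A with grave      */
    { 0x00C9, { 0x0045, 0x0301 } }, /* E with acute      */
    { 0x00E9, { 0x0065, 0x0301 } }, /* e with acute      */
    { 0x00F1, { 0x006E, 0x0303 } }, /* n with tilde      */
    { 0x00FC, { 0x0075, 0x0308 } }, /* u with diaeresis  */
};

/* Binary search over the sorted table: O(log n) comparisons instead
 * of the O(n) of a linear scan. Returns NULL when the code point has
 * no entry, i.e. it is left unchanged by decomposition. */
static const struct decomp_entry *
decomp_lookup(uint32_t code)
{
    size_t lo = 0;
    size_t hi = sizeof(decomp_table) / sizeof(decomp_table[0]);

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (decomp_table[mid].code == code)
            return &decomp_table[mid];
        if (decomp_table[mid].code < code)
            lo = mid + 1;   /* target is in the upper half */
        else
            hi = mid;       /* target is in the lower half */
    }
    return NULL;
}
```

A caller would pass each code point of the input through decomp_lookup()
and copy either the two parts or the original code point to the output.
Full normalization additionally needs recursive decomposition and
reordering by combining class, which this sketch leaves out.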