Date: Wed, 19 Sep 2007 11:12:07 +0900
From: Taku YAMAMOTO <taku@tackymt.homeip.net>
To: Andrey Chernov <ache@nagual.pp.ru>
Cc: i18n@FreeBSD.ORG, Petr Hroudn?? <petr.hroudny@gmail.com>, perky@FreeBSD.ORG, current@FreeBSD.ORG
Subject: Re: Ctype patch for review
Message-ID: <20070919111207.f37653fc.taku@tackymt.homeip.net>
In-Reply-To: <20070917171633.GA31179@nagual.pp.ru>
References: <20070916192924.GA12678@nagual.pp.ru>
    <ab8fc7f50709170129p6f436069iffaf697e83a34e3c@mail.gmail.com>
    <20070917092130.GA24424@nagual.pp.ru>
    <20070918020100.d43beb0b.taku@tackymt.homeip.net>
    <20070917171633.GA31179@nagual.pp.ru>

On Mon, 17 Sep 2007 21:16:33 +0400
Andrey Chernov <ache@nagual.pp.ru> wrote:

> On Tue, Sep 18, 2007 at 02:01:00AM +0900, YAMAMOTO, Taku wrote:
> > Checking for __mb_cur_max is not enough for certain locales.
> > For example, SJIS has following range for JIS X0201 (a.k.a. HALFWIDTH KANA).
> >
> > /*
> >  * JIS X201
> >  */
> > PUNCT           0xa1-0xa5
> > SPACE           0xa0
> > BLANK           0xa0
> > SPECIAL         0xa1-0xdf
> > PHONOGRAM       0xa6-0xdf
> > SWIDTH1         0xa0-0xdf
>
> I don't understand your remark. MSKanji have __mb_cur_max = 2 and so those
> ranges are wchar_t ranges. My patch restrict unsigned char ranges only.

These characters ARE single byte.
The problem is that a byte >= 0x80 does not always mean it composes a
multi-byte character in that locale.

-- 
-|-__   YAMAMOTO, Taku
 | __ <     <taku@tackymt.homeip.net>

      - A chicken is an egg's way of producing more eggs. -
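
To make the point above concrete, here is a minimal sketch of how bytes could be classified in an SJIS-like encoding. It is not the patch under review and not libc code: the names sjis_kind, sjis_byte_kind and sjis_count_chars are hypothetical, the single-byte range 0xa0-0xdf is taken from the SWIDTH1 line quoted above, and the lead-byte ranges 0x81-0x9f and 0xe0-0xfc are an assumption based on the usual description of Shift_JIS.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names for illustration only. */
enum sjis_kind {
    SJIS_ASCII,       /* 0x00-0x7f: single byte */
    SJIS_SINGLE_HIGH, /* 0xa0-0xdf: single byte (JIS X0201 range quoted above) */
    SJIS_LEAD,        /* assumed first byte of a two-byte character */
    SJIS_OTHER        /* anything else: not classified by this sketch */
};

/* Classify one byte; a byte >= 0x80 is NOT automatically a lead byte. */
static enum sjis_kind
sjis_byte_kind(unsigned char c)
{
    if (c <= 0x7f)
        return SJIS_ASCII;
    if (c >= 0xa0 && c <= 0xdf)
        return SJIS_SINGLE_HIGH;
    /* Assumption: lead-byte ranges as commonly described for Shift_JIS. */
    if ((c >= 0x81 && c <= 0x9f) || (c >= 0xe0 && c <= 0xfc))
        return SJIS_LEAD;
    return SJIS_OTHER;
}

/* Count characters in a buffer; only a lead byte consumes a second byte. */
static size_t
sjis_count_chars(const unsigned char *s, size_t len)
{
    size_t i = 0, n = 0;

    assert(s != NULL);
    while (i < len) {
        /* A truncated lead byte at the end is counted as one unit. */
        if (sjis_byte_kind(s[i]) == SJIS_LEAD && i + 1 < len)
            i += 2;
        else
            i += 1;
        n++;
    }
    return n;
}
```

With these assumptions, the four bytes 0xb1 0x82 0xa0 0x41 count as three characters: 0xb1 stands alone even though it is >= 0x80, 0x82 0xa0 forms one two-byte character, and 0x41 is plain ASCII. A check that treats every byte >= 0x80 as part of a multi-byte character would misclassify 0xb1.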