Date:      Mon, 05 Feb 2018 20:21:08 +0000
From:      bugzilla-noreply@freebsd.org
To:        freebsd-bugs@FreeBSD.org
Subject:   [Bug 225692] iswprint() wrong for some FULL WIDTH characters in UTF-8 locale
Message-ID:  <bug-225692-8-ZpOKGMTwdK@https.bugs.freebsd.org/bugzilla/>
In-Reply-To: <bug-225692-8@https.bugs.freebsd.org/bugzilla/>

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=225692

--- Comment #1 from Conrad Meyer <cem@freebsd.org> ---
iswprint(wc) is a thin shim around __istype(wc, _CTYPE_R);

__istype(wc, type) is a thin shim in include/_ctype.h:
  return (!!__maskrune(wc, type));

__maskrune(wc, type) is defined earlier in the same file:
  return ((wc < 0 || wc >= _CACHED_RUNES) ? ___runetype(wc) :
    _CurrentRuneLocale->__runetype[wc]) & type;

(_CACHED_RUNES is probably 1<<8.)
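
To make the two lookup paths concrete, here is a minimal mock of the chain,
not the libc source.  Every mock_* name is hypothetical, the table size assumes
the 1<<8 guess above, and the bit value for the "printable" flag is chosen for
illustration only.  The slow path is stubbed to know no characters at all, so
anything outside the cached table comes back as "not printable".

```c
#include <stddef.h>

/* Hypothetical stand-ins for the libc internals; not the real definitions. */
#define MOCK_CACHED_RUNES (1 << 8)      /* assumed size of the fast-path table */
#define MOCK_CTYPE_R      0x00040000UL  /* "printable" bit, illustrative value */

/* Fast path: one entry of type bits per low code point. */
static unsigned long mock_cached_types[MOCK_CACHED_RUNES];

/* Slow path: stands in for ___runetype(); this stub has no data at all. */
static unsigned long
mock_runetype_slow(int wc)
{
	(void)wc;
	return 0;
}

/* Same shape as the __maskrune() expression quoted above. */
static int
mock_maskrune(int wc, unsigned long type)
{
	unsigned long bits = (wc < 0 || wc >= MOCK_CACHED_RUNES) ?
	    mock_runetype_slow(wc) : mock_cached_types[wc];

	return (int)(bits & type);
}

/* Same shape as __istype(): collapse the masked bits to 0 or 1. */
static int
mock_istype(int wc, unsigned long type)
{
	return !!mock_maskrune(wc, type);
}

/* Mark one cached code point printable so the two paths can be compared. */
static void
mock_init(void)
{
	mock_cached_types['A'] = MOCK_CTYPE_R;
}
```

After mock_init(), mock_istype('A', MOCK_CTYPE_R) takes the cached path, while
mock_istype(0x2002, MOCK_CTYPE_R) and mock_istype(0xff08, MOCK_CTYPE_R) fall
through to the stubbed slow path.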

Both 0x2002 and 0xff08 are far above that limit, so this tells me their type
information is being looked up in ___runetype() and that the _CTYPE_R bit must
be unset for 0x2002/0xff08.
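
A small probe along these lines could confirm what the installed locale data
says for the two code points.  This is only a sketch: the helper names are
hypothetical, the locale name is whatever the caller passes (something like
"en_US.UTF-8" is an assumption about what is installed), and it assumes that
wide-character values are Unicode code points in that locale.

```c
#include <locale.h>
#include <stdio.h>
#include <wchar.h>
#include <wctype.h>

/* Returns 1 if the current locale classifies wc as printable, else 0. */
static int
wide_is_printable(wint_t wc)
{
	return iswprint(wc) != 0;
}

/*
 * Switches LC_CTYPE to locale_name and prints the classification of the
 * two code points from this report.  Returns 0 on success, or -1 if the
 * requested locale is unavailable.
 */
static int
report_code_points(const char *locale_name)
{
	static const wint_t probes[] = { 0x2002, 0xff08 };
	size_t i;

	if (setlocale(LC_CTYPE, locale_name) == NULL)
		return -1;
	for (i = 0; i < sizeof(probes) / sizeof(probes[0]); i++)
		printf("U+%04lX iswprint=%d\n", (unsigned long)probes[i],
		    wide_is_printable(probes[i]));
	return 0;
}
```

Calling report_code_points() with the name of a UTF-8 locale from a small
main() prints one line per code point; the result depends entirely on the
locale data of the system it runs on.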

At some level, I thought we got this metadata from the Unicode standard tables,
but maybe ours are out of date, or this particular data is sourced
independently.

___runetype(wc) is a thin shim around ___runetype_l(wc, __get_locale());

___runetype_l() does a binary search in the _RuneRange table for the current
locale object.  If nothing is found, it returns 0.  This suggests the current
locale object either lacks type metadata for at least these two characters or
has incorrect metadata for them.
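
The range lookup described above can be sketched as follows.  This is a
simplified illustration, not the libc implementation: struct mock_range,
mock_range_lookup() and the sample table are hypothetical, and the sketch
assumes one set of type bits per range, sorted and non-overlapping ranges, and
made-up table contents.

```c
#include <stddef.h>

/* Hypothetical range entry; the real table layout may differ. */
struct mock_range {
	int min;            /* first code point in the range, inclusive */
	int max;            /* last code point in the range, inclusive */
	unsigned long type; /* type bits shared by the whole range */
};

/*
 * Binary search over ranges sorted by min.  Returns the type bits of the
 * range containing wc, or 0 if no range covers it.
 */
static unsigned long
mock_range_lookup(const struct mock_range *tab, size_t n, int wc)
{
	size_t lo = 0, hi = n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (wc < tab[mid].min)
			hi = mid;
		else if (wc > tab[mid].max)
			lo = mid + 1;
		else
			return tab[mid].type;
	}
	return 0; /* not found: the caller sees no type bits at all */
}

/* Illustrative table with a gap that happens to contain 0x2002. */
static const struct mock_range mock_table[] = {
	{ 0x0100, 0x017f, 0x1 },
	{ 0x3041, 0x3096, 0x1 },
};
```

With this table, mock_range_lookup(mock_table, 2, 0x2002) and
mock_range_lookup(mock_table, 2, 0xff08) both return 0, which is the "no
metadata" case; a covered range with the wrong bits would be the "incorrect
metadata" case.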

-- 
You are receiving this mail because:
You are the assignee for the bug.
