Date: Mon, 05 Feb 2018 20:21:08 +0000
From: bugzilla-noreply@freebsd.org
To: freebsd-bugs@FreeBSD.org
Subject: [Bug 225692] iswprint() wrong for some FULL WIDTH characters in UTF-8 locale
Message-ID: <bug-225692-8-ZpOKGMTwdK@https.bugs.freebsd.org/bugzilla/>
In-Reply-To: <bug-225692-8@https.bugs.freebsd.org/bugzilla/>
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=225692

--- Comment #1 from Conrad Meyer <cem@freebsd.org> ---
iswprint(wc) is a thin shim around __istype(wc, _CTYPE_R);

__istype(wc, type) is a thin shim in include/_ctype.h:

    return (!!__maskrune(wc, _CTYPE_R));

__maskrune() is defined earlier in the same file:

    return ((wc < 0 || wc >= _CACHED_RUNES) ? ___runetype(wc) :
        _CurrentRuneLocale->__runetype[wc]) & _CTYPE_R;

(_CACHED_RUNES is probably 1<<8.)

This tells me the type information is being looked up in ___runetype() and
that the _CTYPE_R bit must be unset for 0x2002/0xff08. At some level, I
thought we got this metadata from the Unicode standard tables, but maybe ours
are out of date or this particular data is sourced independently.

___runetype(wc) is a thin shim around ___runetype_l(wc, __get_locale());

___runetype_l() does a binary search in the _RuneRange table for the current
locale object. If nothing is found, it returns 0.

This suggests the current locale object either lacks type metadata for at
least these two characters, or has incorrect metadata for them.
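To make the lookup path above concrete, here is a rough sketch of it in C. This is not the libc source: every sketch_* name, the SKETCH_PRINT bit, and the simplified range layout (one set of type bits per range) are assumptions made for illustration. It only shows how a code point that falls outside every range ends up with no type bits, and so is reported as non-printable.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the libc structures; not the real definitions. */
#define SKETCH_CACHED_RUNES 256  /* assumed to be 1 << 8, as guessed above */
#define SKETCH_PRINT 0x1UL       /* illustrative stand-in for _CTYPE_R */

typedef struct {
	int min;              /* first code point in the range */
	int max;              /* last code point in the range, inclusive */
	unsigned long types;  /* type bits shared by the whole range */
} sketch_range;

typedef struct {
	unsigned long cached[SKETCH_CACHED_RUNES]; /* direct table for small values */
	const sketch_range *ranges;                /* sorted by min, non-overlapping */
	size_t nranges;
} sketch_locale;

/* Binary search over the range table; returns 0 when no range matches. */
unsigned long
sketch_runetype(const sketch_locale *loc, int wc)
{
	size_t lo = 0, hi;

	assert(loc != NULL);
	hi = loc->nranges;
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;
		const sketch_range *r = &loc->ranges[mid];

		if (wc < r->min)
			hi = mid;
		else if (wc > r->max)
			lo = mid + 1;
		else
			return (r->types);
	}
	return (0);
}

/* Cached table for small code points, range search for everything else. */
int
sketch_iswprint(const sketch_locale *loc, int wc)
{
	unsigned long t;

	assert(loc != NULL);
	t = (wc < 0 || wc >= SKETCH_CACHED_RUNES) ?
	    sketch_runetype(loc, wc) : loc->cached[wc];
	return ((t & SKETCH_PRINT) != 0);
}
```

With a table whose ranges do not cover 0x2002 or 0xff08, sketch_iswprint() returns 0 for both, which would match the behaviour reported in this bug if the locale data has a gap there.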
