Date:      Mon, 05 Feb 2018 20:21:08 +0000
From:      bugzilla-noreply@freebsd.org
To:        freebsd-bugs@FreeBSD.org
Subject:   [Bug 225692] iswprint() wrong for some FULL WIDTH characters in UTF-8 locale
Message-ID:  <bug-225692-8-ZpOKGMTwdK@https.bugs.freebsd.org/bugzilla/>
In-Reply-To: <bug-225692-8@https.bugs.freebsd.org/bugzilla/>
References:  <bug-225692-8@https.bugs.freebsd.org/bugzilla/>

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=225692

--- Comment #1 from Conrad Meyer <cem@freebsd.org> ---
iswprint(wc) is a thin shim around __istype(wc, _CTYPE_R);

__istype(wc, type) is a thin shim in include/_ctype.h:
  return (!!__maskrune(wc, _CTYPE_R));

__maskrune() is defined earlier in the same file:
  return ((wc < 0 || wc >= _CACHED_RUNES) ? ___runetype(wc) :
    _CurrentRuneLocale->__runetype[wc]) & _CTYPE_R;

(_CACHED_RUNES is probably 1<<8.)
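
As a rough illustration of the dispatch described above (not the actual libc
source), the sketch below uses a toy locale with a small direct-indexed table
and a fallback function for everything else.  The names toy_locale,
toy_maskrune, toy_istype, toy_fallback_empty, TOY_CACHED_RUNES and the
TOY_CTYPE_R bit value are all made up for this example.

```c
#define TOY_CACHED_RUNES (1 << 8)   /* assumed cache size, per the guess above */
#define TOY_CTYPE_R      0x1UL      /* arbitrary bit standing in for _CTYPE_R */

/* Hypothetical locale: a direct-indexed table for small code points. */
struct toy_locale {
    unsigned long cached[TOY_CACHED_RUNES];
    unsigned long (*fallback)(int wc);   /* stands in for ___runetype() */
};

/* Fallback that knows nothing: every uncached character has type 0. */
unsigned long toy_fallback_empty(int wc)
{
    (void)wc;
    return 0;
}

/* Cached lookup for small values, fallback otherwise, then apply the mask. */
unsigned long toy_maskrune(const struct toy_locale *loc, int wc,
    unsigned long mask)
{
    unsigned long types = (wc < 0 || wc >= TOY_CACHED_RUNES) ?
        loc->fallback(wc) : loc->cached[wc];

    return types & mask;
}

/* Collapse the masked value to 0 or 1, as the !! in __istype() does. */
int toy_istype(const struct toy_locale *loc, int wc, unsigned long mask)
{
    return !!toy_maskrune(loc, wc, mask);
}
```

With a fallback that returns 0, any code point at or above the cache size
fails every type test, whatever the cached table says about small values.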

This tells me the type information is being looked up in ___runetype() and
that the _CTYPE_R bit must be unset for 0x2002/0xff08.
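
One way to check this from userland is a small helper like the one below.
It is only a sketch: check_printable is a hypothetical name, and the locale
string passed in (for example "en_US.UTF-8") is an assumption that may need
adjusting for the system under test.

```c
#include <locale.h>
#include <wchar.h>
#include <wctype.h>

/*
 * Returns 1 if iswprint() accepts wc under the named LC_CTYPE locale,
 * 0 if it rejects it, and -1 if the locale could not be selected.
 * check_printable is a hypothetical helper, not part of libc.
 */
int check_printable(const char *locale_name, wint_t wc)
{
    if (setlocale(LC_CTYPE, locale_name) == NULL)
        return -1;
    return iswprint(wc) != 0;
}
```

If the reasoning above holds, calling it with a UTF-8 locale and the values
0x2002 and 0xff08 on an affected system should return 0 for both.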

At some level, I thought we got this metadata from the Unicode standard
tables, but maybe ours are out of date or this particular data is sourced
independently.

___runetype(wc) is a thin shim around ___runetype_l(wc, __get_locale());

___runetype_l() does a binary search in the _RuneRange table for the current
locale object.  If nothing is found, it returns 0.  This suggests the current
locale object either lacks type metadata or has incorrect type metadata for
at least these two characters.
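
To make the "nothing found returns 0" behaviour concrete, here is a minimal
sketch of a binary search over sorted, non-overlapping ranges.  It is not the
libc implementation; struct toy_range and toy_runetype are hypothetical, and
it assumes each range carries a single type mask.

```c
#include <stddef.h>

/* Hypothetical range entry: [min, max] inclusive shares one type mask. */
struct toy_range {
    int min;
    int max;
    unsigned long types;
};

/*
 * Binary search over ranges sorted by min and non-overlapping.
 * Returns the type mask of the containing range, or 0 if none contains wc.
 */
unsigned long toy_runetype(const struct toy_range *ranges, size_t n, int wc)
{
    size_t lo = 0, hi = n;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (wc < ranges[mid].min)
            hi = mid;
        else if (wc > ranges[mid].max)
            lo = mid + 1;
        else
            return ranges[mid].types;
    }
    return 0;   /* not covered by any range: no type bits at all */
}
```

With a table whose ranges simply skip a code point, the search falls through
to the final return and yields 0, so every type test on that character fails.
That matches the symptom here, although it does not by itself show whether
the real table has a gap or a wrong mask.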

-- 
You are receiving this mail because:
You are the assignee for the bug.