Date: Fri, 30 Jun 2023 13:58:00 +0000
From: bugzilla-noreply@freebsd.org
To: bugs@FreeBSD.org
Subject: [Bug 272293] The mbrtoc32 and mbrtoc16 functions don't recognize the same multibyte sequences as mbrtowc
Message-ID: <bug-272293-227@https.bugs.freebsd.org/bugzilla/>
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=272293

            Bug ID: 272293
           Summary: The mbrtoc32 and mbrtoc16 functions don't recognize
                    the same multibyte sequences as mbrtowc
           Product: Base System
           Version: 13.2-RELEASE
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Some People
          Priority: ---
         Component: misc
          Assignee: bugs@FreeBSD.org
          Reporter: bruno@clisp.org
 Attachment #243081 text/plain
         mime type:

Created attachment 243081
  --> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=243081&action=edit
test case foo.c

It is clear from ISO C 23 (description of mbrtowc: § 7.31.6.3.2, description
of mbrtoc32: § 7.30.1.5, description of mbrtoc16: § 7.30.1.3) that the notion
of a valid multibyte character is independent of which of these functions a
program uses. When a multibyte character is valid according to one of these
functions, it should be valid according to the two others as well.

This is not the case in FreeBSD 13.2.

Test case:
=============================== foo.c ============================
#include <locale.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>
#include <uchar.h>

int main ()
{
  if (setlocale (LC_ALL, "zh_CN.GB18030") != NULL)
    {
      mbstate_t state;
      wchar_t wc = (wchar_t) 0xBADFACE;
      memset (&state, '\0', sizeof (mbstate_t));
      if (mbrtowc (&wc, "\224\071\375\067", 4, &state) == 4)
        {
          printf ("mbrtowc return value = 4\n");
          {
            char32_t c32 = (char32_t) 0xBADFACE;
            memset (&state, '\0', sizeof (mbstate_t));
            size_t ret = mbrtoc32 (&c32, "\224\071\375\067", 4, &state);
            printf ("mbrtoc32 return value = %d\n", (int) ret);
          }
          {
            char16_t c16 = (char16_t) 0xBADFACE;
            memset (&state, '\0', sizeof (mbstate_t));
            size_t ret = mbrtoc16 (&c16, "\224\071\375\067", 4, &state);
            printf ("mbrtoc16 return value = %d\n", (int) ret);
          }
        }
    }
}
==========================================================================

$ cc -Wall foo.c
$ ./a.out

Expected result (e.g. as seen on glibc 2.35):

mbrtowc return value = 4
mbrtoc32 return value = 4
mbrtoc16 return value = 4

Actual result:

mbrtowc return value = 4
mbrtoc32 return value = -2
mbrtoc16 return value = -2

I think I've seen this effect also with encodings other than GB18030, but the
test case above uses GB18030.

-- 
You are receiving this mail because:
You are the assignee for the bug.
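
A note on reading the output above: the printed -2 is the return value (size_t) -2 after the cast to int. ISO C assigns that value to "incomplete but potentially valid multibyte character", so mbrtoc32 and mbrtoc16 are treating as truncated the same 4 bytes that mbrtowc accepts as one complete character. The following is a minimal sketch of how the special return values of these functions could be told apart; the enum and function names are hypothetical and not part of any libc.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper types for illustration; not part of any libc. */
enum mbr_kind
{
  MBR_INVALID,    /* (size_t) -1: encoding error, errno is set to EILSEQ */
  MBR_INCOMPLETE, /* (size_t) -2: the n bytes are an incomplete but
                     potentially valid multibyte character */
  MBR_PENDING,    /* (size_t) -3: a character from an earlier call was
                     stored, no input consumed (mbrtoc16/mbrtoc32 only) */
  MBR_NUL,        /* 0: the null character was converted */
  MBR_COMPLETE    /* 1..n: that many bytes completed a character */
};

/* Map the size_t result of mbrtowc/mbrtoc16/mbrtoc32 to a category. */
enum mbr_kind
classify_mbr_result (size_t ret)
{
  if (ret == (size_t) -1)
    return MBR_INVALID;
  if (ret == (size_t) -2)
    return MBR_INCOMPLETE;
  if (ret == (size_t) -3)
    return MBR_PENDING;
  if (ret == 0)
    return MBR_NUL;
  return MBR_COMPLETE;
}

/* Human-readable name for a category, e.g. for diagnostic output. */
const char *
mbr_kind_name (enum mbr_kind kind)
{
  switch (kind)
    {
    case MBR_INVALID:    return "invalid";
    case MBR_INCOMPLETE: return "incomplete";
    case MBR_PENDING:    return "pending";
    case MBR_NUL:        return "nul";
    case MBR_COMPLETE:   return "complete";
    }
  assert (!"unknown mbr_kind");
  return "unknown";
}
```

In the test case, passing ret to classify_mbr_result would give MBR_COMPLETE for the mbrtowc call and MBR_INCOMPLETE for the mbrtoc32 and mbrtoc16 calls on FreeBSD 13.2, which is the inconsistency being reported.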