Date:      Fri, 30 Jun 2023 13:58:00 +0000
From:      bugzilla-noreply@freebsd.org
To:        bugs@FreeBSD.org
Subject:   [Bug 272293] The mbrtoc32 and mbrtoc16 functions don't recognize the same multibyte sequences as mbrtowc
Message-ID:  <bug-272293-227@https.bugs.freebsd.org/bugzilla/>

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=272293

            Bug ID: 272293
           Summary: The mbrtoc32 and mbrtoc16 functions don't recognize
                    the same multibyte sequences as mbrtowc
           Product: Base System
           Version: 13.2-RELEASE
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Some People
          Priority: ---
         Component: misc
          Assignee: bugs@FreeBSD.org
          Reporter: bruno@clisp.org
 Attachment #243081 mime type: text/plain

Created attachment 243081
  --> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=243081&action=edit
test case foo.c

It is clear from ISO C 23 (description of mbrtowc: § 7.31.6.3.2, description of
mbrtoc32: § 7.30.1.5, description of mbrtoc16: § 7.30.1.3) that the notion of a
valid multibyte character is independent of which of these functions a program
uses. When a multibyte character is valid according to one of these functions,
it should be valid according to the other two as well.

This is not the case in FreeBSD 13.2.

Test case:
=============================== foo.c ============================
#include <locale.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>
#include <uchar.h>
int main ()
{
  if (setlocale (LC_ALL, "zh_CN.GB18030") != NULL)
    {
      mbstate_t state;
      wchar_t wc = (wchar_t) 0xBADFACE;
      memset (&state, '\0', sizeof (mbstate_t));
      if (mbrtowc (&wc, "\224\071\375\067", 4, &state) == 4)
        {
          printf ("mbrtowc return value = 4\n");
          {
            char32_t c32 = (char32_t) 0xBADFACE;
            memset (&state, '\0', sizeof (mbstate_t));
            size_t ret = mbrtoc32 (&c32, "\224\071\375\067", 4, &state);
            printf ("mbrtoc32 return value = %d\n", (int) ret);
          }
          {
            char16_t c16 = (char16_t) 0xBADFACE;
            memset (&state, '\0', sizeof (mbstate_t));
            size_t ret = mbrtoc16 (&c16, "\224\071\375\067", 4, &state);
            printf ("mbrtoc16 return value = %d\n", (int) ret);
          }
        }
    }
}
==========================================================================
$ cc -Wall foo.c
$ ./a.out

Expected result (e.g. as seen on glibc 2.35):
mbrtowc return value = 4
mbrtoc32 return value = 4
mbrtoc16 return value = 4

Actual result:
mbrtowc return value = 4
mbrtoc32 return value = -2
mbrtoc16 return value = -2
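
For reading the numbers above: the three functions share the special return
values defined by ISO C. The following is only a sketch of that mapping; the
helper name describe_ret is hypothetical and not part of foo.c.

```c
#include <stddef.h>

/* Hypothetical helper (not part of foo.c): explain a return value of
   mbrtowc, mbrtoc16 or mbrtoc32 as specified by ISO C.  */
const char *
describe_ret (size_t ret)
{
  if (ret == (size_t) -1)
    return "encoding error (errno is set to EILSEQ)";
  if (ret == (size_t) -2)
    return "incomplete but potentially valid multibyte character";
  if (ret == (size_t) -3)
    return "character stored from a previous call (mbrtoc16/mbrtoc32 only)";
  if (ret == 0)
    return "null wide character";
  return "complete multibyte character of this many bytes";
}
```

Read this way, the -2 in the actual result means that mbrtoc32 and mbrtoc16
treat the four bytes as an incomplete character, although mbrtowc accepts the
same four bytes as a complete one.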

I think I have also seen this effect with encodings other than GB18030, but the
test case above uses GB18030.
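
To check other encodings, the comparison from the test case could be wrapped
in a small function. This is a rough sketch, not a tested reproducer; the name
consistent_in_locale is hypothetical, and which locales and byte sequences to
try is left to the caller.

```c
#include <locale.h>
#include <string.h>
#include <uchar.h>
#include <wchar.h>

/* Hypothetical helper (not part of foo.c).  Switches LC_CTYPE to
   LOCALE_NAME and converts the N bytes at BYTES with mbrtowc, mbrtoc32
   and mbrtoc16, each starting from the initial conversion state.
   Returns 1 if all three return the same value, 0 if they differ,
   and -1 if the locale is not available.  The process is left in the
   new locale.  */
int
consistent_in_locale (const char *locale_name, const char *bytes, size_t n)
{
  if (setlocale (LC_CTYPE, locale_name) == NULL)
    return -1;

  mbstate_t state;
  wchar_t wc;
  char32_t c32;
  char16_t c16;

  memset (&state, '\0', sizeof (mbstate_t));
  size_t ret_wc = mbrtowc (&wc, bytes, n, &state);

  memset (&state, '\0', sizeof (mbstate_t));
  size_t ret_32 = mbrtoc32 (&c32, bytes, n, &state);

  memset (&state, '\0', sizeof (mbstate_t));
  size_t ret_16 = mbrtoc16 (&c16, bytes, n, &state);

  return ret_wc == ret_32 && ret_wc == ret_16;
}
```

Given the results reported above, a call such as
consistent_in_locale ("zh_CN.GB18030", "\224\071\375\067", 4) would be expected
to return 0 on FreeBSD 13.2 and 1 on glibc 2.35, provided the locale is
installed.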

-- 
You are receiving this mail because:
You are the assignee for the bug.


