Message-Id: <20010630201448.D4C8F3E32@berkeley.us.and.or.jp>
Date: Sun, 1 Jul 2001 05:14:48 +0900 (JST)
From: sa2c@and.or.jp
Reply-To: sa2c@and.or.jp
To: FreeBSD-gnats-submit@freebsd.org
X-Send-Pr-Version: 3.113
Subject: bin/28552: EUC support of wcstombs(3) is broken for codeset 3 and 4

>Number:         28552
>Category:       bin
>Synopsis:       EUC support of wcstombs(3) is broken for codeset 3 and 4
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          open
>Quarter:
>Keywords:
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sat Jun 30 13:20:01 PDT 2001
>Closed-Date:
>Last-Modified:
>Originator:     NIIMI Satoshi
>Release:        FreeBSD 4.3-STABLE i386
>Organization:
>Environment:
System: FreeBSD berkeley.us.and.or.jp 4.3-STABLE FreeBSD 4.3-STABLE #2: Thu Jun 21 18:28:33 JST 2001 sa2c@berkeley.us.and.or.jp:/usr/obj/usr/src/sys/BERKELEY i386

>Description:
wcstombs(3) converts wide characters to multibyte characters incorrectly
when a character belongs to codeset 3 or codeset 4 of an EUC encoding.
The resulting multibyte characters do not conform to the EUC specification.

>How-To-Repeat:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <locale.h>

/* multibyte Japanese EUC characters */
const unsigned char teststr[] = {
	0x41, 0xa4, 0xa2, 0x8e, 0xb1, 0x8f, 0xb0, 0xa1, 0
};

/* expected wide characters of above */
const wchar_t w_teststr[] = {
	0x0041, 0xa4a2, 0x00b1, 0xb021, 0
};

void
dumpmbs(const char *prompt, const unsigned char *p)
{
	int c;

	printf("%s: ", prompt);
	do {
		c = *p++;
		printf("[%02x]", c);
	} while (c != 0);
	putchar('\n');
}

void
dumpwcs(const char *prompt, const wchar_t *wp)
{
	wchar_t wc;

	printf("%s: ", prompt);
	do {
		wc = *wp++;
		printf("[%04x]", wc);
	} while (wc != 0);
	putchar('\n');
}

int
main(int argc, char **argv)
{
	unsigned char buf[BUFSIZ];
	wchar_t wbuf[BUFSIZ];

	setlocale(LC_CTYPE, "ja_JP.EUC");
	strncpy(buf, teststr, sizeof(buf) - 1);
	buf[sizeof(buf) - 1] = '\0';
	dumpmbs("mbs", teststr);
	dumpwcs("wcs", w_teststr);

	mbstowcs(wbuf, teststr, BUFSIZ);
	dumpwcs("mbs->wcs", wbuf);

	wcstombs(buf, w_teststr, BUFSIZ);
	dumpmbs("wcs->mbs", buf);

	mbstowcs(wbuf, teststr, BUFSIZ);
	wcstombs(buf, wbuf, BUFSIZ);
	dumpmbs("mbs->wcs->mbs", buf);

	wcstombs(buf, w_teststr, BUFSIZ);
	mbstowcs(wbuf, buf, BUFSIZ);
	dumpwcs("wcs->mbs->wcs", wbuf);

	return 0;
}

>Fix:
Index: euc.c
===================================================================
RCS file: /home/ncvs/src/lib/libc/locale/euc.c,v
retrieving revision 1.3.6.1
diff -u -u -r1.3.6.1 euc.c
--- euc.c	2000/06/04 21:47:39	1.3.6.1
+++ euc.c	2001/06/30 19:47:16
@@ -123,6 +123,8 @@
 #define	_SS2	0x008e
 #define	_SS3	0x008f
 
+#define	GR_BITS	0x80808080 /* XXX: to be fixed */
+
 static inline int
 _euc_set(c)
 	u_int c;
@@ -202,6 +204,8 @@
 			}
 			*string++ = _SS2;
 			--i;
+			/* SS2 designates G2 into GR */
+			nm |= GR_BITS;
 		} else
 			if (m == CEI->bits[3]) {
 				i = len = CEI->count[3];
@@ -212,6 +216,8 @@
 				}
 				*string++ = _SS3;
 				--i;
+				/* SS3 designates G3 into GR */
+				nm |= GR_BITS;
 			} else
 				goto CodeSet1;	/* Bletch */
 			while (i-- > 0)

>Release-Note:
>Audit-Trail:
>Unformatted:
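For illustration only, the byte layout that the patch above aims for can be sketched as a standalone function. This is not the libc code: euc_jp_encode() is a hypothetical helper, and the use of the 0x8080 bits to select the codeset is an assumption inferred from the test vectors in How-To-Repeat (0xa4a2, 0x00b1, 0xb021), not taken from euc.c or the locale definition. It only handles 16-bit values of that form.

```c
#include <stddef.h>

#define SS2	0x8e	/* single shift 2: introduces a codeset 3 character */
#define SS3	0x8f	/* single shift 3: introduces a codeset 4 character */

/*
 * Hypothetical helper (not part of libc): encode one wide character,
 * packed as in the test vectors, into EUC-JP bytes.  "out" must have
 * room for at least 3 bytes.  Returns the number of bytes written.
 */
static size_t
euc_jp_encode(unsigned int wc, unsigned char *out)
{
	size_t n = 0;

	switch (wc & 0x8080) {		/* assumed codeset selector bits */
	case 0x0000:			/* ASCII: one byte, high bit clear */
		out[n++] = wc & 0x7f;
		break;
	case 0x8080:			/* two bytes, both with high bit set */
		out[n++] = (wc >> 8) & 0xff;
		out[n++] = wc & 0xff;
		break;
	case 0x0080:			/* SS2 followed by one byte */
		out[n++] = SS2;
		out[n++] = (wc & 0xff) | 0x80;
		break;
	case 0x8000:			/* SS3 followed by two bytes */
		out[n++] = SS3;
		/* every byte after SS3 must have its high bit set */
		out[n++] = ((wc >> 8) & 0xff) | 0x80;
		out[n++] = (wc & 0xff) | 0x80;
		break;
	}
	return n;
}
```

With this sketch, 0xb021 encodes as 8f b0 a1 and 0x00b1 as 8e b1, which matches the bytes in teststr. Leaving out the "| 0x80" step in the last case would emit 0x21 as the final byte, which is not a valid EUC trailing byte.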