From owner-freebsd-i18n Sun Aug 11 11:18:36 2002
Delivered-To: freebsd-i18n@freebsd.org
Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 9A96137B406; Sun, 11 Aug 2002 11:18:21 -0700 (PDT)
Received: from portege.clkao.org (61-223-31-91.HINET-IP.hinet.net [61.223.31.91]) by mx1.FreeBSD.org (Postfix) with ESMTP id CAA3543E3B; Sun, 11 Aug 2002 11:18:19 -0700 (PDT) (envelope-from clkao@portege.clkao.org)
Received: by portege.clkao.org (Postfix, from userid 1000) id EBEE8B85; Mon, 12 Aug 2002 02:18:05 +0800 (CST)
Date: Mon, 12 Aug 2002 02:18:05 +0800
From: Chia-liang Kao
To: standards@freebsd.org, i18n@freebsd.org
Cc: keichii@freebsd.org, tjr@freebsd.org
Subject: wcwidth and mklocale
Message-ID: <20020811181805.GA1809@portege.clkao.org>
Mime-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="4SFOXa2GPu3tIq4H"
Content-Disposition: inline
User-Agent: Mutt/1.5.1i
Sender: owner-freebsd-i18n@FreeBSD.ORG
Precedence: bulk
List-ID:
List-Archive: (Web Archive)
List-Help: (List Instructions)
List-Subscribe:
List-Unsubscribe:
X-Loop: FreeBSD.ORG

--4SFOXa2GPu3tIq4H
Content-Type: multipart/mixed; boundary="jRHKVT23PllUwdXP"
Content-Disposition: inline

--jRHKVT23PllUwdXP
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

attached is a patch implementing SWIDTH[0-3] in mklocale and wcwidth in
libc. it is obtained from netbsd/citrus code.

the share_mklocale files shall be updated with the SWIDTH info as the
ASCII one in the patch.

should there be a default value for SWIDTH in each locale file? should
we maintain the binary file format compatibility? I think ache is
making the loading of locale files graceful.

Cheers,
CLK

--jRHKVT23PllUwdXP
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="swidth.diff"
Content-Transfer-Encoding: quoted-printable

Index: include/ctype.h
===================================================================
RCS file: /home/ncvs/src/include/ctype.h,v
retrieving revision 1.18
diff -u -r1.18 ctype.h
--- include/ctype.h 23 Mar 2002 17:24:53 -0000 1.18
+++ include/ctype.h 11 Aug 2002 18:06:39 -0000
@@ -65,6 +65,12 @@
 #define _CTYPE_I 0x00080000L /* Ideogram */
 #define _CTYPE_T 0x00100000L /* Special */
 #define _CTYPE_Q 0x00200000L /* Phonogram */
+#define _CTYPE_SWM 0xc0000000L /* Mask to get screen width data */
+#define _CTYPE_SWS 30 /* Bits to shift to get width */
+#define _CTYPE_SW0 0x00000000L /* 0 width character */
+#define _CTYPE_SW1 0x40000000L /* 1 width character */
+#define _CTYPE_SW2 0x80000000L /* 2 width character */
+#define _CTYPE_SW3 0xc0000000L /* 3 width character */

 __BEGIN_DECLS
 int isalnum(int);
Index: mklocale/lex.l
===================================================================
RCS file: /home/ncvs/src/usr.bin/mklocale/lex.l,v
retrieving revision 1.6
diff -u -r1.6 lex.l
--- mklocale/lex.l 28 Apr 2002 12:34:54 -0000 1.6
+++ mklocale/lex.l 11 Aug 2002 17:07:56 -0000
@@ -118,6 +118,10 @@
 	return(LIST); }
 PHONOGRAM { yylval.i = _CTYPE_Q|_CTYPE_R|_CTYPE_G;
 	return(LIST); }
+SWIDTH0 { yylval.i = _CTYPE_SW0; return(LIST); }
+SWIDTH1 { yylval.i = _CTYPE_SW1; return(LIST); }
+SWIDTH2 { yylval.i = _CTYPE_SW2; return(LIST); }
+SWIDTH3 { yylval.i = _CTYPE_SW3; return(LIST); }

 VARIABLE[\t ] { static char vbuf[1024];
 	char *v = vbuf;
Index: libc/locale/iswctype.c
===================================================================
RCS file: /home/ncvs/src/lib/libc/locale/iswctype.c,v
retrieving revision 1.1
diff -u -r1.1 iswctype.c
--- libc/locale/iswctype.c 5 Aug 2002 10:45:23 -0000 1.1
+++ libc/locale/iswctype.c 11 Aug 2002 18:02:20 -0000
@@ -211,3 +211,12 @@
 {
 	return (__toupper(wc));
 }
+
+#undef wcwidth
+int
+wcwidth(wc)
+	wchar_t wc;
+{
+	return ((unsigned)__maskrune((wc), _CTYPE_SWM) >> _CTYPE_SWS);
+}
+
Index: share_mklocale//la_LN.US-ASCII.src
===================================================================
RCS file: /home/ncvs/src/share/mklocale/la_LN.US-ASCII.src,v
retrieving revision 1.2
diff -u -r1.2 la_LN.US-ASCII.src
--- share_mklocale//la_LN.US-ASCII.src 30 Nov 2001 05:05:53 -0000 1.2
+++ share_mklocale//la_LN.US-ASCII.src 11 Aug 2002 17:38:18 -0000
@@ -17,6 +17,7 @@
 XDIGIT '0' - '9' 'a' - 'f' 'A' - 'F'
 BLANK ' ' '\t'
 PRINT 0x20 - 0x7e
+SWIDTH1 0x20 - 0x7e

 MAPLOWER <'A' - 'Z' : 'a'>
 MAPLOWER <'a' - 'z' : 'a'>

--jRHKVT23PllUwdXP--

--4SFOXa2GPu3tIq4H
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.6 (FreeBSD)
Comment: For info see http://www.gnupg.org

iD8DBQE9Vqpdk1XldlEkA5YRAo/OAKCCtBkQnkI5lwXnWCxenyzxP41tQQCcDVex
SBL9ueMLluk83Iq/FlkBBgk=
=GOvH
-----END PGP SIGNATURE-----

--4SFOXa2GPu3tIq4H--

To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-i18n" in the body of the message

From owner-freebsd-i18n Mon Aug 12 0:42:22 2002
Delivered-To: freebsd-i18n@freebsd.org
Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id F3FAB37B400; Mon, 12 Aug 2002 00:42:14 -0700 (PDT)
Received: from
nuit.iteration.net (nuit.iteration.net [198.92.249.80]) by mx1.FreeBSD.org (Postfix) with ESMTP id 8E7CF43E65; Mon, 12 Aug 2002 00:42:14 -0700 (PDT) (envelope-from keichii@nuit.iteration.net)
Received: by nuit.iteration.net (Postfix, from userid 1001) id E03D311B5E6; Mon, 12 Aug 2002 00:40:04 -0700 (PDT)
Date: Mon, 12 Aug 2002 00:40:04 -0700
From: "Michael C. Wu"
To: Chia-liang Kao
Cc: standards@freebsd.org, i18n@freebsd.org, keichii@freebsd.org, tjr@freebsd.org
Subject: Re: wcwidth and mklocale
Message-ID: <20020812074004.GA73751@nuit.iteration.net>
Reply-To: "Michael C. Wu"
References: <20020811181805.GA1809@portege.clkao.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20020811181805.GA1809@portege.clkao.org>
User-Agent: Mutt/1.3.28i
X-PGP-Fingerprint: 5025 F691 F943 8128 48A8 5025 77CE 29C5 8FA1 2E20
X-PGP-Key-ID: 0x8FA12E20
Sender: owner-freebsd-i18n@FreeBSD.ORG
Precedence: bulk
List-ID:
List-Archive: (Web Archive)
List-Help: (List Instructions)
List-Subscribe:
List-Unsubscribe:
X-Loop: FreeBSD.ORG

I will commit this in three days, pending discussion (or no discussion).

Thanks,
Michael

On Mon, Aug 12, 2002 at 02:18:05AM +0800, Chia-liang Kao scribbled:
| attached is a patch implementing SWIDTH[0-3] in mklocale and wcwidth in
| libc. it is obtained from netbsd/citrus code.
|
| the share_mklocale files shall be updated with the SWIDTH info as the
| ASCII one in the patch.
|
| should there be a default value for SWIDTH in each locale file? should
| we maintain the binary file format compatibility? I think ache is
| making the loading of locale files graceful.
|
| Cheers,
| CLK

-------------------------
To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-i18n" in the body of the message

From owner-freebsd-i18n Mon Aug 12 0:51:35 2002
Delivered-To: freebsd-i18n@freebsd.org
Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 39C0437B400; Mon, 12 Aug 2002 00:51:27 -0700 (PDT)
Received: from nuit.iteration.net (nuit.iteration.net [198.92.249.80]) by mx1.FreeBSD.org (Postfix) with ESMTP id C729243E65; Mon, 12 Aug 2002 00:51:26 -0700 (PDT) (envelope-from keichii@nuit.iteration.net)
Received: by nuit.iteration.net (Postfix, from userid 1001) id 248DA11B5AE; Mon, 12 Aug 2002 00:49:32 -0700 (PDT)
Date: Mon, 12 Aug 2002 00:49:32 -0700
From: "Michael C. Wu"
To: Chia-liang Kao
Cc: standards@freebsd.org, i18n@freebsd.org, keichii@freebsd.org, tjr@freebsd.org
Subject: Re: wcwidth and mklocale
Message-ID: <20020812074932.GB73751@nuit.iteration.net>
Reply-To: "Michael C.
Wu"
References: <20020811181805.GA1809@portege.clkao.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20020811181805.GA1809@portege.clkao.org>
User-Agent: Mutt/1.3.28i
X-PGP-Fingerprint: 5025 F691 F943 8128 48A8 5025 77CE 29C5 8FA1 2E20
X-PGP-Key-ID: 0x8FA12E20
Sender: owner-freebsd-i18n@FreeBSD.ORG
Precedence: bulk
List-ID:
List-Archive: (Web Archive)
List-Help: (List Instructions)
List-Subscribe:
List-Unsubscribe:
X-Loop: FreeBSD.ORG

On second thought, since we have a major change upcoming, perhaps
I will commit this tomorrow to allow for more testing.
(No one uses this stuff currently anyways.)

On Mon, Aug 12, 2002 at 02:18:05AM +0800, Chia-liang Kao scribbled:
| attached is a patch implementing SWIDTH[0-3] in mklocale and wcwidth in
| libc. it is obtained from netbsd/citrus code.
|
| the share_mklocale files shall be updated with the SWIDTH info as the
| ASCII one in the patch.
|
| should there be a default value for SWIDTH in each locale file? should
| we maintain the binary file format compatibility? I think ache is
| making the loading of locale files graceful.
|
| Cheers,
| CLK

-------------------------
To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-i18n" in the body of the message

From owner-freebsd-i18n Mon Aug 12 0:56:38 2002
Delivered-To: freebsd-i18n@freebsd.org
Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id D410D37B400; Mon, 12 Aug 2002 00:56:35 -0700 (PDT)
Received: from dilbert.robbins.dropbear.id.au (093.b.011.mel.iprimus.net.au [210.50.217.93]) by mx1.FreeBSD.org (Postfix) with ESMTP id D7C9A43E4A; Mon, 12 Aug 2002 00:56:33 -0700 (PDT) (envelope-from tim@robbins.dropbear.id.au)
Received: from dilbert.robbins.dropbear.id.au (fidenxr69tdaatey@localhost [127.0.0.1]) by dilbert.robbins.dropbear.id.au (8.12.3/8.12.3) with ESMTP id g7C7uQ20015408; Mon, 12 Aug 2002 17:56:26 +1000 (EST) (envelope-from tim@dilbert.robbins.dropbear.id.au)
Received: (from tim@localhost) by dilbert.robbins.dropbear.id.au (8.12.3/8.12.3/Submit) id g7C7uPDK015407; Mon, 12 Aug 2002 17:56:25 +1000 (EST)
Date: Mon, 12 Aug 2002 17:56:23 +1000
From: Tim Robbins
To: Chia-liang Kao
Cc: standards@FreeBSD.ORG, i18n@FreeBSD.ORG
Subject: Re: wcwidth and mklocale
Message-ID: <20020812175623.A14727@dilbert.robbins.dropbear.id.au>
References: <20020811181805.GA1809@portege.clkao.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition:
inline
User-Agent: Mutt/1.2.5.1i
In-Reply-To: <20020811181805.GA1809@portege.clkao.org>; from clkao@clkao.org on Mon, Aug 12, 2002 at 02:18:05AM +0800
Sender: owner-freebsd-i18n@FreeBSD.ORG
Precedence: bulk
List-ID:
List-Archive: (Web Archive)
List-Help: (List Instructions)
List-Subscribe:
List-Unsubscribe:
X-Loop: FreeBSD.ORG

On Mon, Aug 12, 2002 at 02:18:05AM +0800, Chia-liang Kao wrote:
> attached is a patch implementing SWIDTH[0-3] in mklocale and wcwidth in
> libc. it is obtained from netbsd/citrus code.
...

Thanks. I'll get to work on the wcwidth stuff as soon as I've finished
testing and have committed the wide char. stdio code.

Tim

To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-i18n" in the body of the message

From owner-freebsd-i18n Mon Aug 12 3:29: 2 2002
Delivered-To: freebsd-i18n@freebsd.org
Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 8226D37B400; Mon, 12 Aug 2002 03:28:57 -0700 (PDT)
Received: from nagual.pp.ru (pobrecita.freebsd.ru [194.87.13.42]) by mx1.FreeBSD.org (Postfix) with ESMTP id 7D44843E5E; Mon, 12 Aug 2002 03:28:56 -0700 (PDT) (envelope-from ache@pobrecita.freebsd.ru)
Received: from pobrecita.freebsd.ru (ache@localhost [127.0.0.1]) by nagual.pp.ru (8.12.5/8.12.5) with ESMTP id g7CASeTq001457; Mon, 12 Aug 2002 14:28:41 +0400 (MSD) (envelope-from ache@pobrecita.freebsd.ru)
Received: (from ache@localhost) by pobrecita.freebsd.ru (8.12.5/8.12.5/Submit) id g7CAScs0001455; Mon, 12 Aug 2002 14:28:38 +0400 (MSD) (envelope-from ache)
Date: Mon, 12 Aug 2002 14:28:36 +0400
From: "Andrey A.
Chernov"
To: Chia-liang Kao
Cc: standards@FreeBSD.ORG, i18n@FreeBSD.ORG, keichii@FreeBSD.ORG, tjr@FreeBSD.ORG
Subject: Re: wcwidth and mklocale
Message-ID: <20020812102835.GA1288@nagual.pp.ru>
References: <20020811181805.GA1809@portege.clkao.org>
Mime-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-md5; protocol="application/pgp-signature"; boundary="45Z9DzgjV8m4Oswq"
Content-Disposition: inline
In-Reply-To: <20020811181805.GA1809@portege.clkao.org>
User-Agent: Mutt/1.5.1i
Sender: owner-freebsd-i18n@FreeBSD.ORG
Precedence: bulk
List-ID:
List-Archive: (Web Archive)
List-Help: (List Instructions)
List-Subscribe:
List-Unsubscribe:
X-Loop: FreeBSD.ORG

--45Z9DzgjV8m4Oswq
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Mon, Aug 12, 2002 at 02:18:05 +0800, Chia-liang Kao wrote:
> the share_mklocale files shall be updated with the SWIDTH info as the
> ASCII one in the patch.

Not exactly simple, 0x20 - 0x7e range is for ASCII only, other single byte
charsets must match their PRINT.

> should there be a default value for SWIDTH in each locale file?

If there is strong demand to run -current programs which use this feature
on -stable, we'll need default as SWIDTH1 and switch defines:

#define _CTYPE_SW0 0x40000000L
#define _CTYPE_SW1 0x00000000L

(0 from old locale will be SWIDTH1). But in that case we need to specify
SWIDTH0 in single byte charsets files, not SWIDTH1 only as in your
example, or both. It looks a bit ugly, so if there is no such demand,
better to avoid it.

> should we maintain the binary file format compatibility?

File format is not changed, as I see. Isn't it?

--
Andrey A. Chernov
http://ache.pp.ru/

--45Z9DzgjV8m4Oswq
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.7 (FreeBSD)

iQCVAwUBPVeN0+JgpPLZnQjrAQHZ6gQApngmUgyKGFxkYbOrN1yh1EF+wmpOU0N/
2XycOC5CgvRCjoQeU4U0tw/glJ5ODToIAeYm9zjotZGaZLV+ymvP39RM1HwXwBwy
947gsVTBN5cltStWECTTL0hgmmqDY+enryZsYDSXHMT9by7RvtO6DaSDEMOZuBgZ
wsIwW1L06EM=
=BCsR
-----END PGP SIGNATURE-----

--45Z9DzgjV8m4Oswq--

To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-i18n" in the body of the message

From owner-freebsd-i18n Mon Aug 12 21:47:27 2002
Delivered-To: freebsd-i18n@freebsd.org
Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 6564F37B401; Mon, 12 Aug 2002 21:47:21 -0700 (PDT)
Received: from portege.clkao.org (if88.mc.ntu.edu.tw [140.112.125.88]) by mx1.FreeBSD.org (Postfix) with ESMTP id 91ED343E3B; Mon, 12 Aug 2002 21:47:20 -0700 (PDT) (envelope-from clkao@portege.clkao.org)
Received: by portege.clkao.org (Postfix, from userid 1000) id 35C38B82; Tue, 13 Aug 2002 11:50:10 +0800 (CST)
Date: Tue, 13 Aug 2002 11:50:10 +0800
From: Chia-liang Kao
To: "Andrey A.
Chernov"
Cc: standards@FreeBSD.ORG, i18n@FreeBSD.ORG, keichii@FreeBSD.ORG, tjr@FreeBSD.ORG
Subject: Re: wcwidth and mklocale
Message-ID: <20020813035009.GA1084@portege.clkao.org>
References: <20020811181805.GA1809@portege.clkao.org> <20020812102835.GA1288@nagual.pp.ru>
Mime-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="bg08WKrSYDhXBjb5"
Content-Disposition: inline
In-Reply-To: <20020812102835.GA1288@nagual.pp.ru>
User-Agent: Mutt/1.5.1i
Sender: owner-freebsd-i18n@FreeBSD.ORG
Precedence: bulk
List-ID:
List-Archive: (Web Archive)
List-Help: (List Instructions)
List-Subscribe:
List-Unsubscribe:
X-Loop: FreeBSD.ORG

--bg08WKrSYDhXBjb5
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Mon, Aug 12, 2002 at 02:28:36PM +0400, Andrey A. Chernov wrote:
> On Mon, Aug 12, 2002 at 02:18:05 +0800, Chia-liang Kao wrote:
> > the share_mklocale files shall be updated with the SWIDTH info as the
> > ASCII one in the patch.
>
> Not exactly simple, 0x20 - 0x7e range is for ASCII only, other single byte
> charsets must match their PRINT.
>
> > should there be a default value for SWIDTH in each locale file?
>
> If there is strong demand to run -current programs which use this feature
> on -stable, we'll need default as SWIDTH1 and switch defines:
>
> #define _CTYPE_SW0 0x40000000L
> #define _CTYPE_SW1 0x00000000L

agreed to make swidth1 default this way. but this would make wcwidth
slightly more complicated than the current simple shift. how about we have:

#define _CTYPE_SWM 0xe0000000L /* Mask to get screen width data */
#define _CTYPE_SWS 30 /* Bits to shift to get width */
#define _CTYPE_SW1 0x00000000L /* 1 width character / default */
#define _CTYPE_SW0 0x20000000L /* 0 width character */
#define _CTYPE_SW2 0x80000000L /* 2 width character */
#define _CTYPE_SW3 0xc0000000L /* 3 width character */

> (0 from old locale will be SWIDTH1). But in that case we need to specify
> SWIDTH0 in single byte charsets files, not SWIDTH1 only as in your
> example, or both. It looks a bit ugly, so if there is no such demand,
> better to avoid it.

and wcwidth checks _CTYPE_R (printable) first, checks if masked sw is 0
(default, return 1), then calculates the width by shift. so the locale
source would be less ugly and we don't need to touch single byte
charset locales. if people agree on this i'll cook a new patch.

> > should we maintain the binary file format compatibility?
> File format is not changed, as I see. Isn't it?

right, i was thinking about adding a field for storing default. but this
way seems easier :-)

Cheers,
CLK

--bg08WKrSYDhXBjb5
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.6 (FreeBSD)
Comment: For info see http://www.gnupg.org

iD8DBQE9WIHxk1XldlEkA5YRAqRTAJ9SuGt/jLkCtlcU4ycaCdt2GhdYewCfbHev
mXldLYDWx7TV+BqhPbo2ZZ8=
=7jWF
-----END PGP SIGNATURE-----

--bg08WKrSYDhXBjb5--

To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-i18n" in the body of the message

From owner-freebsd-i18n Thu Aug 15 8:50: 3 2002
Delivered-To: freebsd-i18n@freebsd.org
Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 2F7AB37B400; Thu, 15 Aug 2002 08:49:54 -0700 (PDT)
Received: from portege.clkao.org (61-223-25-135.HINET-IP.hinet.net [61.223.25.135]) by mx1.FreeBSD.org (Postfix) with ESMTP id 6D41B43E6E; Thu, 15 Aug 2002 08:49:52 -0700 (PDT) (envelope-from clkao@portege.clkao.org)
Received: by portege.clkao.org (Postfix, from userid 1000) id 6BB6EBEE; Thu, 15 Aug 2002 23:49:16 +0800 (CST)
Date: Thu, 15 Aug 2002 23:49:16 +0800
From: Chia-liang Kao
To: "Andrey A.
Chernov" , standards@FreeBSD.ORG, i18n@FreeBSD.ORG
Cc: keichii@FreeBSD.ORG, tjr@FreeBSD.ORG
Subject: Re: wcwidth and mklocale
Message-ID: <20020815154915.GA9607@portege.clkao.org>
References: <20020811181805.GA1809@portege.clkao.org> <20020812102835.GA1288@nagual.pp.ru> <20020813035009.GA1084@portege.clkao.org>
Mime-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="5G06lTa6Jq83wMTw"
Content-Disposition: inline
In-Reply-To: <20020813035009.GA1084@portege.clkao.org>
User-Agent: Mutt/1.5.1i
Sender: owner-freebsd-i18n@FreeBSD.ORG
Precedence: bulk
List-ID:
List-Archive: (Web Archive)
List-Help: (List Instructions)
List-Subscribe:
List-Unsubscribe:
X-Loop: FreeBSD.ORG

--5G06lTa6Jq83wMTw
Content-Type: multipart/mixed; boundary="Bn2rw/3z4jIqBvZU"
Content-Disposition: inline

--Bn2rw/3z4jIqBvZU
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

attached is the revised patch that treats printable wchar without
width specified as SWIDTH_1.

Cheers,
CLK

--Bn2rw/3z4jIqBvZU
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="swidth.diff"
Content-Transfer-Encoding: quoted-printable

Index: include/ctype.h
===================================================================
RCS file: /home/ncvs/src/include/ctype.h,v
retrieving revision 1.18
diff -u -r1.18 ctype.h
--- ctype.h 23 Mar 2002 17:24:53 -0000 1.18
+++ ctype.h 15 Aug 2002 15:31:33 -0000
@@ -65,6 +65,12 @@
 #define _CTYPE_I 0x00080000L /* Ideogram */
 #define _CTYPE_T 0x00100000L /* Special */
 #define _CTYPE_Q 0x00200000L /* Phonogram */
+#define _CTYPE_SWM 0xe0000000L /* Mask to get screen width data */
+#define _CTYPE_SWS 30 /* Bits to shift to get width */
+#define _CTYPE_SW0 0x20000000L /* 0 width character */
+#define _CTYPE_SW1 0x00000000L /* 1 width character / default*/
+#define _CTYPE_SW2 0x80000000L /* 2 width character */
+#define _CTYPE_SW3 0xc0000000L /* 3 width character */

 __BEGIN_DECLS
 int isalnum(int);
Index: lib/libc/locale/iswctype.c
===================================================================
RCS file: /home/ncvs/src/lib/libc/locale/iswctype.c,v
retrieving revision 1.1
diff -u -r1.1 iswctype.c
--- iswctype.c 5 Aug 2002 10:45:23 -0000 1.1
+++ iswctype.c 15 Aug 2002 15:43:19 -0000
@@ -211,3 +211,13 @@
 {
 	return (__toupper(wc));
 }
+
+#undef wcwidth
+int
+wcwidth(wc)
+	wchar_t wc;
+{
+	int width = (unsigned)__maskrune((wc), _CTYPE_SWM) >> _CTYPE_SWS;
+	return width ? width : isprint(wc);
+}
+
Index: usr.bin/mklocale/lex.l
===================================================================
RCS file: /home/ncvs/src/usr.bin/mklocale/lex.l,v
retrieving revision 1.6
diff -u -r1.6 lex.l
--- lex.l 28 Apr 2002 12:34:54 -0000 1.6
+++ lex.l 11 Aug 2002 17:07:56 -0000
@@ -118,6 +118,10 @@
 	return(LIST); }
 PHONOGRAM { yylval.i = _CTYPE_Q|_CTYPE_R|_CTYPE_G;
 	return(LIST); }
+SWIDTH0 { yylval.i = _CTYPE_SW0; return(LIST); }
+SWIDTH1 { yylval.i = _CTYPE_SW1; return(LIST); }
+SWIDTH2 { yylval.i = _CTYPE_SW2; return(LIST); }
+SWIDTH3 { yylval.i = _CTYPE_SW3; return(LIST); }

 VARIABLE[\t ] { static char vbuf[1024];
 	char *v = vbuf;

--Bn2rw/3z4jIqBvZU--

--5G06lTa6Jq83wMTw
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.6 (FreeBSD)
Comment: For info see http://www.gnupg.org

iD8DBQE9W817k1XldlEkA5YRAs6OAJ4gVkrRdcT4qhxIH4PpTsgLVzCUEgCdHa2d
vDrI0eg8z0sDJUxYRvJfTC4=
=6F6u
-----END PGP SIGNATURE-----

--5G06lTa6Jq83wMTw--

To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-i18n" in the body of the message

From owner-freebsd-i18n Thu Aug 15 12:50:17 2002
Delivered-To: freebsd-i18n@freebsd.org
Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 7720D37B400; Thu, 15 Aug 2002 12:50:13 -0700 (PDT)
Received: from nagual.pp.ru (pobrecita.freebsd.ru [194.87.13.42]) by mx1.FreeBSD.org (Postfix) with ESMTP id 5091643E3B; Thu, 15 Aug 2002 12:50:12 -0700 (PDT) (envelope-from ache@pobrecita.freebsd.ru)
Received: from pobrecita.freebsd.ru (ache@localhost [127.0.0.1]) by nagual.pp.ru (8.12.5/8.12.5) with ESMTP id g7FJnnup009824; Thu, 15 Aug 2002 23:49:54 +0400 (MSD) (envelope-from ache@pobrecita.freebsd.ru)
Received: (from ache@localhost) by pobrecita.freebsd.ru
(8.12.5/8.12.5/Submit) id g7FJnjRW009823; Thu, 15 Aug 2002 23:49:46 +0400 (MSD) (envelope-from ache)
Date: Thu, 15 Aug 2002 23:49:41 +0400
From: "Andrey A. Chernov"
To: Chia-liang Kao
Cc: standards@FreeBSD.ORG, i18n@FreeBSD.ORG, keichii@FreeBSD.ORG, tjr@FreeBSD.ORG
Subject: Re: wcwidth and mklocale
Message-ID: <20020815194941.GA9773@nagual.pp.ru>
References: <20020811181805.GA1809@portege.clkao.org> <20020812102835.GA1288@nagual.pp.ru> <20020813035009.GA1084@portege.clkao.org> <20020815154915.GA9607@portege.clkao.org>
Mime-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-md5; protocol="application/pgp-signature"; boundary="huq684BweRXVnRxX"
Content-Disposition: inline
In-Reply-To: <20020815154915.GA9607@portege.clkao.org>
User-Agent: Mutt/1.5.1i
Sender: owner-freebsd-i18n@FreeBSD.ORG
Precedence: bulk
List-ID:
List-Archive: (Web Archive)
List-Help: (List Instructions)
List-Subscribe:
List-Unsubscribe:
X-Loop: FreeBSD.ORG

--huq684BweRXVnRxX
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Thu, Aug 15, 2002 at 23:49:16 +0800, Chia-liang Kao wrote:
> +wcwidth(wc)
> +	wchar_t wc;
> +{
> +	int width = (unsigned)__maskrune((wc), _CTYPE_SWM) >> _CTYPE_SWS;
> +	return width ? width : isprint(wc);

Should it be iswprint() instead here? isprint() is not supposed to be
called directly on wchar_t.

--
Andrey A. Chernov
http://ache.pp.ru/

--huq684BweRXVnRxX
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.7 (FreeBSD)

iQCVAwUBPVwF1OJgpPLZnQjrAQH00AQAmsyEKHfsBoXDzBCqfpPy3cSc47GdYsbQ
QdFuTSQTUVtG3qzBdIulFV+KidQYfhv5Ty2vx1ifXm/3UuSeUDXePiMKqqmMr6fg
odoBQl6nN4eNTdv0g6Z7TAJj1jKj/WvN0D1BQ/ecKM/AN293CoG07tTR3DCulk+R
HBiwJD60CRA=
=WsJf
-----END PGP SIGNATURE-----

--huq684BweRXVnRxX--

To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-i18n" in the body of the message

From owner-freebsd-i18n Fri Aug 16 0: 2:28 2002
Delivered-To: freebsd-i18n@freebsd.org
Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 1AD0B37B400; Fri, 16 Aug 2002 00:02:22 -0700 (PDT)
Received: from portege.clkao.org (61-216-79-243.HINET-IP.hinet.net [61.216.79.243]) by mx1.FreeBSD.org (Postfix) with ESMTP id 19B0543E72; Fri, 16 Aug 2002 00:02:21 -0700 (PDT) (envelope-from clkao@portege.clkao.org)
Received: by portege.clkao.org (Postfix, from userid 1000) id D754448D; Fri, 16 Aug 2002 15:02:00 +0800 (CST)
Date: Fri, 16 Aug 2002 15:02:00 +0800
From: Chia-liang Kao
To: "Andrey A.
Chernov"
Cc: standards@FreeBSD.ORG, i18n@FreeBSD.ORG, keichii@FreeBSD.ORG, tjr@FreeBSD.ORG
Subject: Re: wcwidth and mklocale
Message-ID: <20020816070200.GA643@portege.clkao.org>
References: <20020811181805.GA1809@portege.clkao.org> <20020812102835.GA1288@nagual.pp.ru> <20020813035009.GA1084@portege.clkao.org> <20020815154915.GA9607@portege.clkao.org> <20020815194941.GA9773@nagual.pp.ru>
Mime-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="0F1p//8PRICkK4MW"
Content-Disposition: inline
In-Reply-To: <20020815194941.GA9773@nagual.pp.ru>
User-Agent: Mutt/1.5.1i
Sender: owner-freebsd-i18n@FreeBSD.ORG
Precedence: bulk
List-ID:
List-Archive: (Web Archive)
List-Help: (List Instructions)
List-Subscribe:
List-Unsubscribe:
X-Loop: FreeBSD.ORG

--0F1p//8PRICkK4MW
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

right, should be iswprint. i should have used -Wall when testing.

On Thu, Aug 15, 2002 at 11:49:41PM +0400, Andrey A. Chernov wrote:
> On Thu, Aug 15, 2002 at 23:49:16 +0800, Chia-liang Kao wrote:
> > +wcwidth(wc)
> > +	wchar_t wc;
> > +{
> > +	int width = (unsigned)__maskrune((wc), _CTYPE_SWM) >> _CTYPE_SWS;
> > +	return width ? width : isprint(wc);
>
> Should it be iswprint() instead here? isprint() is not supposed to be
> called directly on wchar_t.

Cheers,
CLK

--0F1p//8PRICkK4MW
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.6 (FreeBSD)
Comment: For info see http://www.gnupg.org

iD8DBQE9XKNok1XldlEkA5YRAuimAJ0Z616/3vJOHiGxIrUqButo74hWwwCZAcSY
SssfbJnkdn7AkQ9kLOAFPI0=
=2t83
-----END PGP SIGNATURE-----

--0F1p//8PRICkK4MW--

To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-i18n" in the body of the message
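
For readers following the arithmetic in this thread, here is a minimal C sketch of the width encoding from the revised patch. It is an illustration only, not the committed FreeBSD code. The DEMO_ names and both helper functions are hypothetical; `bits` stands in for the rune's ctype flags (the value `__maskrune()` would mask), and `printable` stands in for the result of `iswprint(wc)`. The first helper mirrors the patch's shift-then-fallback logic. The second is one assumed way to honor the SWIDTH0 bit, which sits below the 30-bit shift and so disappears when shifted.

```c
/* Constants copied from the revised patch in this thread; the DEMO_ prefix is ours. */
#define DEMO_SWM 0xe0000000UL	/* mask to get screen width data */
#define DEMO_SWS 30		/* bits to shift to get width */
#define DEMO_SW0 0x20000000UL	/* 0 width character */
#define DEMO_SW1 0x00000000UL	/* 1 width character / default */
#define DEMO_SW2 0x80000000UL	/* 2 width character */
#define DEMO_SW3 0xc0000000UL	/* 3 width character */

/*
 * Hypothetical stand-in for the patched wcwidth(): shift the masked bits,
 * and fall back to "printable means width 1" when the shift yields 0.
 * The real patch returns the raw iswprint() result; here it is
 * normalized to 0 or 1 (an assumption of this sketch).
 */
int
demo_width_shift(unsigned long bits, int printable)
{
	int width = (int)((bits & DEMO_SWM) >> DEMO_SWS);

	return width ? width : (printable ? 1 : 0);
}

/*
 * Hypothetical variant that tests the SWIDTH0 bit before shifting, so an
 * explicit zero width is not confused with the SWIDTH1 default.
 */
int
demo_width_checked(unsigned long bits, int printable)
{
	unsigned long sw = bits & DEMO_SWM;

	if (sw == DEMO_SW0)
		return 0;			/* explicitly zero width */
	if (sw == DEMO_SW1)
		return printable ? 1 : 0;	/* default: printable is width 1 */
	return (int)(sw >> DEMO_SWS);		/* SW2 gives 2, SW3 gives 3 */
}
```

With these definitions, `demo_width_shift(DEMO_SW2, 1)` gives 2, while `demo_width_shift(DEMO_SW0, 1)` gives 1, because `0x20000000 >> 30` is 0 and the printable fallback then applies. `demo_width_checked(DEMO_SW0, 1)` gives 0.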