Date:      Thu, 5 Dec 1996 02:45:20 -0800 (PST)
From:      iwasaki@pc.jaring.my
To:        freebsd-gnats-submit@freebsd.org
Subject:   bin/2161: Bugs in mklocale(1) make isgraph(3) confused in the Japanese EUC locale!!
Message-ID:  <199612051045.CAA29752@freefall.freebsd.org>
Resent-Message-ID: <199612051050.CAA29915@freefall.freebsd.org>


>Number:         2161
>Category:       bin
>Synopsis:       Bugs in mklocale(1) make isgraph(3) confused in the Japanese EUC locale!!
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    freebsd-bugs
>State:          open
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Thu Dec  5 02:50:01 PST 1996
>Last-Modified:
>Originator:     Mitsuru IWASAKI
>Organization:
AISDEL; SIRIM
>Release:        2.1.6 and 3.0-current
>Environment:
N/A

>Description:
NOTE: This problem is related to the I18N col(1) work for
      the Japanese manual pages project, which is aiming at
      2.2-RELEASE (as an experiment).
      We would like this patch applied to 2.2!!

Bugs in mklocale(1) can confuse isgraph(3)
in the Japanese EUC locale.
With a correct /usr/share/locale/ja_JP.EUC/LC_CTYPE,
the following test program prints "OK" for every character,
but unfortunately FreeBSD's mklocale(1)
(which is derived from 4.4BSD-Lite) has bugs.
They produce a corrupted ja_JP.EUC/LC_CTYPE, which makes
isgraph(3) return wrong results when handling Japanese text.
>How-To-Repeat:
Compile and run the following test program:

   % cc testeuc.c -o testeuc -lxpg4
   % testeuc (Oops!, "NG" messages for some characters.)

----CUT----CUT----CUT----CUT----CUT----CUT----CUT----CUT----
#include <stdio.h>
#include <ctype.h>
#include <locale.h>
#include <rune.h>
#include <limits.h>

void
testeuc(const char *string)
{
	char const *result;
	rune_t ch;

	printf("Japanese characters [%s]\n", string);

	while ((ch = sgetrune(string, MB_LEN_MAX, &result))) {
		string = result == string ? string + 1 : result;
		/*
		all of the characters in this test program should be
		printing characters.
		*/
		if (isgraph(ch)) {
			printf("OK\n");
		} else {
			/* The Japanese LOCALE_CTYPE is corrupted!! */
			printf("NG\n");
		}
	}
}

int main(void)
{
	/* These are printing characters, so should be OK (cf. man euc). */
	/* But, some of the characters cannot be handled correctly. */
	/* NOTE: each sample has four characters in the Japanese EUC locale. */

	static char *euc_codeset_1_ok = "AZ09\0";
	static char *euc_codeset_2_ok = "\xa4\xa2\xa5\xa2\xb0\xa1\xa1\xbc\0";
	static char *euc_codeset_2_ng = "\xa1\xa2\xa1\xa3\xa3\xc1\xa3\xda\0";

	(void) setlocale(LC_CTYPE, "ja_JP.EUC");

	printf("euc_codeset_1_ok\n"); testeuc(euc_codeset_1_ok);
	printf("euc_codeset_2_ok\n"); testeuc(euc_codeset_2_ok);
	printf("euc_codeset_2_ng\n"); testeuc(euc_codeset_2_ng);
	return 0;
}

>Fix:
To solve this problem, apply the following patch to
/usr/src/usr.bin/mklocale/yacc.y,
type ``make cleandir obj depend all'' in /usr/src/usr.bin/mklocale, and then
``make install'' to re-install the new LC_CTYPE for ja_JP.EUC
created by the fixed mklocale(1).

To confirm, please try testeuc again!

----CUT----CUT----CUT----CUT----CUT----CUT----CUT----CUT----
--- yacc.y.orig.1204	Thu Dec  5 00:42:55 1996
+++ yacc.y	Thu Dec  5 00:44:11 1996
@@ -479,7 +479,7 @@
 	    for (i = r->max+1; i <= list->max; ++i)
 		r->types[i - r->min] = flag;
 	}
-	r->max = r->max;
+	r->max = list->max;
 	free(list);
     }
 
@@ -661,7 +661,7 @@
 	    list->types[x] = htonl(list->types[x]);
 
 	if (!list->map) {
-	    if (fwrite((char *)&list->types,
+	    if (fwrite((char *)list->types,
 		       (list->max - list->min + 1) * sizeof(unsigned long),
 		       1, fp) != 1) {
 		perror(locale_file);


>Audit-Trail:
>Unformatted:


