From owner-svn-src-projects@freebsd.org  Sun Nov  1 12:00:57 2015
Return-Path: <owner-svn-src-projects@freebsd.org>
Delivered-To: svn-src-projects@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 04AACA235AC
 for <svn-src-projects@mailman.ysv.freebsd.org>;
 Sun,  1 Nov 2015 12:00:57 +0000 (UTC)
 (envelope-from bapt@FreeBSD.org)
Received: from repo.freebsd.org (repo.freebsd.org
 [IPv6:2610:1c1:1:6068::e6a:0])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id B765513DF;
 Sun,  1 Nov 2015 12:00:56 +0000 (UTC)
 (envelope-from bapt@FreeBSD.org)
Received: from repo.freebsd.org ([127.0.1.37])
 by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id tA1C0tnc067375;
 Sun, 1 Nov 2015 12:00:55 GMT (envelope-from bapt@FreeBSD.org)
Received: (from bapt@localhost)
 by repo.freebsd.org (8.15.2/8.15.2/Submit) id tA1C0tEi067372;
 Sun, 1 Nov 2015 12:00:55 GMT (envelope-from bapt@FreeBSD.org)
Message-Id: <201511011200.tA1C0tEi067372@repo.freebsd.org>
X-Authentication-Warning: repo.freebsd.org: bapt set sender to
 bapt@FreeBSD.org using -f
From: Baptiste Daroussin <bapt@FreeBSD.org>
Date: Sun, 1 Nov 2015 12:00:55 +0000 (UTC)
To: src-committers@freebsd.org, svn-src-projects@freebsd.org
Subject: svn commit: r290233 - in projects/collation: lib/libc/locale
 usr.bin/localedef
X-SVN-Group: projects
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
X-BeenThere: svn-src-projects@freebsd.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: "SVN commit messages for the src &quot; projects&quot;
 tree" <svn-src-projects.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/svn-src-projects>, 
 <mailto:svn-src-projects-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/svn-src-projects/>
List-Post: <mailto:svn-src-projects@freebsd.org>
List-Help: <mailto:svn-src-projects-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/svn-src-projects>, 
 <mailto:svn-src-projects-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 01 Nov 2015 12:00:57 -0000

Author: bapt
Date: Sun Nov  1 12:00:55 2015
New Revision: 290233
URL: https://svnweb.freebsd.org/changeset/base/290233

Log:
  libc: Fix (and improve) nl_langinfo (CODESET)
  
  The output of "locale charmap" is identical to the result of
  nl_langinfo (CODESET) for any given locale.  The logic for returning the
  codeset was very simplistic.  It just returned the portion of the locale
  name after the period (e.g. fr_FR.ISO8859-1 returned "ISO8859-1").
  
  When softlinks were added to locales, this broke.  For example:
     en_US returned ""
     fr_FR.UTF8 returned "UTF8"
     fr_FR.UTF-8 returned "UTF-8"
     zh_Hant_HK.Big5HKSCS returned "Big5HKSCS"
     zh_Hant_TW.Big5 returned "Big5"
     es_ES@euro returned ""
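
  The old name-based behavior can be reproduced with a minimal sketch.
  The function name old_codeset_from_name is hypothetical; the body follows
  the CODESET branch that this commit removes from nl_langinfo.c, taking
  the locale name as a plain string instead of calling querylocale().

```c
#include <string.h>

/*
 * Hypothetical stand-in for the pre-r290233 CODESET logic: the codeset
 * is derived from the locale name alone.
 */
static const char *
old_codeset_from_name(const char *name)
{
	const char *dot;

	/* Anything after the first period is reported verbatim. */
	if ((dot = strchr(name, '.')) != NULL)
		return (dot + 1);
	/* Only the two built-in names get a real answer without a period. */
	if (strcmp(name, "C") == 0 || strcmp(name, "POSIX") == 0)
		return ("US-ASCII");
	/* Softlinked names such as en_US or es_ES@euro end up here. */
	return ("");
}
```

  Because the name is the only input, aliases without a period yield an
  empty string, and spelling variants such as UTF8 and UTF-8 are passed
  through unchanged.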
  
  In order to fix this properly, the named locale cannot be used to
  determine the encoding.  This information was almost available in the
  rune data.  Unfortunately, all the single byte encodings were listed
  as "NONE" encoding.
  
  So I adjusted localedef tool to provide more information about the
  encoding.  For example, instead of "NONE", the LC_CTYPE used by
  fr_FR.ISO8859-15 is now encoded as "NONE:ISO8859-15".  The locale
  handlers now check whether the first four characters of the encoding are
  "NONE" and, if so, treat it as a single-byte encoding.
  
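  The tagging and the prefix check described above can be sketched as
  follows.  Both helper names (tag_single_byte, is_single_byte) are
  hypothetical and exist only for illustration; the format string and the
  four-character comparison are taken from the diff in this commit.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: build the tagged encoding name that localedef
 * stores for a single-byte charmap, e.g. "NONE:ISO8859-15".
 * snprintf() truncates silently if the buffer is too small.
 */
static void
tag_single_byte(char *buf, size_t len, const char *charmap)
{
	snprintf(buf, len, "NONE:%s", charmap);
}

/*
 * Hypothetical helper: the prefix test applied by the locale handlers.
 * Both the bare "NONE" and any "NONE:..." name count as single-byte.
 */
static int
is_single_byte(const char *encoding)
{
	return (strncmp(encoding, "NONE", 4) == 0);
}
```

  Comparing only the first four characters keeps rune data that still
  carries the bare "NONE" encoding working alongside the new tagged form.
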
  The nl_langinfo handling of CODESET was adjusted accordingly.  Now the
  following is returned:
     en_US returns "ISO8859-1"
     fr_FR.UTF8 returns "UTF-8"
     fr_FR.UTF-8 returns "UTF-8"
     zh_Hant_HK.Big5HKSCS returns "Big5"
     zh_Hant_TW.Big5 returns "Big5"
     es_ES@euro returns "ISO8859-15"
  
  As before, the "C" and "POSIX" locales return "US-ASCII".  This is a big
  improvement.  The result of nl_langinfo can never be a zero-length
  string, and it will always be exactly one of the values of the
  character maps of /usr/src/tools/tools/locale/etc/final-maps.
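
  The mapping from the rune-data encoding to the reported codeset can be
  sketched as a free-standing function.  The name codeset_from_encoding and
  the table layout are assumptions made for illustration; the committed
  code uses an if/else chain inside nl_langinfo_l(), and the string pairs
  below are copied from that chain.

```c
#include <string.h>

/*
 * Hypothetical free-standing version of the new CODESET mapping.
 * Input is the encoding name stored in the rune data.
 */
static const char *
codeset_from_encoding(const char *enc)
{
	static const struct {
		const char *enc;
		const char *codeset;
	} map[] = {
		{ "EUC-CN",  "eucCN" },
		{ "EUC-JP",  "eucJP" },
		{ "EUC-KR",  "eucKR" },
		{ "EUC-TW",  "eucTW" },
		{ "BIG5",    "Big5" },
		{ "MSKanji", "SJIS" },
		{ "NONE",    "US-ASCII" },
	};
	size_t i;

	/* Exact matches are renamed to the charmap spelling. */
	for (i = 0; i < sizeof(map) / sizeof(map[0]); i++) {
		if (strcmp(enc, map[i].enc) == 0)
			return (map[i].codeset);
	}
	/* Tagged single-byte encodings report the part after "NONE:". */
	if (strncmp(enc, "NONE:", 5) == 0)
		return (enc + 5);
	/* Everything else (e.g. "UTF-8") is reported as stored. */
	return (enc);
}
```

  Since the locale name is never consulted, an alias and its target return
  the same codeset as long as they share the same LC_CTYPE data.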
  
  Submitted by:	marino
  Obtained from:	DragonflyBSD

Modified:
  projects/collation/lib/libc/locale/nl_langinfo.c
  projects/collation/lib/libc/locale/setrunelocale.c
  projects/collation/usr.bin/localedef/wide.c

Modified: projects/collation/lib/libc/locale/nl_langinfo.c
==============================================================================
--- projects/collation/lib/libc/locale/nl_langinfo.c	Sun Nov  1 08:40:15 2015	(r290232)
+++ projects/collation/lib/libc/locale/nl_langinfo.c	Sun Nov  1 12:00:55 2015	(r290233)
@@ -37,7 +37,10 @@ __FBSDID("$FreeBSD$");
 #include <locale.h>
 #include <stdlib.h>
 #include <string.h>
+#include <runetype.h>
+#include <wchar.h>
 
+#include "mblocal.h"
 #include "lnumeric.h"
 #include "lmessages.h"
 #include "lmonetary.h"
@@ -54,14 +57,25 @@ nl_langinfo_l(nl_item item, locale_t loc
 
 	switch (item) {
 	case CODESET:
-		ret = "";
-		if ((s = querylocale(LC_CTYPE_MASK, loc)) != NULL) {
-			if ((cs = strchr(s, '.')) != NULL)
-				ret = cs + 1;
-			else if (strcmp(s, "C") == 0 ||
-				 strcmp(s, "POSIX") == 0)
-				ret = "US-ASCII";
-		}
+		s = XLOCALE_CTYPE(loc)->runes->__encoding;
+		if (strcmp(s, "EUC-CN") == 0)
+			ret = "eucCN";
+		else if (strcmp(s, "EUC-JP") == 0)
+			ret = "eucJP";
+		else if (strcmp(s, "EUC-KR") == 0)
+			ret = "eucKR";
+		else if (strcmp(s, "EUC-TW") == 0)
+			ret = "eucTW";
+		else if (strcmp(s, "BIG5") == 0)
+			ret = "Big5";
+		else if (strcmp(s, "MSKanji") == 0)
+			ret = "SJIS";
+		else if (strcmp(s, "NONE") == 0)
+			ret = "US-ASCII";
+		else if (strncmp(s, "NONE:", 5) == 0)
+			ret = (char *)(s + 5);
+		else
+			ret = (char *)s;
 		break;
 	case D_T_FMT:
 		ret = (char *) __get_current_time_locale(loc)->c_fmt;

Modified: projects/collation/lib/libc/locale/setrunelocale.c
==============================================================================
--- projects/collation/lib/libc/locale/setrunelocale.c	Sun Nov  1 08:40:15 2015	(r290232)
+++ projects/collation/lib/libc/locale/setrunelocale.c	Sun Nov  1 12:00:55 2015	(r290233)
@@ -129,7 +129,7 @@ __setrunelocale(struct xlocale_ctype *l,
 
 	rl->__sputrune = NULL;
 	rl->__sgetrune = NULL;
-	if (strcmp(rl->__encoding, "NONE") == 0)
+	if (strncmp(rl->__encoding, "NONE", 4) == 0)
 		ret = _none_init(l, rl);
 	else if (strcmp(rl->__encoding, "UTF-8") == 0)
 		ret = _UTF8_init(l, rl);

Modified: projects/collation/usr.bin/localedef/wide.c
==============================================================================
--- projects/collation/usr.bin/localedef/wide.c	Sun Nov  1 08:40:15 2015	(r290232)
+++ projects/collation/usr.bin/localedef/wide.c	Sun Nov  1 12:00:55 2015	(r290233)
@@ -37,7 +37,6 @@
 #include <sys/cdefs.h>
 __FBSDID("$FreeBSD$");
 
-#include <ctype.h>
 #include <stdlib.h>
 #include <wchar.h>
 #include <string.h>
@@ -62,7 +61,8 @@ static int tomb_mbs(char *, wchar_t);
 
 static int (*_towide)(wchar_t *, const char *, unsigned) = towide_none;
 static int (*_tomb)(char *, wchar_t) = tomb_none;
-static const char *_encoding = "NONE";
+static char _encoding_buffer[20] = {'N','O','N','E'};
+static const char *_encoding = _encoding_buffer;
 static int _nbits = 7;
 
 /*
@@ -642,9 +642,9 @@ set_wide_encoding(const char *encoding)
 
 	_towide = towide_none;
 	_tomb = tomb_none;
-	_encoding = "NONE";
 	_nbits = 8;
 
+	snprintf(_encoding_buffer, sizeof(_encoding_buffer), "NONE:%s", encoding);
 	for (i = 0; mb_encodings[i].name; i++) {
 		if (strcasecmp(encoding, mb_encodings[i].name) == 0) {
 			_towide = mb_encodings[i].towide;