Date: Thu, 17 Jan 2008 16:13:29 +0100
From: Rafaël Carré <funman@videolan.org>
To: questions@freebsd.org
Subject: Some UTF-8 characters are not representable on FreeBSD7
Message-ID: <20080117161329.69fe4135@zod.zod>
Hello,
I noticed I couldn't use some characters with libncursesw: namely ⚑ ⚐
and ⏏.
I ran some tests and found that some characters were reported as
unprintable on FreeBSD, while on Linux all was fine.
I found it extremely strange since those characters would show up in my
terminal (gnome-terminal) when I pasted them.
Here are the results of the test I ran on Linux and FreeBSD:
[fun@zod ~]% uname -a ;./test
FreeBSD zod 7.0-BETA3 FreeBSD 7.0-BETA3 #0: Sun Dec 2 02:30:18 CET 2007 root@zod:/media/externe/usr/src/sys/ZOD i386
Locale: fr_FR.UTF-8
OK a : 1
OK ⚑ : 0
OK ö : 1
OK ↑ : 1
OK © : 1
OK ⚐ : 0
OK é : 1
OK ⏏ : 0
[fun@zod ~]% uname -a ; LANG=fr_FR.ISO8859-15 ./test
FreeBSD zod 7.0-BETA3 FreeBSD 7.0-BETA3 #0: Sun Dec 2 02:30:18 CET 2007 root@zod:/media/externe/usr/src/sys/ZOD i386
Locale: fr_FR.ISO8859-15
OK a : 1
OK ⚑ : 1
OK ö : 1
OK ↑ : 1
OK © : 1
OK ⚐ : 1
OK é : 1
OK ⏏ : 1
16:03 funman@altair ~% uname -a ; ./test
Linux altair 2.6.22-2-amd64 #1 SMP Thu Aug 30 23:43:59 UTC 2007 x86_64 GNU/Linux
Locale: fr_FR.UTF-8
OK a : 32768
OK ⚑ : 1
OK ö : 1
OK ↑ : 1
OK © : 1
OK ⚐ : 1
OK é : 1
OK ⏏ : 1
A value of 0 means unprintable; any nonzero value (1 here, or 32768 for
"a" on Linux) means printable, i.e. the character has a graphical
representation.
And here is the test I used:
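#include <stdio.h>
#include <locale.h>
#include <stdlib.h>
#include <wchar.h>
#include <wctype.h>   /* iswgraph() */

#define MAX 8

int main(void)
{
    /* setlocale() returns NULL if the requested locale is unavailable */
    const char *loc = setlocale( LC_ALL, getenv( "LANG" ) );
    printf( "Locale: %s\n", loc ? loc : "(setlocale failed)" );

    const char tab[MAX][6] = {
        "a", "⚑", "ö", "↑", "©", "⚐", "é", "⏏"
    };
    int i;
    wchar_t wc;
    for( i = 0; i < MAX; i++ )
    {
        wc = L'\0';
        /* mbtowc() returns -1 on an invalid sequence, so test for > 0 */
        printf( "%s ", mbtowc( &wc, tab[i], 6 ) > 0 ? "OK" : "KO" );
        printf( "%s : %d\n", tab[i], iswgraph( (wint_t)wc ) );
    }
    return 0;
}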
I suppose this is a bug in the UTF-8 locale: I tested with several
$LANG values ending in "UTF-8" and the result was the same.
Am I right that a Unicode character should always have a graphical
representation in a UTF-8 locale?
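To narrow this down, it may help to look at how each character is decoded and classified, rather than only at the iswgraph() result. The following is a minimal sketch, not a definitive diagnostic: the helper classify_first_char() and the struct wc_info are hypothetical names introduced here, and the "value" field is a Unicode code point only on platforms where wchar_t holds ISO 10646 values.

```c
#include <string.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper type (not part of the original test). */
struct wc_info {
    unsigned long value; /* numeric value of the decoded wchar_t */
    int printable;       /* 1 if iswprint() reports nonzero, else 0 */
    int graphic;         /* 1 if iswgraph() reports nonzero, else 0 */
};

/*
 * Decode the first multibyte character of s in the current locale and
 * record how the C library classifies it.
 * Returns 0 on success, -1 if s is empty or cannot be decoded.
 */
int classify_first_char(const char *s, struct wc_info *out)
{
    mbstate_t st;
    wchar_t wc = L'\0';
    size_t n;

    memset(&st, 0, sizeof st);   /* start in the initial conversion state */
    n = mbrtowc(&wc, s, strlen(s), &st);
    if (n == (size_t)-1 || n == (size_t)-2 || n == 0)
        return -1;               /* invalid, incomplete, or empty input */

    out->value     = (unsigned long)wc;
    out->printable = iswprint((wint_t)wc) != 0;
    out->graphic   = iswgraph((wint_t)wc) != 0;
    return 0;
}
```

Calling this after setlocale() for each string in tab and printing the three fields would show whether the characters that report 0 are decoded to the expected value and merely lack a class entry in the locale data, or are mis-decoded in the first place.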
Thanks
--
Rafaël Carré
