Date:      Wed, 07 Jul 2004 15:19:51 +0900
From:      Alexander Nedotsukov <bland@FreeBSD.org>
To:        NAKAJI Hiroyuki <nakaji@tutrp.tut.ac.jp>
Cc:        gnome@FreeBSD.org
Subject:   Re: converters/libiconv change request for net/samba3
Message-ID:  <40EB9607.6020906@FreeBSD.org>
In-Reply-To: <87fz84lfaw.fsf@roddy.acest.tutrp.tut.ac.jp>
References:  <87acyd8zg0.fsf@roddy.acest.tutrp.tut.ac.jp> <40EA57EB.4060607@FreeBSD.org> <871xjp8sim.fsf@roddy.acest.tutrp.tut.ac.jp> <87fz84lfaw.fsf@roddy.acest.tutrp.tut.ac.jp>

NAKAJI Hiroyuki wrote:

>I was lucky enough to get some information from my friends. They say that
>libiconv is not complete and needs refinement.
>
>1. Miracle Linux, one of the Linux distribution companies in Japan that
>supports Samba i18n, has a web page about the iconv problem. Please check it.
>
>http://www.miraclelinux.com/english/technet/samba30/iconv_issues.html
>
>2. Mr. Iijima gave me a sample explanation.
>
><cite>
>(1)YEN SIGN: When ISO 646 was localized to JIS X0201, the JIS committee
>changed \x5C from backslash to yen sign. Most Japanese people, however, have
>long used the yen sign in place of the backslash as a metacharacter, such as
>the pathname separator on DOS/Windows, or in C source code or shell scripts.
>
>Therefore, Microsoft did a trick. Microsoft mapped JIS X0201's \x5C to
>Unicode backslash (U+005C) whereas they left its glyph as yen sign.
>
>(2)OVERLINE: The same story above applies to \x7E. JIS X0201 now states
>that \x7E is overline by default but can be replaced with tilde.
>
>The whole mapping table is available at:
>http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP932.TXT
></cite>
>
>Thanks.
>  
>
Well, I wasn't specific enough last time when I mentioned the yen sign
and overline symbols, sorry.
Take a look at this:
http://www.unicode.org/Public/MAPPINGS/OBSOLETE/EASTASIA/JIS/JIS0201.TXT
and this:
http://www.unicode.org/Public/MAPPINGS/OBSOLETE/EASTASIA/JIS/SHIFTJIS.TXT
Then compare them to cp932. In short, all of the Microsoft tricks above
are present in the last table (cp932), and GNU libiconv handles them
correctly (though it has a problem with some other symbols). I consider
the proposed patch a hack to make the other mappings behave the same
way, and that doesn't look good to me. If Microsoft called some hacked
version of Shift_JIS "Shift_JIS", that doesn't make it valid for the
rest of the world.
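
To make the comparison concrete, here is a minimal Python sketch. The
three dictionaries are hand-copied excerpts (only bytes 0x5C and 0x7E)
of the JIS0201.TXT, SHIFTJIS.TXT and CP932.TXT tables linked above. The
names and the helper function are made up for illustration; this is not
how libiconv stores or compares its tables.

```python
# Hand-copied excerpts of the Unicode mapping tables: byte -> code point.
# Only the two bytes under discussion are listed (an assumption made to
# keep the sketch short).
JIS0201 = {0x5C: 0x00A5, 0x7E: 0x203E}   # YEN SIGN, OVERLINE
SHIFTJIS = {0x5C: 0x00A5, 0x7E: 0x203E}  # same as JIS X0201 for these bytes
CP932 = {0x5C: 0x005C, 0x7E: 0x007E}     # REVERSE SOLIDUS, TILDE


def differing_bytes(table_a, table_b):
    """Return the sorted bytes that both tables define but map differently."""
    return sorted(
        byte
        for byte in table_a
        if byte in table_b and table_a[byte] != table_b[byte]
    )


if __name__ == "__main__":
    for byte in differing_bytes(SHIFTJIS, CP932):
        print(f"0x{byte:02X}: Shift_JIS U+{SHIFTJIS[byte]:04X}, "
              f"cp932 U+{CP932[byte]:04X}")
```

The two encodings disagree on exactly these bytes in the ASCII range,
which is why I would rather keep them as separate tables.
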
I'll be happy to commit a fix for the round-trip issue in cp932 and add
optional eucJP-ms support, but leave everything else as it is now.
Btw, are you guys sure your problem comes from libiconv? I have a few
Japanese Windows workstations here and, if you like, I can check what's
wrong with them. Just give me simple instructions for reproducing the
problem. I'm asking because I have already seen false reports of
libiconv problems from people who tried to convert the Windows client
encoding to the Samba host's encoding, which is not always possible. For
instance, you cannot have a 1:1 mapping between cp932 and eucJP.
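
One reason a 1:1 mapping cannot exist can be sketched as follows. In
CP932.TXT several byte sequences map to the same Unicode character, so
any converter has to pick a single sequence on the way back. The table
below is a hand-copied excerpt for one character (FULLWIDTH NOT SIGN),
and the helper name is hypothetical; treat it as an illustration of the
idea, not as a description of the libiconv code.

```python
from collections import defaultdict

# Hand-copied excerpt of CP932.TXT: three byte sequences, one character.
CP932_NOT_SIGN = {
    b"\x81\xca": "\uffe2",  # JIS X 0208 position
    b"\xee\xf9": "\uffe2",  # NEC selected IBM extension
    b"\xfa\x54": "\uffe2",  # IBM extension
}


def ambiguous_targets(decode_table):
    """Return the characters that more than one byte sequence decodes to."""
    sources = defaultdict(list)
    for raw, char in decode_table.items():
        sources[char].append(raw)
    return {
        char: sorted(raws)
        for char, raws in sources.items()
        if len(raws) > 1
    }


if __name__ == "__main__":
    for char, raws in ambiguous_targets(CP932_NOT_SIGN).items():
        listed = ", ".join(raw.hex() for raw in raws)
        print(f"U+{ord(char):04X} is reachable from: {listed}")
```

Whichever sequence the encoder picks, the other two cannot survive a
round trip through Unicode, and a target encoding that has the
character only once cannot keep them apart either.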

All the best,
Alexander.


