Message-ID: <40EB9607.6020906@FreeBSD.org>
Date: Wed, 07 Jul 2004 15:19:51 +0900
From: Alexander Nedotsukov
To: NAKAJI Hiroyuki
cc: gnome@FreeBSD.org
In-Reply-To: <87fz84lfaw.fsf@roddy.acest.tutrp.tut.ac.jp>
Subject: Re: converters/libiconv change request for net/samba3
List-Id: GNOME for FreeBSD -- porting and maintaining

NAKAJI Hiroyuki wrote:

>I am very lucky to get some information from my friends. They say that
>libiconv is not complete and it needs refinement.
>
>1. Miracle Linux, one of the Linux distribution companies in Japan which
>supports Samba i18n, has a web page about the iconv problem. Please check it.
>
>http://www.miraclelinux.com/english/technet/samba30/iconv_issues.html
>
>2. Mr. Iijima gave me a sample explanation.
>
>
>(1)YEN SIGN: When ISO 646 was localized to JIS X0201, the JIS committee changed
>\x5C from backslash to yen sign. Most Japanese people, however, have used
>the yen sign for a long time in place of backslash as a metacharacter, such as
>the pathname separator on DOS/Windows, or in C source code or shell scripts.
>
>Therefore, Microsoft did a trick. Microsoft mapped JIS X0201's \x5C to
>Unicode backslash (U+005C) whereas they left its glyph as yen sign.
>
>(2)OVERLINE: The same story applies to \x7E. JIS X0201 now states
>that \x7E is overline by default but can be replaced with tilde.
>
>The whole mapping table is available at:
>http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP932.TXT
>
>
>Thanks.
>
>

Well. I wasn't specific enough last time when I talked about the yen sign and overline symbols, sorry. Take a look at this:
http://www.unicode.org/Public/MAPPINGS/OBSOLETE/EASTASIA/JIS/JIS0201.TXT
and this:
http://www.unicode.org/Public/MAPPINGS/OBSOLETE/EASTASIA/JIS/SHIFTJIS.TXT
Then compare them to cp932. In short, all of the Microsoft tricks above are present in the last table, and GNU libiconv handles them correctly (though it has a problem with some other symbols).
I consider the proposed patch a hack to make the other mappings behave the same way, and this doesn't look good to me. If Microsoft called some hacked version of Shift_JIS "Shift_JIS", that doesn't make it valid for the rest of the world.
I'll be happy to commit a fix for the round-trip issue in cp932 and add optional eucJP-ms support, but leave everything else as it is now.
Btw, are you guys pretty sure your problem comes from libiconv?
I have a few Japanese Windows workstations here and, if you like, I can check what's wrong with them. Just give me simple instructions for reproducing the problem in that case. I am asking because I have already seen false reports about libiconv problems when people tried to convert the Windows client encoding to Samba's host encoding, and this is not always possible. For instance, you cannot have a 1:1 mapping between cp932 and eucJP.

All the best,
Alexander.
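
To illustrate the point above that cp932 and eucJP cannot be mapped 1:1, here is a minimal sketch. It uses Python's built-in "cp932" and "euc_jp" codecs as stand-ins for the libiconv converters, which is an assumption: the tables in GNU libiconv or Samba may differ in detail. The helper names and the sample byte sequence are hypothetical and chosen only for illustration.

```python
# Sketch only: Python's "cp932" and "euc_jp" codecs stand in for the
# libiconv converters discussed in this thread (an assumption).

def convert(data: bytes, src: str, dst: str) -> bytes:
    """Decode bytes from src and re-encode them in dst, going via Unicode."""
    return data.decode(src).encode(dst)


def can_convert(data: bytes, src: str, dst: str) -> bool:
    """Return True if data decodes in src and every character exists in dst."""
    try:
        convert(data, src, dst)
        return True
    except UnicodeError:  # covers both decode and encode failures
        return False


# Illustrative sample: 0x8740 is CIRCLED DIGIT ONE (U+2460) in cp932,
# a vendor extension character that plain EUC-JP has no code point for.
NEC_CIRCLED_ONE = b"\x87\x40"

if __name__ == "__main__":
    # HIRAGANA LETTER A exists in both encodings, so it converts cleanly.
    print(can_convert(b"\x82\xa0", "cp932", "euc_jp"))
    # The vendor extension character has no counterpart in plain EUC-JP.
    print(can_convert(NEC_CIRCLED_ONE, "cp932", "euc_jp"))
```

A failure of this kind comes from the target encoding lacking the character, not from a bug in the converter, which is why a report of a "libiconv problem" needs a reproducible byte sequence before it can be judged.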