From owner-freebsd-fs@FreeBSD.ORG  Mon Jan 17 02:25:55 2005
Message-ID: <ad2ada4b05011618258c80402@mail.gmail.com>
Date: Sun, 16 Jan 2005 18:25:52 -0800
From: David Yu <chiahsing@gmail.com>
To: Christopher Vance <christopher@nu.org>
In-Reply-To: <20050117020039.GB630@nu.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
References: <ad2ada4b05011613445601befe@mail.gmail.com>
	 <20050117020039.GB630@nu.org>
cc: freebsd-fs@freebsd.org
cc: freebsd-current@freebsd.org
Subject: Re: NTFS unicode converting problem

In my case, those Chinese filenames are still in UCS-2LE, and all
characters are in plane 0. I tried to modify the code so that it
stores the conversion result directly into the dirent structure, but
the convchr() function failed every time for Chinese characters, while
there is no problem with ASCII characters. I thought converting from
UCS-2 to UTF-8 should be very easy?
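
To make the expectation concrete, here is a minimal sketch of a plain
UCS-2LE to UTF-8 conversion for plane 0 characters. It is not the code
in the NTFS driver and does not use convchr(); the function names
ucs2_to_utf8_char() and ucs2le_to_utf8() are hypothetical, made up only
for this illustration. It assumes the input is an array of little-endian
16-bit units and rejects surrogate values rather than decoding UTF-16.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Encode one plane 0 code unit as UTF-8 (hypothetical helper).
 * Returns the number of bytes written (1 to 3), or 0 if the unit is a
 * surrogate (0xD800-0xDFFF) or the output buffer is too small.
 */
static size_t
ucs2_to_utf8_char(uint16_t wc, unsigned char *out, size_t outlen)
{
	if (wc >= 0xD800 && wc <= 0xDFFF)
		return (0);		/* not a valid UCS-2 character */
	if (wc < 0x80) {
		if (outlen < 1)
			return (0);
		out[0] = (unsigned char)wc;
		return (1);
	}
	if (wc < 0x800) {
		if (outlen < 2)
			return (0);
		out[0] = (unsigned char)(0xC0 | (wc >> 6));
		out[1] = (unsigned char)(0x80 | (wc & 0x3F));
		return (2);
	}
	/* 0x0800-0xFFFF: most CJK characters land here, 3 bytes each. */
	if (outlen < 3)
		return (0);
	out[0] = (unsigned char)(0xE0 | (wc >> 12));
	out[1] = (unsigned char)(0x80 | ((wc >> 6) & 0x3F));
	out[2] = (unsigned char)(0x80 | (wc & 0x3F));
	return (3);
}

/*
 * Convert nunits little-endian UCS-2 units from src into a
 * NUL-terminated UTF-8 string in dst (hypothetical helper).
 * Returns the UTF-8 length without the NUL, or (size_t)-1 on failure.
 */
static size_t
ucs2le_to_utf8(const unsigned char *src, size_t nunits,
    unsigned char *dst, size_t dstlen)
{
	size_t i, n, used = 0;

	for (i = 0; i < nunits; i++) {
		/* Assemble the unit byte by byte so host endianness is irrelevant. */
		uint16_t wc = (uint16_t)(src[2 * i] | (src[2 * i + 1] << 8));

		n = ucs2_to_utf8_char(wc, dst + used, dstlen - used);
		if (n == 0)
			return ((size_t)-1);
		used += n;
	}
	if (used >= dstlen)
		return ((size_t)-1);	/* no room for the terminator */
	dst[used] = '\0';
	return (used);
}
```

With this approach the destination buffer has to allow up to 3 bytes
per 16-bit unit plus the terminating NUL, so a name of N units can
need as much as 3*N + 1 bytes.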

On Mon, 17 Jan 2005 13:00:39 +1100, Christopher Vance
<christopher@nu.org> wrote:
> On Sun, Jan 16, 2005 at 01:44:04PM -0800, David Yu wrote:
> >Hi, it seems that NTFS in FreeBSD uses a 16-bit wchar to store
> >filenames. When I tried to convert some Chinese filenames into UTF-8,
> >the conversion failed because a single Chinese character needs 3
> >bytes in UTF-8. Is anyone already working on this problem? If not, I
> >would like to do something about it. Any suggestions?
> 
> From memory, old Windows used UCS-2, while newer Windows uses UTF-16.
> Was the bad character in plane 0 or higher?
> 
> --
> Christopher Vance
>