From: David Yu
Date: Wed, 19 Jan 2005 17:59:57 -0800
To: freebsd-fs@freebsd.org, freebsd-current@freebsd.org
Subject: Re: NTFS unicode converting problem
References: <20050117020039.GB630@nu.org> <20050117032255.GC630@nu.org>

I wrote a patch for mounting NTFS volumes with UTF-8 file names. The patch is at http://www.cse.ucsd.edu/~chyu/ntfs-utf8.diff . After applying it, you need to rebuild the libiconv and ntfs kernel modules (or the whole kernel if you do not use kernel modules).
This patch should fix the problem that the original libiconv in the kernel cannot convert characters whose UTF-8 encoding is longer than 2 bytes. I ported the UTF-8 <-> UCS-2 converter from GNU libiconv into the kernel. However, I don't know how to implement "casetype" in the conversion function, which may cause problems with the case-insensitive matching used in msdosfs. I think this patch should only be a temporary workaround; the libiconv in the kernel should be rewritten with a better structure.
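To illustrate why a 2-byte limit is a problem, here is a minimal standalone sketch of a UCS-2 <-> UTF-8 conversion for a single character. It is not the code from the patch or from GNU libiconv; the function names ucs2_to_utf8 and utf8_to_ucs2 and their return conventions are made up for this example. Every UCS-2 value from 0x0800 upward (which includes all CJK characters) needs 3 bytes in UTF-8.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: encode one UCS-2 code unit as UTF-8.
 * Returns the number of bytes written (1..3), or 0 if the output
 * buffer is too small or wc is a surrogate (not a valid character).
 */
size_t
ucs2_to_utf8(uint16_t wc, unsigned char *out, size_t outlen)
{
	if (wc < 0x80) {			/* 0xxxxxxx */
		if (outlen < 1)
			return (0);
		out[0] = (unsigned char)wc;
		return (1);
	}
	if (wc < 0x800) {			/* 110xxxxx 10xxxxxx */
		if (outlen < 2)
			return (0);
		out[0] = (unsigned char)(0xC0 | (wc >> 6));
		out[1] = (unsigned char)(0x80 | (wc & 0x3F));
		return (2);
	}
	if (wc >= 0xD800 && wc <= 0xDFFF)	/* surrogates are not characters */
		return (0);
	if (outlen < 3)				/* 1110xxxx 10xxxxxx 10xxxxxx */
		return (0);
	out[0] = (unsigned char)(0xE0 | (wc >> 12));
	out[1] = (unsigned char)(0x80 | ((wc >> 6) & 0x3F));
	out[2] = (unsigned char)(0x80 | (wc & 0x3F));
	return (3);
}

/*
 * Hypothetical helper: decode one UTF-8 sequence into a UCS-2 code unit.
 * Returns the number of bytes consumed (1..3), or 0 if the input is
 * truncated, malformed, overlong, or outside the UCS-2 range.
 */
size_t
utf8_to_ucs2(const unsigned char *in, size_t inlen, uint16_t *wc)
{
	unsigned char c;
	uint16_t v;

	if (inlen < 1)
		return (0);
	c = in[0];
	if (c < 0x80) {
		*wc = c;
		return (1);
	}
	if (c >= 0xC2 && c < 0xE0) {		/* 2-byte form, no overlongs */
		if (inlen < 2 || (in[1] & 0xC0) != 0x80)
			return (0);
		*wc = (uint16_t)(((c & 0x1F) << 6) | (in[1] & 0x3F));
		return (2);
	}
	if ((c & 0xF0) == 0xE0) {		/* 3-byte form */
		if (inlen < 3 || (in[1] & 0xC0) != 0x80 ||
		    (in[2] & 0xC0) != 0x80)
			return (0);
		v = (uint16_t)(((c & 0x0F) << 12) |
		    ((in[1] & 0x3F) << 6) | (in[2] & 0x3F));
		if (v < 0x800 || (v >= 0xD800 && v <= 0xDFFF))
			return (0);		/* overlong or surrogate */
		*wc = v;
		return (3);
	}
	return (0);	/* stray continuation byte or a 4-byte sequence */
}
```

For example, U+4E2D encodes to the three bytes E4 B8 AD, so a converter that stops at 2 bytes per character cannot represent it. Case folding ("casetype") is deliberately left out of this sketch.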