Date: Fri, 22 Apr 2005 18:16:24 +0900
From: Joel
To: questions@freebsd.org
Subject: Re: special characters and how they are represented

> hi folks. this may seem uber-simple to some of you, but i'm ignorant
> regarding this. your help is appreciated.

Just so you know, this is not a particularly trivial issue. (But things
are improving.)

> so, i have this album from Mötley Crüe (that looks right in vim, my
> editor for mutt), and i have ripped it to FLAC and put it on my file
> server. on the server, however, the directory name doesn't look like
> that. well, it does if i pipe ls through more (ls | more).
> here are the scenarios:
>
> 1) ls --> this shows "M?tley_Cr?e" as directory name
> 2) ls | more --> this looks right, with umlaut over o and u
> 3) ls M --> this shows "M\366tley_Cr\374e" (backslash366 &
> backslash374, respectively), using csh as my shell w/set complete and
> set autolist
>
> my question is... why the differences?

Well, my guess is that the CD file system uses one encoding, your OS
uses another, and each application makes different assumptions.

Also, some text editors will recognize your non-ASCII characters, find
the font, and display them. Some deal well with code points they don't
recognize and show the numeric value of the code point (\nnn). Some
print garbage.

The first thing to do is to figure out what the character encoding on
the CD is. But that likely requires you to know what character
encoding(s) your tools are expecting, so the other first thing to do is
to figure out what encodings your tools and OS are expecting.

I know Linux is moving away from EUC to Unicode, and was still in the
process last I remember, but I am not up to date on what FreeBSD is
doing in this respect. (Lazy of me, I know.)

> is there a way to force
> consistent behavior across all three scenarios?

It will probably take a little work on your locale settings, and it may
not be completely successful.

--
Joel Rees
digitcom, inc.
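
To illustrate how one stored name can end up with three different displays, here is a minimal Python sketch. It assumes the directory name is stored as ISO-8859-1 (Latin-1) bytes, which is consistent with the octal values \366 and \374 quoted above but is not confirmed anywhere in this thread. The two helper functions are hypothetical and only mimic the displays reported for ls and csh; they are not how either program is implemented.

```python
# Assumption: the name on disk is Latin-1, so o-umlaut is byte 0xF6
# and u-umlaut is byte 0xFC.
name = b"M\xf6tley_Cr\xfce"


def as_question_marks(raw: bytes) -> str:
    """Replace every byte outside printable ASCII with '?'."""
    return "".join(chr(b) if 0x20 <= b < 0x7F else "?" for b in raw)


def as_octal_escapes(raw: bytes) -> str:
    """Show every byte outside printable ASCII as a backslash octal escape."""
    return "".join(chr(b) if 0x20 <= b < 0x7F else "\\%03o" % b for b in raw)


print(as_question_marks(name))  # M?tley_Cr?e
print(as_octal_escapes(name))   # M\366tley_Cr\374e

# The same name re-encoded as UTF-8 uses two bytes per umlaut, which is
# why a tool expecting one encoding misreads a name stored in the other.
print(name.decode("iso-8859-1").encode("utf-8"))  # b'M\xc3\xb6tley_Cr\xc3\xbce'
```

In all three cases the bytes on disk are the same. What changes is whether a program passes them through untouched (as the pipe to more appears to), substitutes a placeholder, or escapes them.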