Date: Fri, 22 Apr 2005 18:16:24 +0900
From: Joel <rees@ddcom.co.jp>
To: questions@freebsd.org
Subject: Re: special characters and how they are represented
Message-ID: <20050422163355.1998.REES@ddcom.co.jp>
In-Reply-To: <20050421231414.GC86130@gargantuan.com>
References: <20050421231414.GC86130@gargantuan.com>

> hi folks. this may seem uber-simple to some of you, but i'm ignorant
> regarding this. your help is appreciated.

Just so you know, this is not a particularly trivial issue. (But things
are improving.)

> so, i have this album from Mötley Crüe (that looks right in vim, my
> editor for mutt), and i have ripped it to FLAC and put it on my file
> server. on the server, however, the directory name doesn't look like
> that. well, it does if i pipe ls through more (ls | more). here are
> the scenarios:
>
> 1) ls --> this shows "M?tley_Cr?e" as directory name
> 2) ls | more --> this looks right, with umlaut over o and u
> 3) ls M<TAB> --> this shows "M\366tley_Cr\374e" (backslash366 &
> backslash374, respectively), using csh as my shell w/set complete and
> set autolist
>
> my question is... why the differences?

Well, my guess is that the CD file system uses one encoding, your OS
uses another, and each application makes different assumptions.

Also, some text editors will recognize your non-ASCII characters, find
the font, and display them. Some deal well with code points they don't
recognize and show the numeric value of the code point (\nnn). Some
print garbage.

The first thing to do is to figure out what the character encoding on
the CD is. But that likely requires you to know what character
encoding(s) your tools are expecting, so the other first thing to do is
to figure out what encodings your tools and OS are expecting.

I know Linux is moving away from EUC to Unicode, and was still in
process last I remember, but I am not up to date on what FreeBSD is
doing in this respect. (Lazy of me, I know.)

> is there a way to force
> consistent behavior across all three scenarios?

Probably. It will take a little work in your locale settings (LANG,
LC_CTYPE and friends), and may not be completely successful.

-- 
Joel Rees <rees@ddcom.co.jp>
digitcom, inc. 株式会社デジコム
Kobe, Japan +81-78-672-8800
** <http://www.ddcom.co.jp> **
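
For illustration, here is a minimal Python sketch of how the same bytes
can end up looking three different ways. It assumes the directory name
is stored as ISO-8859-1 bytes, which is consistent with the \366 and
\374 that csh printed but is not confirmed anywhere in the thread. The
helper names (octal_escape, question_marks) are made up for this
example, and the '?' substitution only imitates what ls appears to do
on a terminal when the locale does not consider a byte printable.

```python
# Raw directory name bytes, written with the octal escapes csh displayed.
# Assumption: the name is ISO-8859-1, where 0o366 is o-umlaut and 0o374 is u-umlaut.
raw = b"M\366tley_Cr\374e"


def octal_escape(data: bytes) -> str:
    """Render non-ASCII bytes as \\nnn octal, similar to the csh completion listing."""
    return "".join(chr(b) if b < 0x80 else "\\%03o" % b for b in data)


def question_marks(data: bytes) -> str:
    """Render non-ASCII bytes as '?', imitating ls on a terminal in an ASCII-only locale."""
    return "".join(chr(b) if b < 0x80 else "?" for b in data)


# The same bytes interpreted under two different encoding assumptions.
as_latin1 = raw.decode("iso-8859-1")             # both bytes map to real characters
as_utf8 = raw.decode("utf-8", errors="replace")  # both bytes are invalid in UTF-8

print(octal_escape(raw))
print(question_marks(raw))
# ascii() keeps the output safe on terminals that cannot show the characters.
print(ascii(as_latin1))
print(ascii(as_utf8))
```

The bytes never change; only the assumption each tool makes about them
does. That is why agreeing on one encoding in the locale settings is the
usual route to consistent output.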