Date:      Mon, 23 May 2011 09:50:48 -0600
From:      Modulok <modulok@gmail.com>
To:        Frank Bonnet <f.bonnet@esiee.fr>
Cc:        freebsd-questions@freebsd.org
Subject:   Re: Filename containing French characters ?
Message-ID:  <BANLkTikEMQBm0743qaRsw-d+0RtWFxwEjw@mail.gmail.com>
In-Reply-To: <990E8670-2137-4F80-8D9D-BCEB05C6ECAA@esiee.fr>
References:  <990E8670-2137-4F80-8D9D-BCEB05C6ECAA@esiee.fr>

Short answer, use a glob pattern. Assume I have a file named 'à fichier.txt':

    ls -l
    -rw-r--r--  1 Modulok  Modulok       12 May 23 09:01 ?? fichier.txt

    mv ?\ fichier.txt aFile.txt

Long answer, for those who want to follow along and fix their terminal to
display UTF-8, keep reading...

Step 1: Make a funky file to play along with this mini-tutorial:
================================================================

Create a text file with an editor that supports non-ASCII characters. I
created a file named 'filename' containing this (no newline!):

        à fichier.txt

Step 2: Create the actual file with content
===========================================

I used echo and cat like so in the tcsh shell:

        echo "hello world" > "`cat filename`"
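
If you'd rather not open an editor at all, steps 1 and 2 can be roughly
combined in a Bourne-style shell. This is only a sketch: it assumes the name is
UTF-8 encoded (so 'à' is the two bytes 0xC3 0xA0, written here as octal
escapes), and the scratch directory and the variable name 'name' are my own
choices for illustration:

```shell
# Work in a scratch directory so nothing else is touched.
dir=$(mktemp -d "${TMPDIR:-/tmp}/funky.XXXXXX")
cd "$dir" || exit 1

# \303\240 is the UTF-8 encoding of 'à', spelled as octal bytes so that
# this script itself stays plain ASCII.
name=$(printf '\303\240 fichier.txt')

# Create the file with some content, as in the tcsh example.
echo "hello world" > "$name"

ls -l
```

The printf trick avoids having to type the accented character, which is handy
on a terminal that cannot enter it.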


Step 3: Show the file in ls
===========================

As you can see below, the first character of the filename is displayed as two
question marks. This is how ls, when writing to a terminal, shows bytes it
cannot print in the current locale. There are two question marks because the
character is encoded as two bytes in UTF-8. This does *not* mean the filename
starts with a literal question mark:


    -rw-r--r--  1 Modulok  Modulok       12 May 23 09:01 ?? fichier.txt
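
To convince yourself that the 'à' really is two bytes, you can dump them. A
small sketch, assuming a Bourne-style shell and a UTF-8 encoded name; the
variable 'nbytes' is just mine for illustration:

```shell
# Show the raw bytes of 'à' (UTF-8) in hex.
printf '\303\240' | od -An -tx1

# Count them; tr strips the padding some wc implementations add.
nbytes=$(printf '\303\240' | wc -c | tr -d ' ')
echo "bytes: $nbytes"
```

In the directory holding the funky file, piping 'ls' through 'od -c' shows the
same two bytes at the start of the name.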

Step 4: (optional) Fix the terminal
===================================

At this point, let's just fix the terminal so that UTF-8 characters are
displayed correctly. We want to see the French accented 'à', and not a bunch of
question marks. To do this, you edit '/etc/login.conf' as root. Add two lines
at the bottom of the 'default' section. My default section now looks like this:


    default:\
            :passwd_format=md5:\
            :copyright=/etc/COPYRIGHT:\

            ...and so on...

            :charset=en_US.UTF-8:\
            :lang=en_US.UTF-8:

If you work in a French locale, yours should probably look like this instead:

    default:\
            :passwd_format=md5:\
            :copyright=/etc/COPYRIGHT:\

            ...and so on...

            :charset=fr_FR.UTF-8:\
            :lang=fr_FR.UTF-8:

I'm not certain of the right values for every country, but the above examples
work. We then need to rebuild the actual login database. Execute the following
command as root:

    cap_mkdb /etc/login.conf

This generates /etc/login.conf.db from /etc/login.conf. Now log out and then
back in!
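
Once you are logged back in, a quick sanity check is to look at what the new
session actually received. This sketch assumes the 'lang' and 'charset'
capabilities end up in the LANG and MM_CHARSET environment variables, as
login.conf(5) describes; 'summary' is only an illustrative variable name:

```shell
# Print the full locale settings, if the locale utility is available.
command -v locale >/dev/null 2>&1 && locale

# Show the two variables the login.conf entries are expected to set.
summary="LANG=${LANG:-unset} MM_CHARSET=${MM_CHARSET:-unset}"
echo "$summary"
```

If both still show up as unset or as "C", the session most likely predates the
cap_mkdb run, or your shell startup files override the values.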


Step 5: Back to the funky file
==============================

You should now see the accented characters displayed correctly in the terminal
(assuming your terminal supports this):

    -rw-r--r--  1 Modulok  Modulok       12 May 23 09:01 à fichier.txt

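In some terminals, we cannot type these characters, so you can access the
filename through a shell glob pattern instead. In most shells, the glob pattern
'?' matches any single character. The backslash escapes the space in the
filename. (If the shell is not running in a UTF-8 locale, it may count the 'à'
as two bytes, in which case use '??' or '*' in place of '?'.)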

    mv ?\ fichier.txt aFile.txt
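
For anyone who wants to try the rename safely first, here is a throwaway
sketch for a Bourne-style shell. It assumes a UTF-8 encoded name and uses '*'
rather than '?' so that it does not matter whether the shell counts the leading
character as one character or as two bytes; the scratch directory is my own
addition:

```shell
# Set up a scratch directory containing only the funky file.
dir=$(mktemp -d "${TMPDIR:-/tmp}/funky.XXXXXX")
cd "$dir" || exit 1
echo "hello world" > "$(printf '\303\240 fichier.txt')"

# '*' matches the unprintable prefix; the backslash escapes the space.
# '--' keeps mv from treating an odd filename as an option.
mv -- *\ fichier.txt aFile.txt

ls
```

Because '*' can match more than one file, check with 'ls' or 'echo' what the
pattern expands to before running mv in a directory with real data.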


Hope this helps (and doesn't get too mangled.)
-Modulok-



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?BANLkTikEMQBm0743qaRsw-d%2B0RtWFxwEjw>