Date:      Thu, 27 Aug 1998 03:19:51 +0200 (CEST)
From:      Stefan Bethke <stb@hanse.de>
To:        Archie Cobbs <archie@whistle.com>
Cc:        freebsd-hackers@FreeBSD.ORG
Subject:   Re: Warning: Change to netatalk's file name handling
Message-ID:  <Pine.BSF.3.96.980827030908.15225C-100000@transit.hanse.de>
In-Reply-To: <199808262315.QAA08454@bubba.whistle.com>

On Wed, 26 Aug 1998, Archie Cobbs wrote:

> Stefan Bethke writes:
> > > Netatalk seems like the wrong place to modify behavior to solve this
> > > problem, which is a display problem, not an encoding problem.
> > 
> > Where is the encoding defined for character values in the ranges between
> > \0x01 to \0x1f, and \0x7f to \0xff in terms of UFS, POSIX, whatever?
> > 
> > If you were right, it would be OK for afpd to store all chars literally.
> > While this does work, it is definitely awkward to work with in the shell,
> > and possibly with other applications such as Samba as well. It's not
> > merely a display issue; it's an interoperability issue. I feel that too
> > many things expect file names to be confined to printable ASCII, and unless
> > this changes, I opt to fix what in my eyes is an obvious bug in afpd (that
> > is, escaping \0x80 to \0xff, but leaving \0x01 to \0x1f and \0x7f
> > untouched).
> 
> I guess that makes sense, if netatalk is already escaping 0x80-0xff in
> the same way..

It does.

> This goes deeper of course... ie, a byte is not the same thing as
> a character, and the question is, what is the character set and
> what is the encoding between a character and one or more bytes?
> Julian's right, in that you should query Jeremy about this for
> more complete info (but probably only for amusement value :-)

OK. AFP (from 1.0 on) defined the encoding for file names to be the Mac
encoding, minus \0x00. I don't really know about non-Latin scripts, but I
believe the Mac uses its own escape mechanism to switch to those encodings.

AFAIK, there is no character set or encoding defined for file names in
FreeBSD, in UNIX, or in POSIX. The only implicit definition is plain ASCII.

Even if we were to translate from the Mac encoding to (say) ISO-8859-1,
this would lose some of the chars legal in Mac filenames, causing grief to
the typical unsuspecting graphics designer.
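
As an illustration of that loss, here is a minimal Python sketch. It assumes
the Mac encoding in question is what Python calls "mac_roman", and the helper
name unmappable_in_latin1 is made up for this example; it is not part of
netatalk.

```python
def unmappable_in_latin1(raw):
    """Return the characters in a Mac-encoded name that ISO-8859-1 cannot hold.

    Assumption: the name is encoded in Mac Roman (Python codec "mac_roman").
    """
    text = raw.decode("mac_roman")
    lost = []
    for ch in text:
        try:
            ch.encode("latin-1")
        except UnicodeEncodeError:
            # No ISO-8859-1 byte exists for this character.
            lost.append(ch)
    return lost


if __name__ == "__main__":
    # 0xA5 is the bullet in Mac Roman; 0x80 is A-umlaut, which Latin-1 has.
    print(unmappable_in_latin1(b"Logo\xa5\x80"))
```

Any name for which this returns a non-empty list could not be translated to
ISO-8859-1 without dropping or replacing characters.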

So we need afpd to confine itself to ASCII, and, as I would suggest, to
printable ASCII, as this will make most people's lives easier (for ASCII, byte
values from \0x00 to \0x1F and \0x7F do not produce a glyph, so it is
practically useless to store them as-is).
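
A minimal sketch of such an escaping scheme, in Python rather than afpd's C.
The ':' plus two hex digits format is an assumption for illustration; this
message does not spell out the format afpd uses. The names ESCAPE, escape_name
and unescape_name are hypothetical. The escape character itself and '/' are
escaped too, so the mapping stays reversible and the result is a legal UNIX
file name.

```python
ESCAPE = ":"  # assumed escape character, not taken from this message


def escape_name(raw):
    """Map a Mac file name (bytes) to printable ASCII."""
    out = []
    for b in raw:
        printable = 0x20 <= b <= 0x7E
        if printable and chr(b) not in (ESCAPE, "/"):
            out.append(chr(b))
        else:
            # Control chars, DEL, 0x80-0xFF, the escape char and '/'.
            out.append("%s%02x" % (ESCAPE, b))
    return "".join(out)


def unescape_name(name):
    """Inverse of escape_name: recover the original bytes."""
    out = bytearray()
    i = 0
    while i < len(name):
        if name[i] == ESCAPE:
            out.append(int(name[i + 1:i + 3], 16))
            i += 3
        else:
            out.append(ord(name[i]))
            i += 1
    return bytes(out)


if __name__ == "__main__":
    original = b"a\x01b\x7f\xa5:c"
    stored = escape_name(original)
    print(stored)
    print(unescape_name(stored) == original)
```

Because every byte outside the printable range goes through the same escape,
\0x01 to \0x1f and \0x7f are treated exactly like \0x80 to \0xff.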

> [ On the InterJet, for example, you can have it set to Japanese mode,
> and shared files appear with the same name under AppleTalk and
> Windows, ie, Samba and Netatalk use the same character encoding. ]

That is definitely cool. I hope Julian can provide me with either the patches
or the contact, so I can (at least) evaluate and turn down the patches :-)

Stefan

--
Stefan Bethke
Muehlendamm 12            Phone: +49-40-256848, +49-177-3504009
D-22087 Hamburg           <stefan.bethke@hanse.de>
Hamburg, Germany          <stb@freebsd.org>

