Date: Fri, 28 Aug 1998 00:36:04 +0200 (CEST)
From: Stefan Bethke <stb@hanse.de>
To: Terry Lambert <tlambert@primenet.com>
Cc: archie@whistle.com, freebsd-hackers@FreeBSD.ORG
Subject: Re: Warning: Change to netatalk's file name handling
Message-ID: <Pine.BSF.3.96.980828001312.24324C-100000@transit.hanse.de>
In-Reply-To: <199808272210.PAA27928@usr02.primenet.com>

On Thu, 27 Aug 1998, Terry Lambert wrote:

> > > Netatalk seems like the wrong place to modify behavior to solve this
> > > problem, which is a display problem, not an encoding problem.
> >
> > Where is the encoding defined for character values in the ranges between
> > \0x01 to \0x1f, and \0x7f to \0xff in terms of UFS, POSIX, whatever?
>
> ISO 8859?

Is this a standardized encoding for POSIX file names, or just a convention?
If it is only a convention, what will non-Latin script users think about it?
How do we discriminate between different 8859 encodings? (Yeah, I see your
point about "locales".)

> > If you were right, it would be OK for afpd to store all chars literally.
> > While this does work, it is definitely awkward to work with in the shell,
> > and possibly so together with other applications such as Samba as well.
> > It's not merely a display issue; it's an interoperability issue. I feel
> > that too many things expect file names to be confined to printable ASCII,
> > and unless this changes, I opt to fix what in my eyes is an obvious bug
> > in afpd (that is, escaping \0x80 to \0xff, but leaving \0x01 to \0x1f
> > and \0x7f untouched).
>
> Per interoperability: This presumes, incorrectly, that Macs support
> the same idiotic idea of code pages as SAMBA must.

Macs, in this sense, use a single "code page." I believe there is an escape
mechanism to change the encoding to non-Latin scripts, but I will have to
look that up in Inside Mac. For AFP 2.1 (which netatalk claims to support to
the extent the Macs use it), there is a single encoding defined, without any
escape mechanism.

> > It won't change anything for the worse; the only problem is that existing
> > files with file names containing control characters (custom icons on
> > folders probably being the single source of such names) will stop working
> > and will need manual assistance from an operator.
>
> It will break a number of things. It already breaks the file name
> length limitation in SAMBA. Duplicating this break into Appletalk is,
> IMO, a bad idea.

I don't know much about SMB/CIFS/Samba. What is the filename length limit
(as opposed, possibly, to the pathname limit)? AFP limits filenames to
31 bytes/chars. All Unix-based AFP servers I know of choose to drop files
with longer names. Also, at least two commercial products use the same
mechanism for escaping non-ASCII chars.

> If you are going to push this hard, you should consider International
> representation of file names by client locale, and how it is already
> handled.

Would you mind pointing me to any information shedding light on
standardisation efforts for file name representation?

In terms of "locale", would this mean that "Mac" or "AFP" would be its own
locale in terms of file name character encoding?

After all, I see three possible ways:

- improve interoperability by confining to printable ASCII (or ISO-8859-1,
  or...) and not escaping other glyphs, thus breaking AFP conformance;

- escape all glyphs (or rather their encoding) in a way that preserves the
  full AFP filename encoding space (for filenames, this is 0x01 to 0xff,
  with ":" being illegal as it is the path delimiter), but use printable
  ASCII where possible (this is, I believe, what netatalk tries to do, but
  doesn't, due to a stupid bug);

- translate the AFP filename encoding space into some larger glyph encoding
  space, such as Unicode, or, more specifically, UTF-8.

The last one probably is the way to go, but this would require (at least to
me) some confirmation that Unicode in general and UTF-8 in particular is the
way to go for file names in FreeBSD. This of course would probably start
other interop problems with NFS and the like, and it would require Samba to
deal with code page bogosities in its own right instead of putting them in
the face of every other app.

> Novell servers are another case where the server assumes all clients
> exist in a given locale; this would be a mistake to buy into...

Yep.

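To make the second of the three ways above concrete, here is a minimal sketch of such an escaping scheme. It is not netatalk's actual code. The ":xx" two-digit hex escape format, the function names, and the decision to also escape "/" (the Unix path separator) are assumptions made for illustration; the only facts taken from the discussion are that AFP names use bytes 0x01 to 0xff and that ":" cannot appear in them, which is what makes it usable as an escape introducer.

```python
HEX_DIGITS = "0123456789abcdefABCDEF"


def escape_afp_name(raw: bytes) -> str:
    """Map an AFP file name (bytes 0x01-0xff) to printable ASCII.

    Assumed scheme: printable ASCII passes through unchanged; every
    other byte, plus ":" and "/", becomes ":" followed by two hex digits.
    """
    out = []
    for b in raw:
        if b == 0:
            raise ValueError("NUL is outside the AFP file name range")
        if 0x20 <= b <= 0x7E and b not in (0x2F, 0x3A):  # not "/" or ":"
            out.append(chr(b))
        else:
            # Covers 0x01-0x1f, 0x7f and 0x80-0xff alike, so control
            # characters are no longer left untouched.
            out.append(":%02x" % b)
    return "".join(out)


def unescape_afp_name(name: str) -> bytes:
    """Inverse of escape_afp_name; rejects malformed escapes."""
    out = bytearray()
    i = 0
    while i < len(name):
        ch = name[i]
        if ch == ":":
            digits = name[i + 1:i + 3]
            if len(digits) != 2 or any(d not in HEX_DIGITS for d in digits):
                raise ValueError("malformed escape at offset %d" % i)
            out.append(int(digits, 16))
            i += 3
        elif 0x20 <= ord(ch) <= 0x7E:
            out.append(ord(ch))
            i += 1
        else:
            raise ValueError("unescaped non-printable character at offset %d" % i)
    return bytes(out)
```

With this sketch, a custom folder icon file named "Icon" followed by a carriage return is stored as "Icon:0d", and every byte string over 0x01 to 0xff survives a round trip. The cost is length: in the worst case each byte triples, so a 31-byte AFP name needs up to 93 characters on the Unix side.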
Cheers,
Stefan

--
Stefan Bethke
Muehlendamm 12            Phone: +49-40-256848, +49-177-3504009
D-22087 Hamburg           <stefan.bethke@hanse.de>
Hamburg, Germany          <stb@freebsd.org>

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message