Date: Mon, 10 Jun 1996 01:14:08 +0200 (SAT)
From: Robert Nordier <rnordier@iafrica.com>
To: terry@lambert.org (Terry Lambert)
Cc: hackers@freebsd.org
Subject: Re: bit 7 in filenames
Message-ID: <199606092314.BAA00185@eac.iafrica.com>
In-Reply-To: <199606092059.NAA02136@phaeton.artisoft.com> from "Terry Lambert" at Jun 9, 96 01:59:35 pm

Terry Lambert wrote:
>
> > The vfatfs (== rewritten msdosfs) will not actually create files
> > containing illegal DOS filename characters.
> >
> > Currently, however, it offers a `translate' option which does a
> > semi-intelligent mapping between characters valid on BSD and DOS.
> >
> > (Invalid DOS filename characters are those below 0x20, as well as
> > the following sixteen:
> >
> >     " * + , . / : ; < = > ? [ \ ] |
> >
> > All other characters including 0x20 and characters >= 0x80 are
> > legal.)
> >
> > With the translate option enabled, Bruce's example would be
> > acceptable, would be mapped to (say)
> >
> >     /msdosfs/a2345678 this is a very long not to mention invalid
> >     msdos path.name
> >
> > (which DOS itself would accept) and would result in the file
> >
> >     A2345678.NAM
> >
> > on a FAT filesystem.
>
> Actually, the IFS documentation with the SDK states that a directory
> name can contain:
>
>     o   $ \ % ' - _ @ ~ ` ! ( )
>     ^ ^
>     | `- blank space
>     `- degree symbol
>
> A file name may contain:
>
>     o $ \ % ' - _ @ ~ ` ! ( )

Thanks. Though the whole business of Microsoft documentation versus
Microsoft practice tends to be rather a sore point.

Over the last few years, I've disassembled and commented probably
several thousand lines of MS-DOS 3.30, 5.00, and 6.22 code, including
large chunks of 'io.sys' and 'msdos.sys', as well as (relevant stuff)
much of 'format.com' and some parts of 'fdisk.exe'. And recently I've
also been running seemingly endless tests on 'scandisk', in the course
of developing the 'fsck_msdos' utility.

Some observations to come out of this are:

(a) If Microsoft documents any technical details about DOS, it
    almost invariably gets them wrong.

(b) No two programmers at Microsoft seem to have the same idea
    about what is and isn't legal, at least for the FAT FS.

That the Microsoft programmers don't seem to know what the
$\%'-_@~`!() the real technical details are, half the time, tends to
be evident in all sorts of ways.
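
For reference, the character rule quoted near the top of this message (bytes below 0x20 and the sixteen listed characters are invalid, everything else including 0x20 and bytes >= 0x80 is legal) can be written as a small C predicate. This is only a sketch of that stated rule and is not code from vfatfs; the names `dos_char_ok` and `dos_component_ok` are hypothetical, and, as noted above, what DOS and its utilities accept in practice may well differ.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not taken from vfatfs): returns 1 if byte c is
 * acceptable under the rule quoted in this message, else 0.
 * Rejected: bytes below 0x20 and the sixteen listed characters.
 * Accepted: everything else, including 0x20 and bytes >= 0x80.
 */
int
dos_char_ok(unsigned char c)
{
    static const char invalid[] = "\"*+,./:;<=>?[\\]|";

    if (c < 0x20)
        return 0;
    /* Search only the sixteen characters, not the terminating NUL. */
    return memchr(invalid, c, sizeof(invalid) - 1) == NULL;
}

/*
 * Returns 1 if every byte of one name component passes dos_char_ok().
 * The period is in the invalid set, so this is meant for the base name
 * or the extension separately, not for "name.ext" as a whole.
 */
int
dos_component_ok(const char *s)
{
    assert(s != NULL);
    for (; *s != '\0'; s++)
        if (!dos_char_ok((unsigned char)*s))
            return 0;
    return 1;
}
```

Under this rule `dos_component_ok("a c e g ")` succeeds, since 0x20 is legal, while `dos_component_ok("path.name")` fails because of the period. An empty string is accepted here; whether that should be an error is left to the caller.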

For instance, the problems that have arisen relating to use of the 0xe5
character in filenames _should_ have been at least somewhat predictable.
And evidently whoever implemented filename checking in 'scandisk' has
his own personal ideas about what is (0x7f) and what isn't (0x00)
acceptable....

[List of further boring and abstruse technical details reluctantly
omitted.]

There is also the further issue of compatibility with various
non-Microsoft versions of DOS. These include not only the IBM and
ex-Digital Research stuff, but systems like Mike Podanoffsky's RxDOS
and Pat Villani's DOS-C (used by the `Free-DOS Project').

I think the point is that, ultimately, it has to be a matter of `do as
we do', not `do as we say'. Which doesn't, of course, mean that knowing
what the party line is, isn't useful and even interesting.

As regards specifics, a

    creat("a c e g .i k", 0666);

is certainly acceptable to MS-DOS 6.22, so it is hard to know quite
what to make of the directory/file-naming distinction for the space
character, for instance.

>
> The following special characters can also be used in long file names
> (but not short ones):
>
>     : + , ; = [ ]
>
> Blank spaces can be anywhere in the long name, but blank spaces and
> periods at the end of a long name are ignored.
>
> Case is preserved on storage, but ignored on lookup (DOS has separate
> interfaces for directory lookup as opposed to file opening).
>
>
> I can also give you the "short name generation rules" (which aren't
> really documented anywhere). They require directory iteration and
> use of a monotonically increasing numeric "tail" substitution into
> the file name (not affecting the extension, if any).
>
>
> I have somewhat of an advantage, having been involved in a project
> that ported the Heidemann framework and some of the FS modules and
> most of the BSD FS kernel environment to Windows 95. 8-).

I'd certainly appreciate all the information you can supply, if you
don't mind taking the trouble.
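
As an illustration of the two long-name lookup rules quoted above (trailing blanks and periods are ignored, and case is ignored on lookup), here is a minimal C sketch. The function names are hypothetical, and the case folding assumes plain ASCII via `tolower()` in the C locale; a real implementation would have to deal with code pages or Unicode, which this does not attempt.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: length of a long name once trailing blanks and
 * periods are dropped, per the rule quoted above.
 */
size_t
lfn_effective_len(const char *name)
{
    size_t n;

    assert(name != NULL);
    n = strlen(name);
    while (n > 0 && (name[n - 1] == ' ' || name[n - 1] == '.'))
        n--;
    return n;
}

/*
 * Hypothetical helper: returns 1 if two long names would match on
 * lookup, i.e. they are equal after dropping trailing blanks and
 * periods and ignoring case.  ASCII-only folding is an assumption.
 */
int
lfn_same(const char *a, const char *b)
{
    size_t i, la, lb;

    la = lfn_effective_len(a);
    lb = lfn_effective_len(b);
    if (la != lb)
        return 0;
    for (i = 0; i < la; i++)
        if (tolower((unsigned char)a[i]) != tolower((unsigned char)b[i]))
            return 0;
    return 1;
}
```

With these definitions, `lfn_same("Read Me.TXT", "read me.txt. ")` matches, while `lfn_same("a b", "ab")` does not, since embedded blanks are significant. The name as stored would keep its original case; only the comparison folds it.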

I've tested a lot of Linux, Mach, NetBSD, and GNU DOS FS-related code
in the last few months, and what is particularly evident is a lack of
rigorous attention to detail. Besides that, even in the generalities,
I'm sure I could learn a lot from your experience.

> The conversion to parsed-path structures greatly aids in use of
> Unicode and DOS code-page interoperability... you will need to
> incorporate a number of patches if you expect to be able to
> support two name binding, lookup, or Unicode storage (We have a
> UFS where we have made these modifications).

Yes, this is an area in the new vfatfs implementation that still needs
work.

--
Robert Nordier