From owner-freebsd-arch Fri Jan 1 14:03:11 1999
Return-Path:
Received: (from majordom@localhost)
        by hub.freebsd.org (8.8.8/8.8.8) id OAA00531
        for freebsd-arch-outgoing; Fri, 1 Jan 1999 14:03:11 -0800 (PST)
        (envelope-from owner-freebsd-arch@FreeBSD.ORG)
Received: from ns1.yes.no (ns1.yes.no [195.204.136.10])
        by hub.freebsd.org (8.8.8/8.8.8) with ESMTP id OAA00523
        for ; Fri, 1 Jan 1999 14:03:08 -0800 (PST)
        (envelope-from eivind@bitbox.follo.net)
Received: from bitbox.follo.net (bitbox.follo.net [195.204.143.218])
        by ns1.yes.no (8.9.1a/8.9.1) with ESMTP id XAA00855
        for ; Fri, 1 Jan 1999 23:02:44 +0100 (CET)
Received: (from eivind@localhost)
        by bitbox.follo.net (8.8.8/8.8.6) id XAA88406
        for freebsd-arch@freebsd.org; Fri, 1 Jan 1999 23:02:44 +0100 (MET)
Received: from critter.freebsd.dk (critter.freebsd.dk [212.242.40.131])
        by hub.freebsd.org (8.8.8/8.8.8) with ESMTP id GAA03366
        for ; Thu, 31 Dec 1998 06:08:12 -0800 (PST)
        (envelope-from phk@critter.freebsd.dk)
Received: from critter.freebsd.dk (localhost [127.0.0.1])
        by critter.freebsd.dk (8.9.1/8.8.5) with ESMTP id PAA94810
        for ; Thu, 31 Dec 1998 15:07:17 +0100 (CET)
To: arch@FreeBSD.ORG
Subject: DEVFS, the time has come...
From: Poul-Henning Kamp
Date: Thu, 31 Dec 1998 15:07:17 +0100
Message-ID: <94808.915113237@critter.freebsd.dk>
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

... to make up our mind about it.

There are a number of options open for us, and to make sure we talk about the substance let me stress once and for all that this discussion should be about the >concept< of a devfs, not the code currently in the tree.

Once consensus has been reached about what a DEVFS should be and how it should behave, we can start to study the code in the tree to see if it fits, or what it will take to get a devfs that does.

DEVFS, HERITAGE, CONCEPT and ADVANTAGES:

I talked to Dennis Ritchie about the history of device nodes (CHR/BLK) when I ate breakfast with him in New Orleans.
Originally it was very crude but simple: inode number 7 was the printer, 8 was the disk, and so on. The root-dir was around 40 at the time. This soon became a problem, so the device nodes were introduced, and we've had them ever since.

Device nodes are a shortcut from one namespace (filesystem) to another (cdevsw/bdevsw arrays of device drivers). According to Dennis Ritchie there is no compelling reason to have a separate namespace for anything if you could put it in the filesystem (Plan 9, anybody?)

So what a devfs does is to remove the cdevsw and bdevsw namespaces of device drivers and instead attach not device drivers but instances of devices into the filesystem directly.

The issues a DEVFS is trying to solve are the following:

1. Static mapping of major devices must be maintained. This means that 3rd party drivers need to be catalogued and assigned numbers, and major numbers are a limited resource: there are only 256 of them.

2. Dynamic creation of the needed device nodes, instead of magic shell scripts (MAKEDEV), such that the found devices are available in /dev, and if the device is not there on the next boot, the nodes in /dev are gone again. This is a major tickbox item for people working on Plug&Play, Cardbus, PCMCIA and other dynamic configuration technologies.

3. Avoid the NFOOBAR definitions in the drivers. A DEVFS would allow the device driver to attach sufficient information to the vnode that it can find both its "softc" structure and a unit number from the vnode. Currently only the minor device number is available, and that only allows the unit number to be found; the softc struct must be found by taking (a subset of) the minor number as an array index. DEVFS will also remove the need to check the validity of the minor number in the drivers.

4. Clone devices. Rather than define 64 ptys in the system, the system should make new ones on demand.
This is hard, if not downright impossible, to do if you have to mknod something to be allowed to use the thing.

There is no doubt that the sources will be cleaner and have fewer implicit cross dependencies with a well implemented DEVFS; for instance the code in UFS which special cases device nodes can be removed.

There are some issues relating to devices and chroot jails, but they are well understood and no major trouble to implement, and I hope we can just ignore them in this discussion for now, as they are a subset of the general problems, and present no new or unique aspects of these problems.

I don't currently know of anybody disagreeing with any of the above, but feel free to raise arguments against it if you have any.

The PROBLEM:

The sticky issue about DEVFS, at least in FreeBSD, is called "persistence".

NON-PERSISTENT DEVFS:

A "non-persistent" devfs boots up with all found devices visible (probably mode 0600 root.wheel, 0700 for directories), and a script in /etc/dev.rc will contain the policy for the devices:

        chmod 660 cua* ; chown uucp.dialer cua*
        chmod 600 tty[dil]* ; chown root.wheel tty[dil]*
        chmod 640 fd[0-9]* ; chown root.operator fd[0-9]*
        ...

If you remove a device, it's gone. There is no way to get it back short of a reboot, and it will be back after the next reboot if the driver is there and finds its hardware, so to remove a device as a policy, you would need to put the rm command in /etc/dev.rc.

If you "whiteout" a device, you can get it back with "undelete" and don't need a reboot for it.

You can create symbolic and hard links and directories in the DEVFS, but they will be gone on the next reboot, so if you want them around all the time, put them in /etc/dev.rc.

There is no need for any special userland tools.

I believe this completely describes all aspects of a non-persistent DEVFS.

PERSISTENT DEVFS:

A "persistent" devfs will use some kind of stable storage to track the devices with.
The first time a device is found, the driver will suggest what mode, owner and group to set on the device.

If root goes into the filesystem and does

        chmod 600 cua*

the currently found cua* devices will get this new mode, and will have that mode also after a reboot. This includes the case where for half a year the hardware is not there and then suddenly some hardware comes back that fits the bill. Once recorded in the persistence database, there is no way to say "restore defaults mode/owner/group" short of setting it manually.

If a new cua device appears, it will still come up with the default driver based permissions. The wildcard aspect of the above command is not recorded, so it is left to root to enforce his policies manually.

If a device is removed, it will not come back after reboot, so undelete will have to work on removed as well as whiteout'ed devices, effectively making whiteout and unlink the same thing.

You can create symbolic and hard links and directories in the DEVFS, and they will be there after reboot.

Implementing the persistence in the filesystem is messy, intricate and will take up significant amounts of code.

SOME of the issues not addressed in this description:

* Format of the persistence database: ascii file, binary file, shadow inodes?

* How to manually list/edit the persistence database? (Tradeoff between ascii parsing code in the kernel vs. specialized reading/writing/editing userland tools.)

* Modifying the persistence database for devices not currently found. (As above for specialized tools.)

* Garbage collecting the persistence database. (Ditto.)

* What happens if e.g. a symlink in the database collides with a newfound device, which entry takes precedence?

* If the persistence database lives in a filesystem, how does the kernel locate it at boot time?

HISTORY:

We have had a DEVFS implementation in the tree for 32 months by now. That means from before 2.0.5 was released.
The reason we still don't have a DEVFS as standard is that this persistence vs. non-persistence question has not been sorted out.

It is high time to get this thing settled and move on.

OPINION:

My personal preference is to take a non-persistent DEVFS.

I have never changed the mode or deleted something in a /dev directory without it being a matter of policy. I think any such policy is far better expressed in a shell script run at boot time, where I can use all the facilities of the shell to implement my policy.

Having my policy in only one place (unless I myself choose to split it), in a well known form (shell script), where I can put comments on it, and even have it under version control, makes me feel good. In particular I like the idea of having wildcard names help make sure my policy also covers any devices added later in time.

I can trust the contents of /dev to be in a known and well defined state after a reboot, a state which is conceptually easy to understand and readable in standard syntax for the user. No new tools to learn and know about.

I do not feel as confident this would be the case with a persistent DEVFS. I don't like the concept of "shadow databases" expressed in pseudo-form through another database mechanism.

I would need to be able to edit or at the very least read the persistence database (sound of agonized cries from AIX users heard in the background).

How would I edit an entry in the persistence database for a device I do not currently have in my system? What happens if I edit the database and somebody else does a chown at the same time? Can I add dormant entries to the database so that any devices appearing later will be set according to my policy? Can I use wildcards for it?

It sounds to me like it will be much harder to implement a policy and enforce it, for a persistent DEVFS.
It is obviously out of the question to implement the full shell syntax in the kernel, so either we need a special userland process to translate to and from a standard format, or a special toolset to list/edit the binary database. We're in essence talking about adding another namespace, a prospect that makes removing the cdevsw/bdevsw namespace pretty pointless in my book.

How about the case where people try out some gadget, forget about it for a number of months and buy some other gadget instead which the same driver recognizes? Then suddenly some old settings appear out of nowhere which may not even apply to that device, and since the device is there in the database, not even the device driver gets a chance to say what it feels about the issue. Or even worse, the device was removed, so the "new" hardware looks like it doesn't work because nothing shows up in /dev. I'm not fielding the support line on this issue.

The fact that it is so much simpler to express the functioning of a non-persistent DEVFS, and that so many thorny issues are tangled up in the persistent DEVFS, makes me think that any advantages of a persistent DEVFS (I see none) are run over, rolled flat, scraped up and thrown out by the Keep It Simple Principle.

How many people would ever know the difference anyway? Very few, I presume. I think most people stick with the default permissions, and the few who don't probably know what they are doing. They will therefore be perfectly capable of getting either of the two models to do what they want, maybe with the same bias as me that having a shell script to do it in is both cleaner and easier.

Summary:

I cannot see who in our user community will benefit from persistence in DEVFS, I don't see what benefits it brings, and I think it is overly complicated, hard to implement right, and error-prone in action.
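
As a rough illustration of the /etc/dev.rc policy script described under NON-PERSISTENT DEVFS, here is a minimal sketch in plain sh. It is not code from the FreeBSD tree. The DEVDIR and DRYRUN variables and the run and apply_policy helpers are hypothetical names added here so the policy can be tried on a scratch directory without root privileges. The owner:group colon form is used in place of the older dot form shown in the message.

```shell
#!/bin/sh
# Hypothetical dev.rc-style policy script: a sketch, not FreeBSD code.
# DEVDIR and DRYRUN are assumptions added for illustration only.

DEVDIR="${DEVDIR:-/dev}"     # directory holding the device nodes
DRYRUN="${DRYRUN:-yes}"      # "yes" prints the commands instead of running them

# run CMD ARGS...: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRYRUN" = "yes" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# apply_policy MODE OWNER PATTERN
# Set mode and owner on every node in DEVDIR matching the shell glob PATTERN.
apply_policy() {
    mode=$1 owner=$2 pattern=$3
    for node in "$DEVDIR"/$pattern; do
        [ -e "$node" ] || continue   # the glob matched nothing
        run chmod "$mode" "$node"
        run chown "$owner" "$node"
    done
}

# The policy itself, using the wildcard rules quoted in the message.
apply_policy 660 uucp:dialer   'cua*'
apply_policy 600 root:wheel    'tty[dil]*'
apply_policy 640 root:operator 'fd[0-9]*'
```

Because the rules are globs evaluated each time the script runs, a device that first shows up on a later boot is covered without any change to the script, which is the wildcard property argued for above. Setting DRYRUN=no and running as root would apply the commands for real.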

[My only concern with a non-persistent DEVFS is the permissions on device nodes that appear due to an event (e.g., a card insertion), and I think that can be adequately addressed by having a flag for a DEVFS mount that stops new nodes from automatically appearing in that instance of DEVFS. -EE]

--
Poul-Henning Kamp             FreeBSD coreteam member
phk@FreeBSD.ORG               "Real hackers run -current on their laptop."
"ttyv0" -- What UNIX calls a $20K state-of-the-art, 3D, hi-res color terminal