Date: Thu, 24 Sep 1998 10:03:10 +1000
From: Bruce Evans <bde@zeta.org.au>
To: bright@hotjobs.com, peter@netplex.com.au
Cc: current@FreeBSD.ORG
Subject: Re: Current is Really Broken(tm)
Message-ID: <199809240003.KAA08756@godzilla.zeta.org.au>
>The main problem with DEVFS is inherent in the way that it creates and
>removes nodes on the fly for things like disk partitions that actually
>exist.
>
>The problem is that DEVFS doesn't know if /dev/wd1s1e (for example) exists
>until the disklabel is read. The disk label isn't read until the device
>is opened. To open the device, you have to read the /dev/wd1s1e node..
>Catch-22. (Opening /dev/wd1s1 also causes the disklabel to be read, but
>that's not the name of the device in /etc/fstab)

Not a real problem. Use something like "test< /dev/wd1" to open the device.
The real problem is in the implementation of rebuilding the device entries
after the disk goes away...

>Another other way is to do what the SLICE code did. It proactively probed
>the disks after it was safe to do so, and used that information to populate

It actually probed the disks when it was UNsafe to do so (inside an
interrupt handler).

>Under the SLICE code, drivers have to do a different IO request mechanism,
>arrange a callback so they can probe themselves for partitioning and
>geometry information later in the boot and handle removeable media etc by
>communicating with devfs for each diskchange etc.

I don't understand the attraction of callbacks. Only a limited number of
things can be done in interrupt context. Accessing filesystem layers is
not one of them.

Cam seems to have similar complications. When I tried adding missing devfs
initialization to scsi_da.c, it failed because daregister() seems to be the
only reasonable place to complete per-device initialization, but daregister()
is an interrupt handler.

Initialization can easily be handled in process context using a kernel
process. Not so for reinitialization after a device goes away or appears
asynchronously. At best, you'll get notified asynchronously and have to
push the handling to process context.
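
The last point, getting notified asynchronously and pushing the handling to
process context, can be illustrated with a minimal user-space sketch. It is
an analogy under stated assumptions, not kernel code: pthreads stand in for
a kernel process, and every name here (event_queue, notify_async, worker,
run_demo) is hypothetical. The notifier only records the event; the worker
thread does the slow part, which in the kernel would be rebuilding the
device entries.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_LEN 16

enum dev_event { DEV_ARRIVED, DEV_DEPARTED };

struct event_queue {
    pthread_mutex_t lock;
    pthread_cond_t  nonempty;
    enum dev_event  events[QUEUE_LEN];
    size_t          head, count;
    bool            shutdown;
    int             handled;   /* events processed by the worker */
};

static void queue_init(struct event_queue *q)
{
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->nonempty, NULL);
    q->head = q->count = 0;
    q->shutdown = false;
    q->handled = 0;
}

/*
 * Stand-in for the asynchronous notification.  It records the event and
 * returns at once.  (A real interrupt handler could not take a sleeping
 * mutex; this is a user-space analogy only.)  Returns false if full.
 */
static bool notify_async(struct event_queue *q, enum dev_event ev)
{
    bool ok = false;

    pthread_mutex_lock(&q->lock);
    if (q->count < QUEUE_LEN) {
        q->events[(q->head + q->count) % QUEUE_LEN] = ev;
        q->count++;
        ok = true;
        pthread_cond_signal(&q->nonempty);
    }
    pthread_mutex_unlock(&q->lock);
    return ok;
}

/* Stand-in for the kernel process: drains the queue, then exits on shutdown. */
static void *worker(void *arg)
{
    struct event_queue *q = arg;

    pthread_mutex_lock(&q->lock);
    for (;;) {
        while (q->count == 0 && !q->shutdown)
            pthread_cond_wait(&q->nonempty, &q->lock);
        if (q->count == 0)
            break;                      /* shutdown requested, queue drained */
        enum dev_event ev = q->events[q->head];
        q->head = (q->head + 1) % QUEUE_LEN;
        q->count--;
        pthread_mutex_unlock(&q->lock);
        (void)ev;                       /* slow, sleepable work would go here */
        pthread_mutex_lock(&q->lock);
        q->handled++;
    }
    pthread_mutex_unlock(&q->lock);
    return NULL;
}

/* Posts two events, shuts the worker down, returns how many were handled. */
int run_demo(void)
{
    struct event_queue q;
    pthread_t tid;

    queue_init(&q);
    if (pthread_create(&tid, NULL, worker, &q) != 0)
        return -1;
    notify_async(&q, DEV_DEPARTED);
    notify_async(&q, DEV_ARRIVED);

    pthread_mutex_lock(&q.lock);
    q.shutdown = true;
    pthread_cond_signal(&q.nonempty);
    pthread_mutex_unlock(&q.lock);

    int rc = pthread_join(tid, NULL);
    assert(rc == 0);
    (void)rc;

    int n = q.handled;
    pthread_cond_destroy(&q.nonempty);
    pthread_mutex_destroy(&q.lock);
    return n;
}
```

Calling run_demo() from a main() and building with -pthread is enough to
try it. The sketch does not address the harder problem described in this
message: what the process-context handler should do about partitions that
are still open or mounted.
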
>A hack solution is to hack fsck, mount, etc so that if you have /dev/wd0s1e
>intended to be mounted on /home, and /dev/wd0s1e doesn't exist, then have
>them first open/close /dev/wd0s1 to cause the disklabel to get read.

This doesn't work, because /dev/wd0s1 doesn't exist either. /dev/rwd0
exists. It only sort of works. I see the following behaviour:

    $ cd /devfs_mountpoint
    $ ls rwd0s1
    ls: rwd0s1: No such file or directory
    $ ls rwd0s1
    rwd0s1
    $ cp rwd0s1 null
    cp: rwd0s1: Input/output error
    # Above error is caused by devfs revoking rwd0s1 underneath us
    # when dsopen() rebuilds the device entries.
    $ cp rwd0s1 null
    ^C
    # Above non-error is caused by some bug in last-close that prevents
    # dsopen() from seeing the need to rebuild the device entries.

Things work better if some partition is kept open, e.g. using
"sleep 2000000000 </dev/rwd0". Then dsopen() doesn't attempt to handle
the device going away and doesn't rebuild the device entries.

>I think that's about the size of the situation.. There isn't all that much
>wrong with DEVFS, just that it's got some pretty hairy gotchas. There are/
>were problems with it's vnode usage, but I think they've been covered.

Naah, the ones that affect slicing are still there. The slice code wants
devfs_remove_dev() to work like unlink() and not affect existing opens.
Devfs wants to completely remove the device, and begins by revoking the
vnode. Rebuilding the device entries in open is the easiest case (it is
only done when there are no open partitions on the disk except for the
one being opened). The hardest case is rebuilding them for a
DIOCSYNCSLICEINFO ioctl. Then the revoke hangs on mounted partitions.

Bruce

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-current" in the body of the message
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?199809240003.KAA08756>