Date:      Thu, 24 Sep 1998 10:03:10 +1000
From:      Bruce Evans <bde@zeta.org.au>
To:        bright@hotjobs.com, peter@netplex.com.au
Cc:        current@FreeBSD.ORG
Subject:   Re: Current is Really Broken(tm)
Message-ID:  <199809240003.KAA08756@godzilla.zeta.org.au>

>The main problem with DEVFS is inherent in the way that it creates and 
>removes nodes on the fly for things like disk partitions that actually 
>exist.
>
>The problem is that DEVFS doesn't know if /dev/wd1s1e (for example) exists 
>until the disklabel is read.  The disk label isn't read until the device 
>is opened.  To open the device, you have to read the /dev/wd1s1e node.
>Catch-22.  (Opening /dev/wd1s1 also causes the disklabel to be read, but 
>that's not the name of the device in /etc/fstab)

Not a real problem.  Use something like "test< /dev/wd1" to open the
device.  The real problem is in the implementation of rebuilding the
device entries after the disk goes away...

>Another way is to do what the SLICE code did.  It proactively probed
>the disks after it was safe to do so, and used that information to populate

It actually probed the disks when it was UNsafe to do so (inside an
interrupt handler).

>Under the SLICE code, drivers have to do a different IO request mechanism,
>arrange a callback so they can probe themselves for partitioning and
>geometry information later in the boot and handle removable media etc by
>communicating with devfs for each diskchange etc.

I don't understand the attraction of callbacks.  Only a limited number of
things can be done in interrupt context.  Accessing filesystem layers is
not one of them.  CAM seems to have similar complications.  When I tried
adding missing devfs initialization to scsi_da.c, it failed because
daregister() seems to be the only reasonable place to complete per-device
initialization, but daregister() is an interrupt handler.

Initialization can easily be handled in process context using a kernel
process.  Not so for reinitialization after a device goes away or appears
asynchronously.  At best, you'll get notified asynchronously and have to
push the handling to process context.
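
As a rough userland analogy only (this is not kernel code), pushing the
handling to process context can be sketched as below.  A queue and a worker
thread stand in for a kernel process and its work list.  The names
run_demo, on_device_event and rebuild_entries are hypothetical and exist
only for this illustration.

```python
import queue
import threading

def run_demo(devices):
    """Deliver 'device changed' events and handle them on a worker thread."""
    events = queue.Queue()   # hand-off point between the two contexts
    rebuilt = []             # devices whose entries were "rebuilt"

    def on_device_event(device):
        # Stand-in for the asynchronous notification: enqueue and return,
        # doing no blocking or filesystem work here.
        events.put_nowait(device)

    def rebuild_entries(device):
        # Stand-in for work that needs process context.
        rebuilt.append(device)

    def worker():
        while True:
            device = events.get()   # may block; fine in "process context"
            if device is None:      # sentinel: no more events
                break
            rebuild_entries(device)

    thread = threading.Thread(target=worker)
    thread.start()
    for device in devices:
        on_device_event(device)
    events.put(None)
    thread.join()
    return rebuilt
```

The notification side only enqueues; everything that might sleep or touch
filesystem layers happens in the worker.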

>A hack solution is to hack fsck, mount, etc so that if you have /dev/wd0s1e
>intended to be mounted on /home, and /dev/wd0s1e doesn't exist, then have
>them first open/close /dev/wd0s1 to cause the disklabel to get read.  

This doesn't work, because /dev/wd0s1 doesn't exist either.  /dev/rwd0
exists.  It only sort of works.  I see the following behaviour:

	$ cd /devfs_mountpoint
	$ ls rwd0s1
	ls: rwd0s1: No such file or directory
	$ ls rwd0s1
	rwd0s1
	$ cp rwd0s1 null
	cp: rwd0s1: Input/output error
	# The above error is caused by devfs revoking rwd0s1 underneath us
	# when dsopen() rebuilds the device entries.
	$ cp rwd0s1 null
	^C
	# Above non-error is caused by some bug in last-close that prevents
	# dsopen() from seeing the need to rebuild the device entries.

Things work better if some partition is kept open, e.g. using
"sleep 2000000000 </dev/rwd0".  Then dsopen() doesn't attempt to handle
the device going away and doesn't rebuild the device entries.

>I think that's about the size of the situation..  There isn't all that much
>wrong with DEVFS, just that it's got some pretty hairy gotchas. There are/
>were problems with its vnode usage, but I think they've been covered.

Naah, the ones that affect slicing are still there.  The slice code wants
devfs_remove_dev() to work like unlink() and not affect existing opens.
Devfs wants to completely remove the device, and begins by revoking
the vnode.  Rebuilding the device entries in open is the easiest case
(it is only done when there are no open partitions on the disk except
for the one being opened).  The hardest case is rebuilding them for a
DIOCSYNCSLICEINFO ioctl.  Then the revoke hangs on mounted partitions.
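
For comparison, the unlink() behaviour that the slice code wants can be
seen on an ordinary file on a POSIX system: the name goes away, but an
existing open keeps working.  This is a minimal sketch using a temporary
regular file, not a device node; read_after_unlink is a hypothetical helper
name.  It does not show the revoke behaviour, which would invalidate the
open descriptor instead.

```python
import os
import tempfile

def read_after_unlink(payload=b"slice data"):
    """Unlink a file while it is open, then read it through the open fd."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, payload)
        os.unlink(path)                     # remove the name only
        name_exists = os.path.exists(path)  # the name is gone
        os.lseek(fd, 0, os.SEEK_SET)
        data = os.read(fd, len(payload))    # the existing open is unaffected
    finally:
        os.close(fd)                        # storage is freed on last close
    return name_exists, data
```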

Bruce

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-current" in the body of the message


