FreeBSD Mail Archives

Date:      Wed, 29 Mar 95 11:25:07 MST
From:      terry@cs.weber.edu (Terry Lambert)
To:        dufault@hda.com (Peter Dufault)
Cc:        darrenr@vitruvius.arbld.unimelb.edu.au, freebsd-hackers@freefall.cdrom.com
Subject:   Re: Configuring driver added via LKM
Message-ID:  <9503291825.AA20071@cs.weber.edu>
In-Reply-To: <199503291017.FAA03749@hda.com> from "Peter Dufault" at Mar 29, 95 05:17:46 am

> No, I disagree.  That logic shouldn't be LKM code that isn't present
> in a config'd driver, it should be a standard driver entry point
> similar to probe and attach.

In the case of Win95, this entry point is called by what they call
a "volume tracking driver"; basically, something that knows about
stuff "going away".

The problem with LKM unload is non-trivial.  There are several types
of unload mechanisms it's desirable to implement.

The unload for file systems either needs to imply a forcible unmount
or deny the unload.  I took Sun's approach when first implementing
this, and the unload was allowed to return "EBUSY".  This is
grossly underpowered for the types of applications that are now
cropping up.

In reality, you want a rundown mechanism... basically a "schedule
for unload" rather than an unload.  This could allow a file system
(for instance) to run down one of two ways: 1) commit buffers and
forcibly unmount, invalidating any descriptors open on the volume
in the same way NFS does -- "stale handle", OR 2) prevent further
operations on the drive but do not interfere with those in progress;
this could take a significant amount of time to complete.
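
As a rough illustration of the two rundown styles described above, the
request could carry a mode.  This is only a sketch: the names
rundown_mode, fs_module and fs_rundown are hypothetical and do not
exist in the LKM code, and the "unmount" is reduced to bookkeeping.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical rundown modes, for illustration only. */
enum rundown_mode {
    RUNDOWN_FORCE,  /* commit, unmount, invalidate open descriptors */
    RUNDOWN_DRAIN   /* refuse new operations, let current ones finish */
};

/* Hypothetical per-module state; a real file system tracks far more. */
struct fs_module {
    int open_refs;  /* descriptors still open on the volume */
    int accepting;  /* nonzero while new operations are allowed */
    int stale;      /* nonzero once descriptors were invalidated */
};

/*
 * Schedule the module for unload.  Returns 0 if the unload may proceed
 * now, or EBUSY if it has been scheduled but must wait for a drain.
 */
int fs_rundown(struct fs_module *m, enum rundown_mode mode)
{
    assert(m->open_refs >= 0);
    m->accepting = 0;               /* both modes stop new operations */
    if (mode == RUNDOWN_FORCE) {
        m->stale = 1;               /* holders now see "stale handle" */
        m->open_refs = 0;
        return 0;
    }
    return m->open_refs == 0 ? 0 : EBUSY;
}
```

In the drain case the EBUSY return means "scheduled, not yet done"
rather than "denied", which is the difference from the original
Sun-style behaviour.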

The forcible run down (approach #1 above) has a lot of merit for
hardware that you can press a button on that means "request eject"
instead of "eject by purely mechanical means" (a purely mechanical
eject is doomed to fail for file systems anyway, since there is no
mechanism to notify the file system that a commit should take place;
the only real alternative for ensuring media integrity is to use
only synchronous writes on the media).  The forcible rundown also
has a lot of merit for developers wanting to replace a module
with an updated version NOW instead of some time in the future.

The problem with this is that there *is* a requirement that the
driver know that it is or can be loaded as a module.

If we take the system call case, we have to ensure that the system
call is not entered when the unload takes place.  To do this, we
have to track entrance and exit to the call, and only unload the
call when the entrancy count is 0.

For this to be effective, we have to ensure that the call will not
continue to be entered, but since the entry is not isosynchronous, this
means that short of rewriting trap.c to be aware of the ability to
load calls and taking a hit on non-loaded calls, we have to have a
shunt based on a call-global flag that allows the call itself to
return ENOSYS to callers while it is still loaded.  The run down
entry sets this flag to prevent subsequent access.
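
The shunt can be sketched as follows.  This is a user-space model
under stated assumptions, not kernel code: struct call_state,
loaded_call, call_rundown and do_work are hypothetical names, and a
real kernel would need the flag and count to be updated atomically or
under a lock.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical call-global bookkeeping; not an existing structure. */
struct call_state {
    int entrancy;   /* callers currently inside the call */
    int rundown;    /* set by the rundown entry to shunt new callers */
};

/* Stand-in for the real work of the loaded call. */
static int do_work(int arg)
{
    return arg * 2;
}

/* The loaded call, with the shunt as its first statement. */
int loaded_call(struct call_state *c, int arg, int *result)
{
    if (c->rundown)
        return ENOSYS;      /* still loaded, but behaves as if absent */
    c->entrancy++;          /* track entrance ... */
    *result = do_work(arg);
    assert(c->entrancy > 0);
    c->entrancy--;          /* ... and exit */
    return 0;
}

/*
 * Rundown entry: prevent subsequent access.  Returns nonzero when no
 * caller is inside, so the unload could proceed at once.
 */
int call_rundown(struct call_state *c)
{
    c->rundown = 1;
    return c->entrancy == 0;
}
```

Once call_rundown() has been called, every new caller gets ENOSYS
even though the code is still resident.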

Similarly, since the unload is now based on an event internal to the
system call, to wit, the decrementing of the entrancy count to 0,
the decrement itself must make two additional compares; one for the
1->0 transition, and one for the rundown flag to allow it to cause
the unload to be retriggered.  This is safest if done by causing a
wakeup of a different context so that the unload is handled not in
the code being unloaded prior to the call return (we could only
safely do that if the kernel was non-preemptible and non-reentrant,
since it would mean running code in an area designated as reallocable
for other uses -- at odds with our long term goals).
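
A sketch of that exit path, with the two compares on the decrement.
All names (struct call_track, call_enter, call_exit, unloader_poll)
are hypothetical, and the unload_pending field merely stands in for a
wakeup of a separate kernel context; nothing here is existing kernel
API.

```c
#include <assert.h>

/* Hypothetical tracking state for one loaded call. */
struct call_track {
    int entrancy;        /* callers currently inside the call */
    int rundown;         /* rundown has been requested */
    int unload_pending;  /* stands in for waking another context */
};

void call_enter(struct call_track *c)
{
    c->entrancy++;
}

/*
 * Exit path: one compare for the 1->0 transition, one for the rundown
 * flag.  The unload itself is NOT done here, since this code is part
 * of what would be unloaded.
 */
void call_exit(struct call_track *c)
{
    assert(c->entrancy > 0);
    if (--c->entrancy == 0 && c->rundown)
        c->unload_pending = 1;   /* in a kernel: wake the unloader */
}

/*
 * Runs in a different context.  Returns 1 if it performed the
 * (simulated) unload, 0 if there was nothing to do.
 */
int unloader_poll(struct call_track *c)
{
    if (!c->unload_pending)
        return 0;
    c->unload_pending = 0;
    return 1;
}
```

With two callers inside and rundown requested, the first call_exit()
does nothing further; only the last one out retriggers the unload.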

The alternative to this approach is to enter in the system call table
the address of a function in the LKM system call load component that
calls the loaded system call itself.  This includes not only the
compares and entrancy tracking that would otherwise be exposed in
the driver code, but also causes the addition of function call
overhead to each call into the loaded call -- although if we could
designate at load time that this module was load-only, the pointers
could be fixed up so that the overhead could be avoided.
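
The wrapper alternative might look roughly like this.  It is a
minimal user-space sketch: lkm_slot, lkm_wrapper, lkm_load and
sample_call are hypothetical names, the single static slot stands in
for a per-call table, and locking is omitted.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

typedef int (*call_fn)(int arg);

/* Hypothetical slot owned by the LKM load component, not the module. */
struct lkm_slot {
    call_fn target;   /* the loaded call, or NULL when not loaded */
    int entrancy;     /* callers currently inside the loaded call */
    int rundown;      /* rundown requested; shunt new callers */
};

static struct lkm_slot slot;    /* one slot, for illustration */

/* Example loaded call; knows nothing about being a module. */
int sample_call(int arg)
{
    return arg == 0 ? EINVAL : 0;
}

/* What the system call table points at in the unloadable case. */
int lkm_wrapper(int arg)
{
    int error;

    if (slot.rundown || slot.target == NULL)
        return ENOSYS;
    slot.entrancy++;
    error = slot.target(arg);   /* the extra function call overhead */
    assert(slot.entrancy > 0);
    slot.entrancy--;
    return error;
}

/*
 * Load a call and return the pointer to place in the call table.  A
 * load-only module gets its own address, avoiding the wrapper.
 */
call_fn lkm_load(call_fn fn, int load_only)
{
    slot.target = fn;
    slot.entrancy = 0;
    slot.rundown = 0;
    return load_only ? fn : lkm_wrapper;
}
```

The loaded call stays free of LKM-specific code; the cost is one
indirect call per entry unless the module is declared load-only.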

Finally, there is a problem in the system call with signal handling;
the current signal implementation in the trap code is to declare a
global jump buffer in the process context to be used in case of
a system call interrupted by signal.  The problem with this is, of
course, that an exit via this path violates the single entry/exit
criteria that allow us to do the entrancy tracking in the first
place.  The function wrapper approach would solve this at some high
cost, but so would making the jump buffer a stack variable and
passing it into the system call itself as an argument; this is
actually a superior approach on both counts, since it avoids the
jump buffer diddling that would otherwise have to occur in the wrapper
function to get the correct call back into the trap level code on
signal interrupt.
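
One possible reading of the stack-variable approach, modelled in user
space with setjmp/longjmp.  The names dispatch and sample_call are
hypothetical, the "interrupted" argument stands in for a signal
arriving during the call, and the choice to balance the count inside
the call before jumping is an assumption, not something the trap code
does today.

```c
#include <assert.h>
#include <errno.h>
#include <setjmp.h>

static int entrancy;    /* hypothetical entry count for the call */

/*
 * A call that receives the jump buffer as an argument.  Because it
 * holds the buffer itself, it can balance the entrancy count before
 * taking the non-local exit.
 */
static int sample_call(jmp_buf *intr, int interrupted)
{
    entrancy++;
    if (interrupted) {
        entrancy--;             /* balance before leaving this way */
        longjmp(*intr, 1);      /* stands in for the signal exit */
    }
    assert(entrancy > 0);
    entrancy--;
    return 0;
}

/*
 * Trap-level dispatch: the jump buffer lives on this frame's stack
 * instead of in a per-process global.
 */
int dispatch(int interrupted)
{
    jmp_buf intr;

    if (setjmp(intr) == 0)
        return sample_call(&intr, interrupted);
    return EINTR;               /* arrived here via longjmp */
}
```

Either exit leaves the count at zero, so the single entry/exit
bookkeeping still holds without a wrapper having to swap jump
buffers.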

Of course, that's just for hiding the LKM internals from the system
calls code itself, as has been proposed by the statement that the
driver should not have to have special code for it to be an LKM.  8-).

> I would probably (incorrectly but expeditiously in that it isn't
> really an isa problem) implement this by changing the definition
> of isa_driver to include "goaway(struct isa_device *isdp)", but I
> think that the "goaway" entry point in "kern_devconf" is supposed
> to do this.
> 
> In isa.c I would add something like:
> 
> isa_install_driver(struct isa_device *isdp, u_int *mp);
> isa_remove_driver(struct isa_device *isdp, u_int *mp);
> 
> "isa_install_driver" will pretty much just call config_isa_dev_c.
> 
> "isa_remove_driver" will call the driver goaway entry point, and
> if it returns 0, removes the isr if it was specified.  The goaway
> entry point will stop all activity if it can, deregister itself
> from kern_devconf, and so on.  At that point you can safely unload
> the LKM.

It's a little more complicated than that, unless you want to block
the process requesting the unload until the unload can be safely
completed (as has been shown for system calls).

In the case of a device that hooks an interrupt, it is not safe to
unhook the interrupt until such time as a detach procedure can be
run to guarantee that an interrupt on the hooked interrupt will not
fire on the device and have no driver to handle it.

What this basically means is that you must have an "action that may
result in an interrupt ending event" count -- similar to the entrancy
count for system calls.

That means that all outstanding requests to (say) a PCMCIA AIC7770
based SCSI controller must have completed.

This is, in effect, precisely the same rundown condition as for
system call entrancy.  The added complication here is that the
device may go away because it was taken away, or the device must
act as if it had been taken away, causing no additional activity
once its pending operations have fully run down.

Bottom line in this case is that additional entry points won't
necessarily cut it; what is needed is a callback notification
mechanism to indicate run down complete.  Since we already know we
are going to be dealing with things like PCMCIA and other "hot plug"
devices (like SCSI tape drives that are externally attached and may
be powered on and off separately from the machine) this should be
a more general mechanism than one that would apply only to LKMs.
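
A minimal sketch of such a callback mechanism, combining the pending
operation count with a "run down complete" notification.  Every name
here (dev_rundown, dev_op_start, dev_op_complete,
dev_request_rundown, mark_done) is hypothetical; this is a user-space
model without locking, not an existing kernel interface.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*rundown_done_fn)(void *arg);

/* Hypothetical per-device rundown state. */
struct dev_rundown {
    int pending;            /* operations that may still interrupt */
    int draining;           /* rundown requested; refuse new work */
    rundown_done_fn done;   /* called once when run down completes */
    void *done_arg;
};

/* Example callback: records completion in an int the caller owns. */
void mark_done(void *arg)
{
    *(int *)arg = 1;
}

/* Fire the callback once, when draining and nothing is pending. */
static void dev_maybe_done(struct dev_rundown *d)
{
    if (d->draining && d->pending == 0 && d->done != NULL) {
        rundown_done_fn fn = d->done;

        d->done = NULL;
        fn(d->done_arg);
    }
}

/* Start an operation; returns 0, or -1 if the device is going away. */
int dev_op_start(struct dev_rundown *d)
{
    if (d->draining)
        return -1;
    d->pending++;
    return 0;
}

/* Called from the completion (interrupt) path of an operation. */
void dev_op_complete(struct dev_rundown *d)
{
    assert(d->pending > 0);
    d->pending--;
    dev_maybe_done(d);
}

/* Ask for rundown; the callback says when detach is safe. */
void dev_request_rundown(struct dev_rundown *d, rundown_done_fn fn,
    void *arg)
{
    d->draining = 1;
    d->done = fn;
    d->done_arg = arg;
    dev_maybe_done(d);
}
```

The requester does not block; it is told through the callback when
the interrupt can safely be unhooked.  The same structure would serve
a PCMCIA eject request or an LKM unload.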

> If you had these facilities you could pretty quickly come up with
> a utility that would install a driver's .o file directly without
> any LKM glue.  That would be nice for testing drivers.

Again, I think this would be predicated on wrapping the function
entry points with hidden LKM glue, at a significant overhead,
unless you make it clear that the driver is a load-only driver
and will not be unloaded.

I think in general that the features that make a driver a good
"plug-n-play" citizen will do the same for making it a good LKM
citizen; the ability to do both types of rundown is a required
feature in either case, and the question of whether you unload
an inactive driver or leave it in the kernel is rather moot.

The final missing piece is the ability to non-destructively attach
the driver while it is loaded; I don't think the attach is clean
enough for this, and there is no "resource manager" that will
notice the new "plug-n-play" device and do the appropriate callbacks
to "plug-n-play" aware drivers to try and get them to notice the
new hardware.

> What I haven't figured out is how this is supposed to play with
> kern_devconf or with the reconfig code already in isa.c supporting
> removable devices.  After all, this isn't isa specific and I think
> that kern_devconf is trying to address these issues.

Unfortunately, a lot of this is legacy code; in reality, the whole
device issue really wants to be glossed over entirely by allowing
the drivers to dynamically create and delete device nodes.  In
other words, a devfs with an implied mount that you can't get rid
of.

This little promiscuous wiring into the file system still needs to
be done; at the same time, you probably want to murder specfs, but
you may want to keep it around anyway to allow static /dev's to be
on / partitions for diskless clients running older OSs.

The final hook is for volume management so that when a device wants
to go away, the file system is forcibly shut down and buffers flushed
to allow the device to go away without blowing up the stateful mount
on the device.


What is really needed is for a group to go through a real architectural
overview and planning of what should be done and to get a clean API
layering designed.  The current approach of incrementally handling
what is wanted at the moment, followed by the liberal pouring of
glue to fill in the cracks, is probably a mistake.


					Terry Lambert
					terry@cs.weber.edu
---
Any opinions in this posting are my own and not those of my present
or previous employers.

Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?9503291825.AA20071>
