Date: Thu, 23 Oct 1997 16:40:01 +0000 (GMT)
From: Terry Lambert <tlambert@primenet.com>
To: hackers@FreeBSD.ORG
Subject: Re: FreeBSD 3.0 kernel API ?!
Message-ID: <199710231640.JAA23174@usr02.primenet.com>
In-Reply-To: <199710220334.UAA23820@kithrup.com> from "Sean Eric Fagan" at Oct 21, 97 08:34:20 pm

> >> How will you deal with struct ifnet when we rename all the member
> >> variables from their current names to "opaque_variable_01" through
> >> "opaque_variable_NN"?  Even if you can depend on the structure, you
> >> can't reasonably expect the kernel internal interface to not change.
> >This sort of change I think is, to put it bluntly, fucked.
>
> Terry's problem is that he is forgetting that non-kernel bits are part of
> the OS in unix.

I'm not forgetting this.  I'm fine with it.

> This means that some non-kernel bits have to know the format (and location)
> of some kernel data structures, sometimes.

This, I strongly disagree with.  Kernel internal structures are *internal*; that's the whole point in calling them that.  8-).

What this means is that they should be accessed via accessor functions instead of directly.  The best mechanism would be descriptor based, so that no structure change is externalized, *ever*.

> (While there are many cases where you can abstract this into a usable
> API, there are many other cases where you can't -- because what you
> want to get at is, indeed, exactly what the kernel is using.)

Then I would argue that you are cutting the interface in the wrong place: in the middle of the iterator instead of above it.

Let's take an example: the proc struct.

I want to iterate the processes on the system, and provide some information as a result of this iteration.  This may be because I'm the 'w' command, or it may be that I'm the 'ps' command, or it may be that I'm a to-be-written session manager.

Right now, this can be done one of several ways:

1)	open /dev/kmem via libkvm (and need to know the proc struct size and layout) and grovel

2)	popen() an existing command that does #1 already (and compound the difficulty of fixing #1)

3)	iterate /proc (and need to have it mounted)

Of these, the best programmatic interface is currently #3.  But unlike the current 'ps', it does not work on system dump images.
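
A minimal sketch of the descriptor-based accessor idea described above, assuming a made-up three-entry process table. Every name here (`procdesc_t`, `proc_first()`, `proc_next()`, `proc_get_pid()`, `proc_get_comm()`, `struct proc_internal`) is hypothetical and is not an existing FreeBSD interface. The point is only that the consumer holds a descriptor and calls accessors, so the structure layout can change without the consumer being rebuilt.

```c
#include <stddef.h>
#include <string.h>

/* "Kernel side": this layout is private and may change at any time.
 * ASSUMPTION: a toy structure, not the real struct proc. */
struct proc_internal {
    int  pid;
    int  uid;
    char comm[16];
};

/* ASSUMPTION: a fixed table stands in for the real process list. */
static const struct proc_internal proc_table[] = {
    { 1,  0,    "init" },
    { 42, 1001, "sh"   },
    { 77, 1001, "ps"   },
};
#define NPROC (sizeof(proc_table) / sizeof(proc_table[0]))

/* "Consumer side": all a caller ever sees is an integer descriptor. */
typedef int procdesc_t;
#define PROCDESC_END (-1)

/* Map a descriptor back to the private structure; NULL if invalid. */
static const struct proc_internal *proc_lookup(procdesc_t d)
{
    if (d < 0 || (size_t)d >= NPROC)
        return NULL;
    return &proc_table[d];
}

/* Iterator: first descriptor, or PROCDESC_END if there are no processes. */
procdesc_t proc_first(void)
{
    return NPROC > 0 ? 0 : PROCDESC_END;
}

/* Iterator: descriptor following d, or PROCDESC_END at the end. */
procdesc_t proc_next(procdesc_t d)
{
    if (d < 0 || (size_t)d + 1 >= NPROC)
        return PROCDESC_END;
    return d + 1;
}

/* Accessor: copy out the pid; returns 0 on success, -1 on a bad descriptor. */
int proc_get_pid(procdesc_t d, int *pid_out)
{
    const struct proc_internal *p = proc_lookup(d);
    if (p == NULL || pid_out == NULL)
        return -1;
    *pid_out = p->pid;
    return 0;
}

/* Accessor: copy out the command name, always NUL-terminated. */
int proc_get_comm(procdesc_t d, char *buf, size_t len)
{
    const struct proc_internal *p = proc_lookup(d);
    if (p == NULL || buf == NULL || len == 0)
        return -1;
    strncpy(buf, p->comm, len - 1);
    buf[len - 1] = '\0';
    return 0;
}
```

A 'ps'-like consumer would loop from `proc_first()` until `proc_next()` returns `PROCDESC_END`, calling only the accessors. Adding or reordering members of `struct proc_internal` then requires no change on the consumer side.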

Let's forget for the moment that this functionality belongs in the system dump analysis tool instead of the regular commands.

How do we make our putatively "new, improved" 'ps' command do these things?

The easiest way would be to associate the iterator interface, not with the 'ps' program (and duplicate the code in all programs like it), but to provide access via an iterator mechanism (as in #3)... only not to depend on a canonicalizing-data-exposure interface (like procfs).  You don't want to depend on data-exposure because it can only expose the data of a running kernel.  And libkvm is a non-canonicalizing-data-exposure interface.

So what do you do?

The obvious solution is to somehow make an association of an iterator interface with the image that creates the data.

You can do this with ELF by making the interface, effectively, a shared library which resides in the kernel image and exports data by descriptor.  The trick is to make the kernel loader not load sections with a section attribute that indicates they are this interface, and to make the dlopen() interface take a section type argument (actually, you change dlopen() to wrap another interface with the third argument -- otherwise you lose backward compatibility and have to do things like actually looking to the future -- ugh!).

Another alternative would be to *not* ignore what we did before: remove the system dump analysis capability from 'ps' and 'w' and ..., and put it in a system dump analyzer tool.  Then do conversion to approach #3.  This makes procfs mandatory (I really don't think this is such a bad thing, myself).

> This is further complicated by the fact that some utilities people have
> decided are part of the OS are not maintained by us.

These utilities must track system changes.  That's all there is to it.  If it becomes too burdensome, then FreeBSD must pay for the ability to make changes away from the mainstream by picking up maintenance.

Can you say a.out?

...I knew you could.
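
The ELF half of that idea might look roughly like the sketch below: the kernel loader skips any section carrying a special type, while a dump-analysis tool looks for exactly that type. It assumes the standard `<elf.h>` definitions (`Elf64_Shdr`, `SHT_LOUSER`, `SHF_ALLOC`). The `SHT_KERN_IFACE` value and both function names are invented for illustration, and 64-bit ELF headers are used only for convenience.

```c
#include <elf.h>
#include <stddef.h>

/* ASSUMPTION: a hypothetical section type, placed in the range ELF
 * reserves for application use (starting at SHT_LOUSER).  This is not
 * a real FreeBSD or ELF constant. */
#define SHT_KERN_IFACE (SHT_LOUSER + 0x1)

/* Kernel-loader side: returns 1 if the section should be mapped into
 * the running kernel, 0 if it should be left in the image only. */
int loader_should_load(const Elf64_Shdr *sh)
{
    if (sh->sh_type == SHT_KERN_IFACE)
        return 0;                       /* interface code: never loaded */
    return (sh->sh_flags & SHF_ALLOC) != 0;
}

/* Tool side: index of the first section of the given type at or after
 * 'start', or -1 if there is none.  A dlopen()-like wrapper taking a
 * section type argument could use this to locate the interface. */
long find_section_by_type(const Elf64_Shdr *tab, size_t n,
                          size_t start, Elf64_Word type)
{
    for (size_t i = start; i < n; i++) {
        if (tab[i].sh_type == type)
            return (long)i;
    }
    return -1;
}
```

Because the interface sections travel with the kernel image (or with a dump of it), the accessor code always matches the structure layout it has to decode, for a live kernel and a dump image alike.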

Maintenance of old GNU tools is the payment FreeBSD must render for not going to ELF with the rest of the world.

Similarly, maintenance of things which grovel /dev/kmem (whether or not via libkvm is irrelevant) falls to FreeBSD as well.  Which is the number one reason to eliminate the interface (number two is so you don't have to rebuild libkvm and everything which uses it each time a trivial kernel structure change takes place).

Back to the original poster:

There are two valid complaints you have, both of which require the core team to establish policy:

1)	How do I write portable kernel code on other platforms which can be run on FreeBSD?

	Right now, if you use kernel structures, you must define "KERNEL" (as someone else pointed out, this is a bogus name space incursion, and should be "_KERNEL").  This will alleviate your complaint, assuming you are building "live" code.

2)	How do I test kernel code in user space?

	The ability to test kernel code in user space is a development environment option.  The FreeBSD core team is responsible for decisions of policy regarding whether or not this option is to be offered by FreeBSD.

My personal take (since I'm not one of them) is that it's a desirable future goal to be able to develop and test kernel code in user space, and that to some extent, this will require a conversion to ELF to be able to externalize the kernel interfaces.

I would like to see a formal DDI/DKI definition, and I'd like to see that definition result in a user space test harness and transport layer.

But what does this buy you beyond LKM's?

It buys you the ability to do source level debugging.  Right now, source level kernel debugging requires two machines.

So the answer for right now is to use LKM's, put the code into the running kernel, and use another machine to get source debugging.

Hopefully, this subject is now exhausted.

Regards,
Terry Lambert
terry@lambert.org
---
Any opinions in this posting are my own and not those of my present or previous employers.