Date:      Mon, 10 Mar 2003 07:33:27 -0800
From:      Terry Lambert <tlambert2@mindspring.com>
To:        Mark Murray <mark@grondar.org>
Cc:        Damien Tougas <damien@tougas.net>, Andrew Boothman <andrew@cream.org>, John Baldwin <jhb@FreeBSD.ORG>, freebsd-chat@FreeBSD.ORG
Subject:   Re: A question about kernel modules
Message-ID:  <3E6CB047.5D517419@mindspring.com>
References:  <200303080856.h288ubIg021994@grimreaper.grondar.org>

Mark Murray wrote:
> Terry Lambert writes:
> > > configuring my kernel. If there are good reasons why I should not use them
> > > (or not use them in specific situations), I would be interested in knowing
> > > what those are.
> >
> > The dependency tracking sucks, and so does demand-loading.  If
> > you look at the module code with the idea of loading *at least
> > one* ethernet driver before you could load IP, and having to
> > load IP before you could load TCP, and then look at the code to
> > see what this would take, you will be enlightened.
> 
> That is not the way you would do it. Each Ethernet module would
> require IP, which in turn would co-require TCP and co-require UDP
> etc.  If you tried to load IP with no ethernet driver, you'd get
> the loopback device, and TCP/UDP etc.

You've sort of got that backward, I think.  I can have all sorts
of protocols that *aren't* TCP that run on top of IP, and I can
have all sorts of protocols that aren't IP that run over an
ethernet interface.

The ethernet interface might just be a promiscuous mode interface,
with a BPF sitting on top of it, and nothing else (for example),
or that Novell answer to the lack of a sliding window in SPX,
TCP/IPX, might be an option, too.

The point is that it's all about producer/consumer relationships,
and consumers need producers, but not the other way around.
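
A minimal sketch of that producer/consumer rule, assuming a hypothetical
`struct module` and `module_load()` (these are illustration names, not the
real kernel linker interfaces). Loading a consumer pulls in its producers
first; loading a producer pulls in nothing above it. The sketch assumes the
dependency graph has no cycles.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEPS 4

/* Hypothetical module record; not the FreeBSD linker's real structure. */
struct module {
    const char    *name;
    struct module *needs[MAX_DEPS]; /* producers this consumer requires */
    int            loaded;
};

/*
 * Load a module after loading every producer it depends on.
 * Returns the number of modules newly loaded by this call.
 * Assumes the dependency graph is acyclic.
 */
static int
module_load(struct module *m)
{
    int count = 0;

    assert(m != NULL);
    if (m->loaded)
        return 0;
    for (size_t i = 0; i < MAX_DEPS && m->needs[i] != NULL; i++)
        count += module_load(m->needs[i]);
    m->loaded = 1;
    return count + 1;
}
```

With an ethernet driver as producer and BPF as its only consumer, loading
BPF brings in the driver and nothing else; IP and TCP stay out until some
consumer asks for them.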


> Other tweaks would need to be done if you need IPX instead of IP.

Yes.  The direct calls to ip_output() would need to be indirected
through the PCB, so it could be properly stacked.  That's just a
minimal thing, though.
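
A rough sketch of what "indirected through the PCB" could look like. The
`struct pcb` here, its `output` member, and the two stub routines are all
hypothetical stand-ins; the real PCB and ip_output() have different
layouts and signatures. The only point is that the transport calls a
pointer set at attach time instead of naming the lower layer directly.

```c
#include <assert.h>
#include <stddef.h>

struct pcb;

/* Hypothetical hook type for the lower layer's send routine. */
typedef int (*output_fn)(struct pcb *, const void *, size_t);

enum { PROTO_NONE = 0, PROTO_IP = 1, PROTO_IPX = 2 };

/* Hypothetical, heavily reduced protocol control block. */
struct pcb {
    output_fn output;     /* set when the connection is attached */
    int       last_proto; /* records which stub ran, for illustration */
};

/* Stand-in for an IP output routine; just reports the length "sent". */
static int
ip_output_stub(struct pcb *p, const void *buf, size_t len)
{
    (void)buf;
    p->last_proto = PROTO_IP;
    return (int)len;
}

/* Stand-in for an IPX output routine. */
static int
ipx_output_stub(struct pcb *p, const void *buf, size_t len)
{
    (void)buf;
    p->last_proto = PROTO_IPX;
    return (int)len;
}

/* The transport never names the network layer; it goes through the PCB. */
static int
transport_send(struct pcb *p, const void *buf, size_t len)
{
    assert(p != NULL && p->output != NULL);
    return p->output(p, buf, len);
}
```

The same transport_send() then runs over either network layer, depending
only on which routine was stored in the PCB.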


> > There are also certain options which cause structure sizes to
> > change, which are associated with particular things.  As an
> > example, the IPSEC stuff can't really be modularized, because
> > there's per connection state that has to be there for it to
> > be happy.
> 
> Possibly.

I can demonstrate this one.  The real problem with IPSEC in the
TCP/IPv4 case is that it was poorly retrofitted from the KAME code
into the IPv4 stack.  This caused a number of problems, one of
which was that the total number of connections you could support
shrank *a lot* when you enabled IPSEC, even if none of your
connections were using it (we had enabled it for VPN support).


> > Another issue having to do with structure size is that if the
> > module you are trying to load was not compiled with the same
> > options as the kernel you are trying to load it into, even if
> > all the version stuff matches, including the proposed new
> > versioning data, the structure sizes expected by the module
> > and by the kernel can be different.  A good example of this is
> > something like "WITNESS" or "INVARIANTS", etc..
> 
> Hmm. Maybe it's time for an export of certain compile-time options.

You mean as sysctl's?  Or you mean in an exported options file?

I've always thought having the config available in the kernel (it's
there if the option is used) is probably enough of an "export",
if you are using it at compile time.  You can even get enough out
of it to check at runtime, if you need to, though the data requires
a little too much processing for me to want to use the code in a
module.

What Warner Losh was suggesting the other day, about versioning the
kernel API, seems a lot more to my taste.  One of the things Poul
did recently, bringing in the C99 syntax requirements, is, I think,
a step in the direction of decoupling.  If it's going to become a
dependency anyway, might as well do it right (for example, it's
possible to get rid of one level of indirection in the VFS stack
descriptors, using that approach, though you have to put it back
in for a stacking layer like Heidemann's network transport layer,
since peer machines may not have the same descriptor structure
ordering).

My personal preference is Julian's repeated suggestions that the
structure sizes not vary based on options.  That's still pretty
hard to justify for something like "WITNESS", though, because a
lot of the undesirable overhead isn't hookable or simply testable.
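
To illustrate the size mismatch being discussed, here is a small sketch.
The two lock structures, `struct abi_stamp`, and `abi_compatible()` are
invented for the example; they are not kernel interfaces and not the
proposed versioning data. The idea is only that a debugging option which
adds members changes sizeof, and a module built without it disagrees with
the kernel about the layout even when version numbers match.

```c
#include <assert.h>
#include <stddef.h>

/* Layout as a build without the debugging option would see it. */
struct lock_plain {
    int owner;
};

/* Layout as a build with a WITNESS-style option would see it. */
struct lock_debug {
    int         owner;
    const char *file;  /* extra bookkeeping added by the option */
    int         line;
};

/* Hypothetical stamp each side records at its own compile time. */
struct abi_stamp {
    size_t lock_size;
};

/* Nonzero when kernel and module agree on the structure size. */
static int
abi_compatible(const struct abi_stamp *kernel, const struct abi_stamp *module)
{
    assert(kernel != NULL && module != NULL);
    return kernel->lock_size == module->lock_size;
}
```

If the structure always carried the extra members, whether or not the
option was enabled, both stamps would match; the cost is the unused space
in non-debugging builds.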


> > The normal performance cost is all interfaces being indirected
> > through a pointer.  For most interfaces, this overhead is there
> > anyway, so that all access is uniform.  For other things, like
> > schedulers, for example, the functions are linked directly, so
> > they have to be resolved at compile time.
> 
> Certain things (like the scheduler) would be harder to make into
> modules. This is true.

Actually, I think it would be pretty trivial.  Jeff has talked about
it, and it's probably less than a day to hack it up.  You would need
to keep at least one scheduler static (I suggest the standard one),
and then on load, link a struct containing a string name and version
and function descriptor list, onto a linked list of schedulers.  Then
to pick one, you set the name into the environment in the loader, and
if it doesn't match anything, you get the default, and if it does,
then you get whichever one you selected.  It'd be really easy.
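
The scheme just described can be sketched roughly as follows. Every name
here (`struct sched_ops`, `sched_register()`, `sched_select()`, the "alt"
scheduler) is hypothetical, and a single function pointer stands in for
the whole function descriptor list; in a real kernel the wanted name
would come from the loader environment rather than a plain argument.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-scheduler record: name, version, entry points. */
struct sched_ops {
    const char       *name;
    int               version;
    int             (*pick_next)(void); /* stand-in for the descriptor list */
    struct sched_ops *next;
};

static int default_pick(void) { return 0; }
static int alt_pick(void)     { return 1; }

/* One scheduler is always linked in statically. */
static struct sched_ops sched_default = { "default", 1, default_pick, NULL };

/* An example loadable scheduler, registered at module load time. */
static struct sched_ops sched_alt = { "alt", 1, alt_pick, NULL };

static struct sched_ops *sched_list = &sched_default;

/* Called on module load: push the scheduler onto the linked list. */
static void
sched_register(struct sched_ops *s)
{
    assert(s != NULL && s->name != NULL);
    s->next = sched_list;
    sched_list = s;
}

/* Pick by name; anything unknown (or no name) falls back to the default. */
static struct sched_ops *
sched_select(const char *wanted)
{
    if (wanted != NULL) {
        for (struct sched_ops *s = sched_list; s != NULL; s = s->next)
            if (strcmp(s->name, wanted) == 0)
                return s;
    }
    return &sched_default;
}
```

After sched_register(&sched_alt), asking for "alt" returns that scheduler,
while a misspelled or missing name quietly yields the static default.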

At this point, I don't think anyone is really equipped to perform
scheduler benchmarking; the closest is the "worldstone" ("make world")
that people like Bruce and Jeff have used, and it's not really a
good example of dynamic load.  Without a benchmark, it's probably too
controversial to change the scheduler entry points into pointer
indirects instead of pre-resolved function calls, because you'd never
be able to sufficiently determine the overhead to everyone's idea of
satisfaction.  8-(.

-- Terry




