Date: Sat, 29 Apr 2000 10:24:52 -0400 (EDT)
From: Robert Watson <rwatson@FreeBSD.org>
To: Luigi Rizzo <luigi@info.iet.unipi.it>
Cc: Archie Cobbs <archie@whistle.com>, committers@FreeBSD.org, freebsd-net@FreeBSD.org
Subject: Re: Proposal for ethernet, bridging, netgraph
Message-ID: <Pine.NEB.3.96L.1000429095802.1309A-100000@fledge.watson.org>
In-Reply-To: <200004291010.MAA36762@info.iet.unipi.it>
On Sat, 29 Apr 2000, Luigi Rizzo wrote:

> Robert, thanks for your code. Will try to have a look at it during
> the weekend. I have/need a few comments if you don't mind:

Steve Gunn has also pointed out a couple of bugs which may or may not have
been inherited from the original code, with regard to the IP packet sanity
checking section, and a missing m_pullup().

> > a few weeks, so don't guarantee it will apply cleanly. I also threw in
> > my bridge0 interface support, which allows BPF to be used to monitor
> > packets
>
> this is interesting. I wonder, though, if your 'bridge0' interface
> is one for the whole machine, or you have one for each cluster
> of interfaces as defined in net.link.ether.bridge_cfg

You are correct -- I haven't had a chance to really understand the cluster
behavior, but really we should be allocating one bridge interface per
cluster.

What I'd actually be interested in is a more generalized approach, in
which virtual networks are defined, and actual physical and logical
resources are assigned to the virtual network. This would be particularly
useful from the perspective of the jail code--right now IP addresses are
assigned to jails, but imagine instead that each jail was assigned a
virtual interface, on which it could perform all the normal activities
(raw sockets, BPF, binding, etc) but limited by the scope of the resources
assigned to the virtual interface from outside the jail. In this way, we
could far more effectively and with greater flexibility define the network
resources for a jail.

To some extent, netgraph may already be able to offer us some of this
functionality--I wonder if it would make sense to push both the jail code,
and filtering capabilities, into netgraph nodes. I don't think netgraph is
currently capable of expressing ``you have only yea much of a packet; to
get more, ask for more,'' which would give you the same performance
problem with the multiple edX case.
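For context, cluster configuration in the bridge code of this era looks
roughly like the following (interface names and cluster numbers here are
purely illustrative; check bridge(4) for the exact syntax on your version):

```shell
# Illustrative sketch: assign dc0+dc1 to cluster 1 and dc2+dc3 to
# cluster 2, then enable bridging. Under the per-cluster proposal
# above, each cluster would get its own bridgeN monitoring interface.
sysctl -w net.link.ether.bridge_cfg="dc0:1,dc1:1,dc2:2,dc3:2"
sysctl -w net.link.ether.bridge=1
```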
> > o IPFW divert/fwd are not implemented
> >
> > These are both troubled due to the code paths associated with bridging
> > vs. packet forwarding vs. local delivery, and the possibility of
> > duplicate delivery. I'm beginning to suspect that the real solution
> > here is the
>
> because divert/fwd (at least as defined now) is a functionality above
> level 2, i think the easy (and maybe most correct) way to implement
> it is to interpret divert/fwd as a DENY for bridged packets, and
> when the same packet gets to ip_input() do the required action.
> The tricky part is that when a bridged packet matches a divert/fwd
> rule, its tag should be changed by bdg_forward() so that it is
> passed to ether_input() even if it does not have a local addr.

That was my conclusion also. However, part of the problem here is that
unlike IP forwarding layer stuff, you don't ``know'' if the packet was
intended to pass to a node on the other side or not, and as such if divert
were used to perform a packet transformation (say, a proxy of some sort),
that would fairly seriously break the semantic that both sides of the
bridge see similar things, as well as making it unclear when the proxy
should be applied without a priori knowledge. I.e., you could no longer
just say, ``Proxy all TCP connections,'' because it's not clear whether a
SYN packet is destined for two nodes on the same segment, or for two nodes
separated by the bridge.

> > o IPFW DUMMYNET still implicit ACCEPT
> >
> > When using IPFW and DUMMYNET with BRIDGE, PIPE commands implicitly
> > ACCEPT after the packet has suffered from traffic shaping. This is
> > bad, should be in our ERRATA for various releases, and probably fixed.
> > If fixed, it should be documented as such.
>
> you are a committer, aren't you :)

I'm more interested in the ``fix'' solution, and haven't tried to tackle
it yet as I wasn't clear on the reasons for not implementing it.
Were there specific problems doing so that I have not yet come across, or
is it just a ``change the gotos appropriately'' kind of arrangement? I'm
not familiar with the ERRATA process but will inquire.

> > BTW, on the DUMMYNET front, my feeling is that rather than using mbuf
> > queueing routines for managing queues, it would be better/easier to
> > use some sort of DUMMYNET queue structure that maintained meta-data,
> > as
>
> this was done to avoid changing the interface of ip_input() and
> ip_output(), which i really don't think we want to change.
> Yes we can replace the fake mbuf with some other data structure,
> but other than making the code a bit more readable i don't think
> it fundamentally changes things.

Yes -- it's not a big functional change, but I think it would sort out a
few of the layering problems I experienced when cleaning up the
bridge/ipfw code. I.e., when the packet pops out of a pipe, being able to
determine it's an IP packet based on the ethernet type and pass it back
into the ipfw code based on that knowledge. If the mbuf packet header
pointer is set to the ethernet header (I suspect not) then that could also
be used.

> > On an unrelated note, it would be a good idea if we did real spanning
> > tree stuff--I have a copy of the appropriate IEEE spec, but haven't
> > had a chance to review it for complexity/et al as yet due to travel.
>
> the spanning tree is not much complex, it is just boring code to
> write and debug...

That was my impression. :-)

Any thoughts on filtering of outgoing packets to bridged ethernet
segments? It depends, I suppose, conceptually, on whether you think of the
bridged segment as a number of adjacent ethernet segments, each
potentially with its own network configuration (IP, IPX, et al), or if you
think of it as, in effect, one segment. I.e., your IP address would be a
property of bridge0, built from dc0 and dc1.
In the first case, you might expect bridge filtering to occur only on the
copies of the packet bridged from the segment where it was sent out onto
other segments. In the second case, you might expect outgoing packets to
be filtered when heading to any segment.

  Robert N M Watson

robert@fledge.watson.org              http://www.watson.org/~robert/
PGP key fingerprint: AF B5 5F FF A6 4A 79 37  ED 5F 55 E9 58 04 6A B1
TIS Labs at Network Associates, Safeport Network Services