Date: Sat, 29 Apr 2000 10:24:52 -0400 (EDT)
From: Robert Watson <rwatson@FreeBSD.org>
To: Luigi Rizzo <luigi@info.iet.unipi.it>
Cc: Archie Cobbs <archie@whistle.com>, committers@FreeBSD.org, freebsd-net@FreeBSD.org
Subject: Re: Proposal for ethernet, bridging, netgraph
Message-ID: <Pine.NEB.3.96L.1000429095802.1309A-100000@fledge.watson.org>
In-Reply-To: <200004291010.MAA36762@info.iet.unipi.it>
On Sat, 29 Apr 2000, Luigi Rizzo wrote:

> Robert, thanks for your code. Will try to have a look at it during
> the weekend. I have/need a few comments if you don't mind:

Steve Gunn has also pointed out a couple of bugs which may or may not have
been inherited from the original code, with regard to the IP packet sanity
checking section, and a missing m_pullup().

> > a few weeks, so don't guarantee it will apply cleanly. I also threw in
> > my bridge0 interface support, which allows BPF to be used to monitor
> > packets
>
> this is interesting. I wonder, though, if your 'bridge0' interface
> is one for the whole machine, or you have one for each cluster
> of interfaces as defined in net.link.ether.bridge_cfg

You are correct -- I haven't had a chance to really understand the cluster
behavior, but really we should be allocating one bridge interface per
cluster.

What I'd actually be interested in is a more generalized approach, in
which virtual networks are defined, and actual physical and logical
resources are assigned to the virtual network. This would be particularly
useful from the perspective of the jail code--right now IP addresses are
assigned to jails, but imagine instead that each jail was assigned a
virtual interface, on which it could perform all the normal activities
(raw sockets, BPF, binding, etc) but limited by the scope of the resources
assigned to the virtual interface from outside the jail. In this way, we
could far more effectively and with greater flexibility define the network
resources for a jail.

To some extent, netgraph may already be able to offer us some of this
functionality--I wonder if it would make sense to push both the jail code,
and filtering capabilities, into netgraph nodes. I don't think netgraph is
currently capable of expressing ``you have only yea much of a packet; to
get more, ask for more,'' which would give you the same performance
problem with the multiple edX case.
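For context, cluster configuration in the bridge code of this era looks
roughly like the following (interface names and cluster numbers here are
purely illustrative; check bridge(4) for the exact syntax on your version):

```shell
# Illustrative sketch: assign dc0+dc1 to cluster 1 and dc2+dc3 to
# cluster 2, then enable bridging. Under the per-cluster proposal
# above, each cluster would get its own bridgeN monitoring interface.
sysctl -w net.link.ether.bridge_cfg="dc0:1,dc1:1,dc2:2,dc3:2"
sysctl -w net.link.ether.bridge=1
```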
> > o IPFW divert/fwd are not implemented
> >
> > These are both troubled due to the code paths associated with bridging
> > vs. packet forwarding vs. local delivery, and the possibility of
> > duplicate delivery. I'm beginning to suspect that the real solution
> > here is the
>
> because divert/fwd (at least as defined now) is a functionality above
> level 2, i think the easy (and maybe most correct) way to implement
> it is to interpret divert/fwd as a DENY for bridged packets, and
> when the same packet gets to ip_input() do the required action.
> The tricky part is that when a bridged packet matches a divert/fwd
> rule, its tag should be changed by bdg_forward() so that it is
> passed to ether_input() even if it does not have a local addr.

That was my conclusion also. However, part of the problem here is that
unlike IP forwarding layer stuff, you don't ``know'' if the packet was
intended to pass to a node on the other side or not, and as such if divert
were used to perform a packet transformation (say, a proxy of some sort),
that would fairly seriously break the semantic that both sides of the
bridge see similar things, as well as making it unclear when the proxy
should be applied without a priori knowledge. I.e., you could no longer
just say, ``Proxy all TCP connections,'' because it's not clear whether a
SYN packet is destined for two nodes on the same segment, or for two nodes
separated by the bridge.

> > o IPFW DUMMYNET still implicit ACCEPT
> >
> > When using IPFW and DUMMYNET with BRIDGE, PIPE commands implicitly
> > ACCEPT after the packet has suffered from traffic shaping. This is
> > bad, should be in our ERRATA for various releases, and probably fixed.
> > If fixed, it should be documented as such.
>
> you are a committer, aren't you :)

I'm more interested in the ``fix'' solution, and haven't tried to tackle
it yet as I wasn't clear on the reasons for not implementing it.
Were there specific problems doing so that I have not yet come across, or
is it just a ``change the gotos appropriately'' kind of arrangement? I'm
not familiar with the ERRATA process but will inquire.

> > BTW, on the DUMMYNET front, my feeling is that rather than using mbuf
> > queueing routines for managing queues, it would be better/easier to
> > use some sort of DUMMYNET queue structure that maintained meta-data,
> > as
>
> this was done to avoid changing the interface of ip_input() and
> ip_output(), which i really don't think we want to change.
> Yes we can replace the fake mbuf with some other data structure,
> but other than making the code a bit more readable i don't think
> it fundamentally changes things.

Yes -- it's not a big functional change, but I think it would sort out a
few of the layering problems I experienced when cleaning up the
bridge/ipfw code. I.e., when the packet pops out of a pipe, being able to
determine it's an IP packet based on the ethernet type and pass it back
into the ipfw code based on that knowledge. If the mbuf packet header
pointer is set to the ethernet header (I suspect not) then that could also
be used.

> > On an unrelated note, it would be a good idea if we did real spanning
> > tree stuff--I have a copy of the appropriate IEEE spec, but haven't
> > had a chance to review it for complexity/et al as yet due to travel.
>
> the spanning tree is not much complex, it is just boring code to
> write and debug...

That was my impression. :-)

Any thoughts on filtering of outgoing packets to bridged ethernet
segments? It depends, I suppose, conceptually, on whether you think of the
bridged segment as a number of adjacent ethernet segments, each
potentially with its own network configuration (IP, IPX, et al), or if you
think of it as, in effect, one segment. I.e., your IP address would be a
property of bridge0, built from dc0 and dc1.
In the first case, you might expect bridge filtering to occur only on the
copies of the packet bridged from the segment where it was sent out onto
other segments. In the second case, you might expect outgoing packets to
be filtered when heading to any segment.

  Robert N M Watson

robert@fledge.watson.org              http://www.watson.org/~robert/
PGP key fingerprint: AF B5 5F FF A6 4A 79 37  ED 5F 55 E9 58 04 6A B1
TIS Labs at Network Associates, Safeport Network Services