From: Matthew Dillon
Date: Fri, 25 Feb 2011 14:59:44 -0800 (PST)
Cc: FreeBSD-STABLE Mailing List
Subject: Re: How to bind a static ether address to bridge?
Message-Id: <201102252259.p1PMxiB5019790@apollo.backplane.com>
References: <09E86832-F5D9-4415-83A0-FEF59693FE02@gsoft.com.au>

If you can swing a routed network that will definitely have the fewest complications.

For a switched network if_bridge and ARP have to be integrated, something I just finished doing in DragonFly, so that all member interfaces of the bridge use *only* the bridge's MAC for all transactions, including ARP transactions, whether they require forwarding through the bridge or not.
The bridge has its own internal forwarding table, and a great deal of confusion occurs if the normal ARP code is trying to tie into individual interfaces instead of just the bridge interface, for *ANY* member of the bridge, not just the first member of the bridge.

Some of the problems you are likely to hit using if_bridge:

* An ARP response flows in on member interface A with an ether destination of member interface B. The OS decides to record the ARP route as coming from interface B (when it's actually coming from interface A), while the bridge internally records the proper forwarding (A). Fireworks ensue.

* ARP responses target member interfaces which are part of the spanning tree protocol (when you have redundant links) and which then wind up being put into the blocking state by the spanning tree protocol.

The if_bridge code in FreeBSD sets the bridge's MAC to be the same as the first added interface, which is usually your LAN ethernet port. This will help a bit, just make sure that it *IS* your LAN ethernet port and that the spanning tree protocol is *NOT* turned on for that port. However, other member interfaces (usually TAPs if you are using something like OpenVPN) will have different MAC addresses and that will cause confusion.

It might be possible to work around both issues by setting the MAC for *ALL* member interfaces to be the same as the bridge MAC, but I don't know. I gave up trying to do that in DFly and instead modified the ARP code to always use the bridge MAC for any interface which is a member of a bridge. That appears to have worked quite well.

My home network (using DragonFly) is using if_bridge to a colocated box, ether bridging a class C over three WANs via OpenVPN, with the related TAP interfaces and the LAN interface as members of the bridge. The bridge is set up with the spanning tree protocol turned on for the three TAP interfaces and with bonding turned on for two of the TAP interfaces.

But that's with DFly (and I just finished the work two days ago).
If something similar cannot be done w/FreeBSD then I recommend porting the changes from DFly over to FreeBSD's bridging and ARP modules. It was a big headache but once I cleared up the ARP confusion things just started magically working.

Other caveats:

* TAP and BRIDGE interfaces are assigned a nearly random MAC address when they are created (in FreeBSD the bridge sets its MAC to the first member interface so that is at least ok if you always add your LAN as the first member interface, however the other member interfaces aren't so lucky). Rebooting the machine containing the bridge or destroying and rebuilding the bridge can create total and absolute havoc on your network because the rest of your switching infrastructure and machines will have the old MACs cached. The partial solution is taking on the MAC address of the LAN interface, which FreeBSD's bridging code does, and it might be possible to also set the other member interfaces to that same MAC (but I don't know if that will work). If not then this is almost a non-solvable problem short of making the ARP module more aware of the bridge.

* If using redundant links without bonding support in the bridge code the bridge itself will get confused when the topology changes, though if it is a simple topology the bridge should be able to start forwarding to the backup link even though its internal forwarding table is messed up. The concept of a 'backup' link is a bit of a hack in the STP code (just as the concept of 'bonding' is a bit of a hack), so how well it works will depend on a lot of different factors. The idea of a 'backup' link is to be able to continue to switch packets when only one path is available even if that path has not been completely resolved through the STP protocol.

* ARP only works because *EVERYONE* uses the same timeout. Futzing around with member associations on the bridge will cause the bridge to forget.
The bridge should theoretically broadcast unicast packets for which it doesn't have a forwarding entry but... well, it is still possible for machines to get confused. When working on your setup you may have to 'arp -d -a' on one or more machines multiple times to force them to re-arp and cause all your intermediate ethernet switches to relearn the new MACs. Remember that your ethernet switches can get just as confused as your actual machines! 'why can't I see that packet going over my LAN, both my machines have the correct ARP entries!!!!'... but the little hardware ether switch between them might not.

* A multi-homed network can sometimes have routing loops, particularly when you try to use an ethernet bridge. For example, let's say you have a machine on your home network using address IPA which sends a packet to a machine out in the world over the wrong default route. The RESPONSE to that packet, sent *to* your machine, if it isn't blocked by edge routers (due to the source address being wrong for that edge), will come back through a DIFFERENT bridge member. In a switched network, if the packet was destined to a machine directly on the other side of the bridge which is part of the switched network, the machines on the other side of the bridge may end up believing that IPA is accessed via the other direction instead of through the VPN/bridge. Needless to say, a response routed back to IPA through the remote side's default route instead of through the VPN directly to IPA may get blackholed or, worse, may end up creating a loop. The machines on the bridged network will get confused as to which direction to go to get to the machine with IPA.

So, lots of horror is possible here. If you can use a routed network instead of a bridged network that's really what you want to do.

On the other hand, routed networks cannot handle channel bonding and redundancy (even if using BGP or OSPF for your internal network) nearly as well as switched networks can.
If the bridge interface and ARP code are brought up to snuff, a bridge actually will do the job quite nicely.

One last note on using a switched network and something like a VPN. You will end up with multiple default routes and need to use IPFW2 to make sure that packets with source IPs for various WAN interfaces are forwarded through the correct default route. The 'master' default route for the machine would normally be set to the default route for the bridged network.

Throwing NAT on top of everything else adds more fun to the pot. Sometimes there isn't a clear distinction as to when a packet goes from being 'switched' to being 'routed', particularly when something like NAT bounces a packet back out the same interface it came in on. Essentially any address translation which occurs (NAT) takes the packet out of the switching path and places it into the routing path. The IPFW and PF tie-ins have to do their job so the rest of the system knows whether the packet filter 'ate' the packet (turning it into a routed packet), or simply filtered and returned it (leaving it as a switched packet).

-Matt
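
The first ARP problem listed above (a reply arriving on member A but addressed to member B) can be made concrete with a toy Python model. This is only a sketch under loud assumptions: it is not DragonFly or FreeBSD kernel code, and the names ToyBridge and simulate, the interface names em0/tap0/bridge0, and all addresses are invented for the illustration. It compares per-member MACs with the "always use the bridge MAC for ARP" approach.

```python
from dataclasses import dataclass, field


@dataclass
class ToyBridge:
    """Toy model of a bridge plus the host ARP table (hypothetical, not kernel code)."""
    members: dict              # member ifname -> that member's own MAC
    bridge_mac: str            # MAC of the bridge interface itself
    arp_uses_bridge_mac: bool  # True models ARP always using the bridge MAC
    fdb: dict = field(default_factory=dict)  # learned source MAC -> ingress member
    arp: dict = field(default_factory=dict)  # peer IP -> interface ARP attributes it to

    def advertised_mac(self, ifname):
        """MAC that a member interface puts into its own ARP traffic."""
        return self.bridge_mac if self.arp_uses_bridge_mac else self.members[ifname]

    def receive_arp_reply(self, ingress, src_mac, dst_mac, peer_ip):
        # The bridge learns the sender against the real ingress port.
        self.fdb[src_mac] = ingress
        # The ARP layer attributes the reply to whichever local interface
        # owns the destination MAC, which need not be the ingress port.
        if dst_mac == self.bridge_mac:
            owner = "bridge0"
        else:
            owner = next(n for n, m in self.members.items() if m == dst_mac)
        self.arp[peer_ip] = owner


def simulate(arp_uses_bridge_mac):
    """Return (interface ARP recorded, port the bridge learned) for one reply."""
    br = ToyBridge(
        members={"em0": "02:00:00:00:00:0a", "tap0": "02:00:00:00:00:0b"},
        bridge_mac="02:00:00:00:00:01",
        arp_uses_bridge_mac=arp_uses_bridge_mac,
    )
    peer_mac, peer_ip = "02:00:00:00:00:99", "192.0.2.9"
    # The request went out carrying tap0's advertised MAC; the reply comes back via em0.
    br.receive_arp_reply("em0", peer_mac, br.advertised_mac("tap0"), peer_ip)
    return br.arp[peer_ip], br.fdb[peer_mac]


if __name__ == "__main__":
    print("per-member MACs:", simulate(False))
    print("bridge MAC only:", simulate(True))
```

With per-member MACs the model's ARP table points at tap0 while the forwarding table points at em0, which is the disagreement described in the first bullet. With the bridge MAC used everywhere, ARP only ever refers to the bridge and the forwarding table alone decides the egress port.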