From: John Nielsen
To: FreeBSD stable
Subject: lagg(4) + VLAN + if_bridge(4) vs. ARP
Date: Fri, 8 Jan 2016 14:52:45 -0700

Hi all-

I'm trying to troubleshoot a problem on a machine running recent
10-STABLE. The machine has two physical interfaces and hosts a number of
services, including a bhyve VM (FreeBSD 10.2-RELEASE) acting as a
network appliance.
The VM has three interfaces: external, internal-trusted and
internal-guest. Each VM interface is plumbed to a TAP device on the
host, which in turn is a member of a bridge. Here is the current
(working) setup:

External <--------> Host <-> Host    <-> Host  <-> VM
port                re0      bridge2     tap21     vtnet1

Switch <-> Host <-> Host  <-> Host    <-> Host  <-> VM
port       em0      em0.2     bridge0     tap20     vtnet0
            ^
            \-----> Host    <-> Host    <-> Host  <-> VM
                    em0.103     bridge1     tap22     vtnet2

Since there is not much external traffic, most of the bandwidth
potential of re0 is wasted while em0 is sometimes busy. So I'd like to
move to a LAGG setup, as below:

External Trusted Untrusted
 VLAN 99 VLAN 2   VLAN 103
    |       |       |
     \      |      /
    /---------------\         /------> Host     <--> Host    <-> Host  <-> VM
    |    switch     |         |        lagg0.99      bridge2     tap21     vtnet1
    \---------------/         |
      |          |            |  /---> Host     <--> Host    <-> Host  <-> VM
      |          v            |  |     lagg0.2       bridge0     tap20     vtnet0
      |         Host          v  v
       \        re0  <----->  Host <-> Host     <--> Host    <-> Host  <-> VM
        \                     lagg0    lagg0.103     bridge1     tap22     vtnet2
         \-> Host              ^
             em0 <-------------/

In other words: plug the external port into the switch, create a new
"external" VLAN, add both em0 and re0 to a new LAGG, and create VLAN
child interfaces off of that.

I tried the new setup today and it worked, except that the VM no longer
received ARP replies from the external network. Using tcpdump on the
host's lagg0.99, I saw the ARP request from the VM go out and an ARP
reply come back, but that's as far as it went. I did not see the ARP
reply on the host's bridge2 or tap21 interfaces, and the VM never
received it.

I didn't make any changes on the VM, and all I changed on the host was
the networking via /etc/rc.conf. The host does run ipfw, but I verified
that none of the rules reference any stale interface names.
I have also previously disabled all firewalling of bridged packets:

net.link.bridge.pfil_onlyip=0
net.link.bridge.pfil_member=0
net.link.bridge.pfil_bridge=0

I also verified that "ifconfig bridge2 addr" contained the MAC addresses
of both the VM and the external device on the correct ports.

So in the LAGG setup, why aren't the ARP replies going across bridge2 to
the VM? Any ideas on how to narrow down the cause would be appreciated.

Thanks!

-John Nielsen
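
For reference, the LAGG layout described above could be expressed in
/etc/rc.conf roughly as follows. This is a minimal sketch, not the
actual configuration from the host: only the interface names and VLAN
tags come from the description above, while the LACP protocol choice,
the cloning of the TAP devices at boot, and the bare "up" settings are
assumptions made for illustration.

```shell
# Hypothetical /etc/rc.conf fragment (a sketch, not the host's real config).
# Interface names and VLAN tags follow the diagram; everything else is assumed.

# Create the aggregate, the three bridges, and the three TAP devices at boot.
cloned_interfaces="lagg0 bridge0 bridge1 bridge2 tap20 tap21 tap22"

# The physical ports are brought up so they can carry lagg traffic.
ifconfig_em0="up"
ifconfig_re0="up"

# ASSUMPTION: LACP. The original message does not say which laggproto is used.
ifconfig_lagg0="laggproto lacp laggport em0 laggport re0 up"

# VLAN children of lagg0: creates lagg0.2, lagg0.99 and lagg0.103.
vlans_lagg0="2 99 103"
# rc.conf variable names replace the "." in the interface name with "_".
ifconfig_lagg0_2="up"
ifconfig_lagg0_99="up"
ifconfig_lagg0_103="up"

# One bridge per VLAN, each joined to the matching TAP device for the VM.
ifconfig_bridge0="addm lagg0.2 addm tap20 up"     # trusted   -> vtnet0
ifconfig_bridge1="addm lagg0.103 addm tap22 up"   # untrusted -> vtnet2
ifconfig_bridge2="addm lagg0.99 addm tap21 up"    # external  -> vtnet1
```

With a layout like this, the external path runs lagg0 -> lagg0.99 ->
bridge2 -> tap21, which are the points where tcpdump was used above to
see how far the ARP reply travels.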