From owner-freebsd-net@freebsd.org  Thu Jun 23 11:08:03 2016
Return-Path: <owner-freebsd-net@freebsd.org>
Delivered-To: freebsd-net@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org
 [IPv6:2001:1900:2254:206a::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id CF4C1B73241
 for <freebsd-net@mailman.ysv.freebsd.org>;
 Thu, 23 Jun 2016 11:08:03 +0000 (UTC)
 (envelope-from kpielorz_lst@tdx.co.uk)
Received: from smtp.krpservers.com (smtp.krpservers.com [62.13.128.145])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (Client CN "*.krpservers.com", Issuer "RapidSSL SHA256 CA - G3" (verified OK))
 by mx1.freebsd.org (Postfix) with ESMTPS id 7BD5D2A73
 for <freebsd-net@FreeBSD.org>; Thu, 23 Jun 2016 11:08:03 +0000 (UTC)
 (envelope-from kpielorz_lst@tdx.co.uk)
Received: from [10.12.30.106] (vpn01-01.tdx.co.uk [62.13.130.213] (may be
 forged)) (authenticated bits=0)
 by smtp.krpservers.com (8.15.2/8.15.2) with ESMTPSA id u5NArtit054794
 (version=TLSv1 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO)
 for <freebsd-net@FreeBSD.org>; Thu, 23 Jun 2016 11:53:57 +0100 (BST)
 (envelope-from kpielorz_lst@tdx.co.uk)
Date: Thu, 23 Jun 2016 11:53:47 +0100
From: Karl Pielorz <kpielorz_lst@tdx.co.uk>
To: freebsd-net@FreeBSD.org
Subject: Problem with VLAN config and traffic after 10.1-R -> 10.3-R-p5
 Upgrade?
Message-ID: <2ED5D9FEB55641BF734C14F3@[10.12.30.106]>
X-Mailer: Mulberry/4.0.8 (Win32)
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.22
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net/>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-net>,
 <mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 23 Jun 2016 11:08:03 -0000


Hi,

We're in the process of updating our boxes from 10.1 to 10.3. This has gone 
OK for the simpler cases - but I seem to have found a couple of issues with 
the way 10.3 handles both configuring VLANs and actual traffic on VLANs.


On our box to be upgraded, our /etc/rc.conf has:

cloned_interfaces="lagg0 lagg1 lagg1.30 lagg1.35"
ifconfig_bge0="up"
ifconfig_bge1="up"
ifconfig_lagg0="laggproto failover laggport bge0 laggport bge1 172.16.50.1 netmask 255.255.255.0"

ifconfig_em3="mtu 1504 up"
ifconfig_em0="mtu 1504 up"
ifconfig_lagg1="laggproto failover laggport em3 laggport em0 192.168.0.2 netmask 255.255.255.0 mtu 1504"
ifconfig_lagg1_30="inet 192.168.200.2 netmask 255.255.255.0 mtu 1500"
ifconfig_lagg1_35="inet 192.168.210.2 netmask 255.255.255.0 mtu 1500"


The mtu 'hackery' is needed to avoid MTU issues with VLAN interfaces. The 
above worked fine under 10.1 - but the same config under 10.3:

 - Creates lagg0 correctly, and assigns the 172.16.50.1 IP to it
 - Creates lagg1 - and its VLANs
 - Does not assign 192.168.0.2 to lagg1 (it silently fails to - i.e. no 
errors logged / shown)

So when the system has finished booting you end up with:

  lagg0    = 172.16.50.1
  lagg1    = no IP assigned
  lagg1.30 = 192.168.200.2
  lagg1.35 = 192.168.210.2
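
A quick way to spot this state after boot, as a minimal sketch: the helper
below (no_inet is a made-up name, not an existing tool) reads ifconfig output
on stdin and prints each interface that has no "inet" line. It assumes the
usual layout where every interface block starts with an unindented
"name:" header.

```shell
#!/bin/sh
# Hypothetical helper: list interfaces with no IPv4 address.
# Reads `ifconfig` output on stdin; prints one interface name per line.
no_inet() {
    awk '
        /^[^ \t]/ {                 # unindented line starts a new interface block
            if (name != "" && !has) print name
            name = $1; sub(/:$/, "", name); has = 0
            next
        }
        $1 == "inet" { has = 1 }    # IPv4 address line inside the current block
        END { if (name != "" && !has) print name }
    '
}

# Usage on the affected box: ifconfig | no_inet
```

Interfaces that are not meant to carry an address (the lagg member ports, for
example) will be listed as well, so the output needs reading with that in mind.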

The other thing I've found is, once the box is up:

#ping 192.168.200.1
PING 192.168.200.1 (192.168.200.1): 56 data bytes
ping: sendto: Host is down
^C
--- 192.168.200.1 ping statistics ---
6 packets transmitted, 0 packets received, 100.0% packet loss

Hmm, not good. 192.168.200.1 is a host on the VLAN 30 network (and is up - 
I'm logged into it on another session). Same happens for the 
192.168.210.0/24 network.


Running tcpdump on 192.168.200.1 I see lots of:

11:31:52.956094 ARP, Request who-has 192.168.200.1 tell 192.168.200.2, length 46
11:31:52.956102 ARP, Reply 192.168.200.1 is-at x:x:x:x:x:x, length 28
11:31:53.969140 ARP, Request who-has 192.168.200.1 tell 192.168.200.2, length 46
11:31:53.969148 ARP, Reply 192.168.200.1 is-at x:x:x:x:x:x, length 28

Ok, so the other box can see the ARP requests from the 10.3 box - and 
issues a reply, but the 10.3 box can't "ping" it.


This gets increasingly weird if I run tcpdump on the 10.3 box. The act of 
running 'tcpdump -i lagg1.30 -n' actually fixes the problem:


#ping 192.168.200.1
PING 192.168.200.1 (192.168.200.1): 56 data bytes
64 bytes from 192.168.200.1: icmp_seq=0 ttl=64 time=0.257 ms
64 bytes from 192.168.200.1: icmp_seq=1 ttl=64 time=0.168 ms
64 bytes from 192.168.200.1: icmp_seq=2 ttl=64 time=0.320 ms

If I ctrl-c the tcpdump on the 10.3 box at this point - pings stop dead. 
Restart the tcpdump - pings resume.
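
By default tcpdump puts the interface into promiscuous mode, so one guess
(not confirmed here) is that promiscuous mode, rather than tcpdump itself, is
what makes the pings work. A minimal sketch for checking that follows; the
function names run and promisc_probe and the DRY_RUN variable are made up for
illustration, while "ifconfig <if> promisc" and "-promisc" are the standard
FreeBSD ifconfig parameters.

```shell
#!/bin/sh
# Toggle promiscuous mode on one interface and ping a peer in each state.
# With DRY_RUN=1 the commands are only printed, not executed.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then echo "+ $*"; else "$@"; fi
}

promisc_probe() {
    ifc=$1 peer=$2
    run ifconfig "$ifc" promisc     # the mode tcpdump enables by default
    run ping -c 3 "$peer"
    run ifconfig "$ifc" -promisc    # back to normal filtering
    run ping -c 3 "$peer"
}

# Usage (as root): promisc_probe lagg1.30 192.168.200.1
```

If the first ping succeeds and the second fails, that points at receive
filtering on the VLAN or lagg path. Running "tcpdump -p -i lagg1.30 -n" (the
-p flag asks tcpdump not to enable promiscuous mode) is another way to
separate the two effects.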


Restoring 10.1 on the box fixes this - but I'd obviously rather be using 
10.3 now.

Any ideas?

Thanks,

-Karl