Date: Thu, 23 Jun 2016 11:53:47 +0100
From: Karl Pielorz <kpielorz_lst@tdx.co.uk>
To: freebsd-net@FreeBSD.org
Subject: Problem with VLAN config and traffic after 10.1-R -> 10.3-R-p5 Upgrade?
Message-ID: <2ED5D9FEB55641BF734C14F3@[10.12.30.106]>

Hi,

We're in the process of updating our boxes from 10.1 to 10.3. This has gone OK for the simpler cases - but I seem to have found a couple of issues with the way 10.3 handles both configuring VLANs and actual traffic on VLANs.

On our box to be upgraded, our /etc/rc.conf has:

    cloned_interfaces="lagg0 lagg1 lagg1.30 lagg1.35"
    ifconfig_bge0="up"
    ifconfig_bge1="up"
    ifconfig_lagg0="laggproto failover laggport bge0 laggport bge1 172.16.50.1 netmask 255.255.255.0"
    ifconfig_em3="mtu 1504 up"
    ifconfig_em0="mtu 1504 up"
    ifconfig_lagg1="laggproto failover laggport em3 laggport em0 192.168.0.2 netmask 255.255.255.0 mtu 1504"
    ifconfig_lagg1_30="inet 192.168.200.2 netmask 255.255.255.0 mtu 1500"
    ifconfig_lagg1_35="inet 192.168.210.2 netmask 255.255.255.0 mtu 1500"

The mtu 'hackery' is needed to avoid MTU issues with VLAN interfaces.

The above worked fine under 10.1 - but the same config under 10.3:

 - Creates lagg0 correctly, and assigns the 172.16.50.1 IP to it
 - Creates lagg1 - and its VLANs
 - Does not assign 192.168.0.2 to lagg1 (it fails silently - i.e. no errors logged / shown)

So when the system has finished booting you end up with:

    lagg0    = 172.16.50.1
    lagg1    = no IP assigned
    lagg1.30 = 192.168.200.2
    lagg1.35 = 192.168.210.2

The other thing I've found is, once the box is up:

    #ping 192.168.200.1
    PING 192.168.200.1 (192.168.200.1): 56 data bytes
    ping: sendto: Host is down
    ^C
    --- 192.168.200.1 ping statistics ---
    6 packets transmitted, 0 packets received, 100.0% packet loss

Hmm, not good. 192.168.200.1 is a host on the VLAN 30 network (and is up - I'm logged into it on another session). Same happens for the 192.168.210.0/24 network.

Running tcpdump on 192.168.200.1 I see lots of:

    11:31:52.956094 ARP, Request who-has 192.168.200.1 tell 192.168.200.2, length 46
    11:31:52.956102 ARP, Reply 192.168.200.1 is-at x:x:x:x:x:x, length 28
    11:31:53.969140 ARP, Request who-has 192.168.200.1 tell 192.168.200.2, length 46
    11:31:53.969148 ARP, Reply 192.168.200.1 is-at x:x:x:x:x:x, length 28

OK, so the other box can see the ARP requests from the 10.3 box - and issues a reply, but the 10.3 box can't "ping" it.

This gets increasingly weird if I run tcpdump on the 10.3 box. The act of running 'tcpdump -i lagg1.30 -n' actually fixes the problem:

    #ping 192.168.200.1
    PING 192.168.200.1 (192.168.200.1): 56 data bytes
    64 bytes from 192.168.200.1: icmp_seq=0 ttl=64 time=0.257 ms
    64 bytes from 192.168.200.1: icmp_seq=1 ttl=64 time=0.168 ms
    64 bytes from 192.168.200.1: icmp_seq=2 ttl=64 time=0.320 ms

If I ctrl-c the tcpdump on the 10.3 box at this point - pings stop dead. Restart the tcpdump - pings resume.

Restoring 10.1 on the box fixes this - but I'd obviously rather be using 10.3 now.

Any ideas?

Thanks,

-Karl
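
A possible way to narrow this down, offered as a sketch and not something tried in the report above: tcpdump puts the interface into promiscuous mode by default, so the script below checks whether promiscuous mode alone is what makes the pings work. The interface and peer address are taken from the message; the DRY_RUN switch and the run/promisc_test helpers are hypothetical names added here for illustration. By default it only prints the commands.

```shell
#!/bin/sh
# Sketch only: test whether promiscuous mode by itself "fixes" the VLAN ping.
# IFACE and PEER come from the report; DRY_RUN is a hypothetical safety switch
# (1 = just print the commands, 0 = actually run them, which needs root).
IFACE="${IFACE:-lagg1.30}"
PEER="${PEER:-192.168.200.1}"
DRY_RUN="${DRY_RUN:-1}"

run() {
    # Show the command, then execute it only when DRY_RUN=0.
    printf '%s\n' "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

promisc_test() {
    run ping -c 3 "$PEER"            # baseline, expected to fail per the report
    run ifconfig "$IFACE" promisc    # promiscuous mode without tcpdump
    run ping -c 3 "$PEER"            # does the ping work now?
    run ifconfig "$IFACE" -promisc   # put the interface back as it was
}

promisc_test
```

If the second ping succeeds, the behaviour is tied to promiscuous mode and not to tcpdump itself. A complementary check is to run 'tcpdump -p -i lagg1.30 -n' (the -p flag asks tcpdump not to enable promiscuous mode) and see whether pings still fail while it is running.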