Date: Thu, 6 May 2010 08:29:48 -0700
From: Ben Smith <bsmith@boltnet.com>
To: FreeBSD-gnats-submit@FreeBSD.org
Cc: bsmith@boltnet.com
Subject: amd64/146358: wrong destination MAC address
Message-ID: <20100506152948.GB19474@boltnet.com>
Resent-Message-ID: <201005061600.o46G0CTS066942@freefall.freebsd.org>

>Number:         146358
>Category:       amd64
>Synopsis:       wrong destination MAC address
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    freebsd-amd64
>State:          open
>Quarter:
>Keywords:
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Thu May 06 16:00:11 UTC 2010
>Closed-Date:
>Last-Modified:
>Originator:     bsmith@boltnet.com
>Release:        FreeBSD 8.0-RELEASE amd64
>Organization:   Boltnet, Inc.
>Environment:
System: FreeBSD gw2.edge2 8.0-RELEASE FreeBSD 8.0-RELEASE #1: Tue May 4 13:36:59 UTC 2010 root@gw2.edge2:/usr/obj/usr/src/sys/BOLTNET amd64

Manufacturer: Supermicro
Product Name: X8DTU

igb0: <Intel(R) PRO/1000 Network Connection version - 1.7.3> port 0xdc00-0xdc1f mem 0xfaee0000-0xfaefffff,0xfaec0000-0xfaedffff,0xfae9c000-0xfae9ffff irq 28 at device 0.0 on pci1
igb0: Using MSIX interrupts with 3 vectors
igb1: <Intel(R) PRO/1000 Network Connection version - 1.7.3> port 0xd880-0xd89f mem 0xfae60000-0xfae7ffff,0xfae40000-0xfae5ffff,0xfae98000-0xfae9bfff irq 40 at device 0.1 on pci1

*and*

gw1# uname -a
FreeBSD gw1.edge2 8.0-RELEASE FreeBSD 8.0-RELEASE #2: Wed May 5 16:07:01 UTC 2010 root@gw1.edge2:/usr/obj/usr/src/sys/BOLTNET amd64

igb0: <Intel(R) PRO/1000 Network Connection version - 1.7.3> port 0xdc00-0xdc1f mem 0xfaee0000-0xfaefffff,0xfaec0000-0xfaedffff,0xfae9c000-0xfae9ffff irq 28 at device 0.0 on pci1
igb1: <Intel(R) PRO/1000 Network Connection version - 1.7.3> port 0xd880-0xd89f mem 0xfae60000-0xfae7ffff,0xfae40000-0xfae5ffff,0xfae98000-0xfae9bfff irq 40 at device 0.1 on pci1
(same motherboard)

openbgpd-4.5.20090709 Free implementation of the Border Gateway Protocol, Version

>Description:
Sometimes when using openbgpd the kernel routing table becomes decoupled/corrupt. Packets destined for remote hosts are sent out the correct interface but with the wrong destination (next-hop gateway) MAC address.

gw1# bgpctl show ip bgp ben'sIP
flags destination         gateway          lpref   med aspath origin
*>    198.144.192.0/19    206.51.36.18       100     0 26769 22212 14743 22781 7961 i
*     198.144.192.0/19    206.51.37.5        100     0 14361 22212 14743 22781 7961 i
I*    198.144.192.0/19    10.3.4.2           100     0 3491 22212 14743 22781 7961 i
I*    198.144.192.0/19    10.3.5.2           100     0 3491 22212 14743 22781 7961 i

gw1# netstat -nr | grep ^198.144.192
198.144.192.0/19   206.51.36.18       UG1         8     1245  vlan2

gw1# tcpdump -nei vlan2 icmp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on vlan2, link-type EN10MB (Ethernet), capture size 96 bytes
13:46:41.265114 00:02:fc:08:d9:82 > 00:30:48:9f:56:16, ethertype IPv4 (0x0800), length 98: ben > gw1: ICMP echo request, id 60539, seq 76, length 64
13:46:41.265122 00:30:48:9f:56:16 > 00:30:48:9f:56:3a, ethertype IPv4 (0x0800), length 98: gw1 > ben: ICMP echo reply, id 60539, seq 76, length 64

gw1# arp -an | grep 206.51.36.18
? (206.51.36.18) at 00:19:2f:7a:0c:00 on vlan2 [vlan]
gw1# arp -an | grep 56:3a
? (10.3.4.2) at 00:30:48:9f:56:3a on vlan4 [vlan]

Another example:

gw2# tcpdump -nei vlan4 host inoc
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on vlan4, link-type EN10MB (Ethernet), capture size 96 bytes
06:05:29.371654 00:30:48:9f:56:16 > 00:30:48:9f:56:3b, ethertype IPv4 (0x0800), length 74: gw1.60029 > inoc.179: Flags [S], seq 50533919, win 65535, options [mss 1460,nop,wscale 3,sackOK,TS val 95326947 ecr 0], length 0
06:05:30.342159 00:30:48:9f:56:16 > 00:30:48:9f:56:3b, ethertype IPv4 (0x0800), length 62: gw1.64469 > inoc.179: Flags [S], seq 2800506931, win 65535, options [mss 1460,sackOK,eol], length 0
^C
2 packets captured
37 packets received by filter
0 packets dropped by kernel

gw2# ifconfig vlan4
vlan4: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        ether 00:30:48:9f:56:3a
        inet 10.3.4.2 netmask 0xffffff00 broadcast 10.3.4.255
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
        vlan: 4 parent interface: igb0
gw2# ifconfig vlan5
vlan5: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        ether 00:30:48:9f:56:3b
        inet 10.3.5.2 netmask 0xffffff00 broadcast 10.3.5.255
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
        vlan: 5 parent interface: igb1

gw1# bgpctl show ip bgp inoc'sIP
flags: * = Valid, > = Selected, I = via IBGP, A = Announced
origin: i = IGP, e = EGP, ? = Incomplete

flags destination         gateway          lpref   med aspath origin
I*>   208.70.82.0/23      10.3.4.2           100     0 3356 4150 15176 i
I*    208.70.82.0/23      10.3.5.2           100     0 3356 4150 15176 i
      208.70.82.0/23      206.51.36.13       100  2654 3549 1239 4150 15176 i
      208.70.82.0/23      206.51.37.5        100     0 14361 3356 4150 15176 i

gw1# !netst
netstat -nr | grep default
default            206.51.36.4        UGS         1    64713  vlan2
gw1# netstat -nr|grep ^208.70.82
208.70.82.0/24     10.3.4.2           UG1         0        0  vlan4 =>
208.70.82.0/23     10.3.4.2           UG1         6     2706  vlan4

gw1# !teln
telnet inoc 179
Trying inoc...

gw1# tcpdump -nei vlan4 host inoc
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on vlan4, link-type EN10MB (Ethernet), capture size 96 bytes
06:04:52.603694 00:30:48:9f:56:16 > 00:30:48:9f:56:3b, ethertype IPv4 (0x0800), length 74: gw1.54513 > inoc.179: Flags [S], seq 2254251043, win 65535, options [mss 1460,nop,wscale 3,sackOK,TS val 95545315 ecr 0], length 0

gw1# arp -an|grep 56:3b
? (10.3.5.2) at 00:30:48:9f:56:3b on vlan5 [vlan]

gw1# sysctl -a | grep redir
net.inet.ip.redirect: 0
net.inet.icmp.log_redirect: 1
net.inet.icmp.drop_redirect: 1

After stopping openbgpd:

gw1# netstat -nr | grep ^208.70.82
208.70.82.0/24     10.3.4.2           UG1         0        0  vlan4 =>
208.70.82.0/23     10.3.4.2           UG1         6     2753  vlan4

These machines have a lot of routes:

gw1# bgpctl show
Neighbor          AS    MsgRcvd    MsgSent  OutQ  Up/Down   State/PrfRcvd
app1_vlan160   65030       2739       2741     0  22:04:02       5
app1_vlan120   65030       2746       2739     0  22:04:01       8
inoc           65534       3483     379202     0  1d05h10m       0
gw2_vlan5      33616     166248     244942     0  1d05h17m  224198
gw2_vlan4      33616     166246     244935     0  1d05h17m  224198
EquinixB       25658     302028       3517     0  1d05h15m  318387
EquinixA       25658     299936       3517     0  1d05h15m  318386

which may be a factor.

This doesn't seem to happen all of the time. A reboot will fix it; if it breaks, it seems to break within a few minutes of coming online. It also doesn't break for all traffic: some traffic with the same next hop will work.

Even after /usr/local/etc/rc.d/openbgpd stop, the routes stay:

gw1# netstat -nr | grep ^208.70.82
208.70.82.0/24     10.3.4.2           UG1         0        0  vlan4 =>
208.70.82.0/23     10.3.4.2           UG1         6     2753  vlan4

>How-To-Repeat:
Run openbgpd with several peers and reboot the test machine a couple of times. Sometimes when it comes up it can no longer reach certain eBGP multihop peers because of this.

>Fix:
Rebooting (and probably also a route flush?) will clear the bad routes, and hopefully the problem doesn't happen again.

>Release-Note:
>Audit-Trail:
>Unformatted:
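
The manual check done above (compare the destination MAC seen by `tcpdump -e` with the ARP entry for the route's gateway) could be scripted roughly as follows. This is only a sketch, not part of the original report: the function names (`parse_arp`, `frame_dst_mac`, `check_next_hop`) are made up for illustration, and the regular expressions assume the `arp -an` and `tcpdump -ne` output formats shown in this report. The sample data is copied from the gw1/vlan2 capture above.

```python
import re

# Assumed formats, taken from the output pasted in this report:
#   arp -an     -> "? (206.51.36.18) at 00:19:2f:7a:0c:00 on vlan2 [vlan]"
#   tcpdump -ne -> "<time> <src mac> > <dst mac>, ethertype IPv4 ..."
ARP_RE = re.compile(r"\((?P<ip>[0-9.]+)\) at (?P<mac>[0-9a-f:]{17}) on (?P<ifname>\S+)")
FRAME_RE = re.compile(r"(?P<src>[0-9a-f:]{17}) > (?P<dst>[0-9a-f:]{17}),")


def parse_arp(text):
    """Map IP address -> (MAC, interface) from `arp -an` output."""
    return {m.group("ip"): (m.group("mac"), m.group("ifname"))
            for m in ARP_RE.finditer(text)}


def frame_dst_mac(line):
    """Return the destination MAC of one `tcpdump -e` line, or None."""
    m = FRAME_RE.search(line)
    return m.group("dst") if m else None


def check_next_hop(gateway, arp_text, tcpdump_line):
    """Compare the frame's destination MAC with the gateway's ARP entry.

    Returns (ok, expected_mac, seen_mac, owner_ip), where owner_ip is the
    ARP-table IP that actually owns the MAC seen on the wire, if any.
    """
    arp = parse_arp(arp_text)
    expected = arp.get(gateway, (None, None))[0]
    seen = frame_dst_mac(tcpdump_line)
    owner = next((ip for ip, (mac, _) in arp.items() if mac == seen), None)
    return expected == seen, expected, seen, owner


# Sample data copied from the gw1 output in this report.
ARP_TEXT = """\
? (206.51.36.18) at 00:19:2f:7a:0c:00 on vlan2 [vlan]
? (10.3.4.2) at 00:30:48:9f:56:3a on vlan4 [vlan]
"""
FRAME = ("13:46:41.265122 00:30:48:9f:56:16 > 00:30:48:9f:56:3a, "
         "ethertype IPv4 (0x0800), length 98: gw1 > ben: ICMP echo reply, "
         "id 60539, seq 76, length 64")

if __name__ == "__main__":
    # 206.51.36.18 is the gateway netstat reports for 198.144.192.0/19.
    print(check_next_hop("206.51.36.18", ARP_TEXT, FRAME))
```

For the capture above, the check reports a mismatch: the route's gateway is 206.51.36.18, but the frame is addressed to the MAC that the ARP table lists for 10.3.4.2 on vlan4.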