Date: Fri, 22 Feb 2008 10:43:22 +0200
From: Ian FREISLICH <ianf@clue.co.za>
To: pyunyh@gmail.com
Cc: FreeBSD Current <freebsd-current@freebsd.org>, Robert Backhaus <robbak@robbak.com>
Subject: Re: Packet corruption in re0
Message-ID: <E1JSTV4-0000l2-EE@clue.co.za>
In-Reply-To: Message from Pyun YongHyeon <pyunyh@gmail.com> of "Fri, 22 Feb 2008 13:27:00 +0900." <20080222042700.GB30497@cdnetworks.co.kr>

Pyun YongHyeon wrote:
> On Thu, Feb 21, 2008 at 01:18:18PM +0200, Ian FREISLICH wrote:
> > Pyun YongHyeon wrote:
> > > On Thu, Feb 21, 2008 at 02:47:43PM +1000, Robert Backhaus wrote:
> > > > On Thu, Feb 21, 2008 at 1:50 PM, Pyun YongHyeon <pyunyh@gmail.com> wrote:
> > > > > On Thu, Feb 21, 2008 at 11:03:02AM +1000, Robert Backhaus wrote:
> > > > > > I am experiencing roughly 15% packet corruption on the re interface on
> > > > > > my freebsd 7/amd64 box.
> > > > > >
> > > > > > FreeBSD gw.flexi.robbak.com 7.0-PRERELEASE FreeBSD 7.0-PRERELEASE #8:
> > > > > > Tue Feb 5 09:49:55 EST 2008
> > > > > > root@gw.flexi.robbak.com:/usr/obj/usr/src/sys/GW amd64
> > > > > >
> > > > > > Just to make troubleshooting difficult, this problem only shows up
> > > > > > after the system has been up for roughly 36 hours, depending on the
> > > > > > amount of traffic.
> > > > >
> > > > > I didn't take a look attached tcpdump files but I guess the
> > > > > instability issue was fixed in HEAD. It's not yet MFCed but
> > > > > I'll handle it in a week.
> > > > >
> > > > > Would you try re(4) in HEAD?
> > > >
> > > > OK, I'll do that. What is the best way to do that? csupping to "." seems a
> > > > bit drastic, and I don't do much with cvs proper. I take it that I should use
> > > > anon-cvs to grab the directory, but I don't quite know how.
> > >
> > > Copy sys/dev/re/if_re.c, sys/pci/if_rlreg.h in HEAD to your box.
> > > Due to lack of m_defrag(9) in 7-PRERELEASE/RC, you also have to add
> > > that function to if_re.c (Copy m_defrag() in sys/kern/uipc_mbuf.c on
> > > HEAD/RELENG_7 to if_re.c). That would make it build on your box.
> >
> > This doesn't solve the problem that I'm seeing on re(4) interfaces.
> > It basically shows up as quagga establishing OSPF neighbours as
> > "Exchange/DR" when VLAN hardware tagging is enabled. I'm running
> > OSPF over 802.1Q vlans. Neighbours are correctly negotiated once
> > VLAN hardware tagging is disabled on the interface.
> >
> > I'll do more debugging.
>
> Hmm. That sounds like different issue to me. I guess I didn't change
> any semantics in VLAN H/W tagging. Do you still see the same VLAN H/W
> tagging related issues on RELENG_7?
>
> To narrow down the issue it would be even better to know which parts
> of H/W assistance was broken. For example,
> - Disable checksum offload for VLAN interface first and check
>   whether quagga works.

You can only disable offload on the parent interface.

> - Disable checksum offload for parent interface and check again.
> If you can post tcpdump output for broken connection it may help a
> lot to diagnose the issue.

The only flag affecting this behaviour is vlanhwtag. Various
permutations of the interface flags make no difference to this
behaviour as long as hardware tagging is enabled. It seems like
it's corrupting large packets on transmit when vlanhwtag is enabled.
From the tcpdump output it looks like a padding or packet length
issue.

Here's what tcpdump on the re(4) device thinks it's transmitting:

00:08:a1:3c:32:9c > 00:90:fb:0c:89:7d, ethertype 802.1Q (0x8100), length 1510: vlan 1000, p 0, ethertype IPv4, 196.22.138.92 > 196.22.138.89: OSPFv2, Database Description, length: 1472

Here's what was actually received by the em(4) device on the
neighbour. Note the absence of the 802.1Q header:

00:08:a1:3c:32:9c > 00:90:fb:0c:89:7d, ethertype IPv4 (0x0800), length 1506: 196.22.138.92 > 196.22.138.89: OSPFv2, Database Description, length: 1472

When vlanhwtag is disabled, the re(4) device transmits:

00:90:fb:0c:89:7d > 00:08:a1:3c:32:9c, ethertype 802.1Q (0x8100), length 1510: vlan 1000, p 0, ethertype IPv4, 196.22.138.89 > 196.22.138.92: OSPFv2, Database Description, length: 1472

and the em(4) device receives:

00:08:a1:3c:32:9c > 00:90:fb:0c:89:7d, ethertype 802.1Q (0x8100), length 1510: vlan 1000, p 0, ethertype IPv4, 196.22.138.92 > 196.22.138.89: OSPFv2, Database Description, length: 1472

Let me know if you need more detailed tcpdump output than I've
provided.

Ian

--
Ian Freislich
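
With hardware tagging enabled, the two captures differ by exactly 4 bytes (1510 versus 1506), which is the size of one 802.1Q tag. The sketch below is only an illustration of that arithmetic, assuming an all-zero stand-in payload of 1492 bytes (a 20-byte IPv4 header plus the 1472 bytes of OSPF reported by tcpdump). The helper names are hypothetical, and it models the frame layout only, not the re(4) driver or hardware.

```python
import struct

ETHERTYPE_VLAN = 0x8100  # 802.1Q tag protocol identifier
ETHERTYPE_IPV4 = 0x0800


def mac(text):
    """Convert 'aa:bb:cc:dd:ee:ff' to 6 raw bytes."""
    return bytes.fromhex(text.replace(":", ""))


def build_tagged_frame(dst, src, vlan_id, priority, inner_ethertype, payload):
    """Build an Ethernet frame carrying one 802.1Q tag (hypothetical helper)."""
    tci = (priority << 13) | (vlan_id & 0x0FFF)  # PCP in top 3 bits, VID in low 12
    tag_and_type = struct.pack("!HHH", ETHERTYPE_VLAN, tci, inner_ethertype)
    return dst + src + tag_and_type + payload


def strip_tag(frame):
    """Drop the 4-byte 802.1Q tag (TPID + TCI) that follows the two MAC addresses."""
    return frame[:12] + frame[16:]


def describe(frame):
    """Return length, outer ethertype and VLAN ID (None if untagged)."""
    ethertype = struct.unpack("!H", frame[12:14])[0]
    vlan = None
    if ethertype == ETHERTYPE_VLAN:
        tci = struct.unpack("!H", frame[14:16])[0]
        vlan = tci & 0x0FFF
    return {"length": len(frame), "ethertype": ethertype, "vlan": vlan}


# Assumption: placeholder payload sized as 20-byte IPv4 header + 1472-byte OSPF packet.
payload = bytes(20 + 1472)

sent = build_tagged_frame(
    dst=mac("00:90:fb:0c:89:7d"),
    src=mac("00:08:a1:3c:32:9c"),
    vlan_id=1000,
    priority=0,
    inner_ethertype=ETHERTYPE_IPV4,
    payload=payload,
)
received = strip_tag(sent)

print(describe(sent))      # tagged frame, as tcpdump on re(4) reports it
print(describe(received))  # same frame with the tag missing, as em(4) saw it
```

The tagged frame comes out at 1510 bytes with ethertype 0x8100 and VLAN 1000, and the stripped one at 1506 bytes with ethertype 0x0800, matching the lengths and ethertypes in the captures above. That is consistent with the frame leaving the wire without its tag rather than with the payload itself being altered, although the captures alone do not show where the tag is lost.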
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?E1JSTV4-0000l2-EE>