From: Sam Leffler
Date: Sat, 23 Feb 2008 10:35:57 -0800
To: pyunyh@gmail.com
Cc: Ian FREISLICH, Robert Backhaus, FreeBSD Current
Subject: Re: Packet corruption in re0 [checksum offloading]
Message-ID: <47C0678D.20905@errno.com>
In-Reply-To: <20080222094742.GF30497@cdnetworks.co.kr>

Pyun YongHyeon wrote:
> On Fri, Feb 22, 2008 at 10:43:22AM +0200, Ian FREISLICH wrote:
> > Pyun YongHyeon wrote:
> > > On Thu, Feb 21, 2008 at 01:18:18PM +0200, Ian FREISLICH wrote:
> > > > Pyun YongHyeon wrote:
> > > > > On Thu, Feb 21, 2008 at 02:47:43PM +1000, Robert Backhaus wrote:
> > > > > > On Thu, Feb 21, 2008 at 1:50 PM, Pyun YongHyeon wrote:
> > > > > > > On Thu, Feb 21, 2008 at 11:03:02AM +1000, Robert Backhaus wrote:
> > > > > > > > I am experiencing roughly 15% packet corruption on the re
> > > > > > > > interface on my freebsd 7/amd64 box.
> > > > > > > >
> > > > > > > > FreeBSD gw.flexi.robbak.com 7.0-PRERELEASE FreeBSD 7.0-PRERELEASE #8:
> > > > > > > > Tue Feb 5 09:49:55 EST 2008
> > > > > > > > root@gw.flexi.robbak.com:/usr/obj/usr/src/sys/GW amd64
> > > > > > > >
> > > > > > > > Just to make troubleshooting difficult, this problem only
> > > > > > > > shows up after the system has been up for roughly 36 hours,
> > > > > > > > depending on the amount of traffic.
> > > > > > > >
> > > > > > >
> > > > > > > I didn't take a look at the attached tcpdump files but I guess
> > > > > > > the instability issue was fixed in HEAD. It's not yet MFCed but
> > > > > > > I'll handle it in a week.
> > > > > > >
> > > > > > > Would you try re(4) in HEAD?
> > > > > > >
> > > > > >
> > > > > > OK, I'll do that. What is the best way to do that? csupping to "."
> > > > > > seems a bit drastic, and I don't do much with cvs proper. I take it
> > > > > > that I should use anon-cvs to grab the directory, but I don't quite
> > > > > > know how.
> > > > > >
> > > > >
> > > > > Copy sys/dev/re/if_re.c, sys/pci/if_rlreg.h in HEAD to your box.
> > > > > Due to lack of m_defrag(9) in 7-PRERELEASE/RC, you also have to add
> > > > > that function to if_re.c (copy m_defrag() in sys/kern/uipc_mbuf.c on
> > > > > HEAD/RELENG_7 to if_re.c). That would make it build on your box.
> > > >
> > > > This doesn't solve the problem that I'm seeing on re(4) interfaces.
> > > > It basically shows up as quagga establishing OSPF neighbours as
> > > > "Exchange/DR" when VLAN hardware tagging is enabled. I'm running
> > > > OSPF over 802.1Q vlans. Neighbours are correctly negotiated once
> > > > VLAN hardware tagging is disabled on the interface.
> > > >
> > > > I'll do more debugging.
> > > >
> > >
> > > Hmm.
> > > That sounds like a different issue to me. I guess I didn't change
> > > any semantics in VLAN H/W tagging. Do you still see the same VLAN
> > > H/W tagging related issues on RELENG_7?
> > >
> > > To narrow down the issue it would be even better to know which
> > > parts of H/W assistance were broken. For example,
> > > - Disable checksum offload for VLAN interface first and check
> > >   whether quagga works.
> >
> > You can only disable offload on the parent interface.
> >
>
> Hmm... I thought it should work.
> I have no idea why the ioctl handler of vlan(4) rejects checksum
> offload configuration. I guess vlan(4) should be taught to handle
> this. If the parent interface has the IFCAP_VLAN_HWCSUM and
> IFCAP_VLAN_HWTAGGING capabilities, ifconfig(8) should be able to
> control checksum offload for the vlan(4) interface. CCed to yar to
> get his opinions on controlling checksum offload on vlan(4).
>
> > > - Disable checksum offload for parent interface and check again.
> > > If you can post tcpdump output for the broken connection it may
> > > help a lot to diagnose the issue.
> >
> > The only flag affecting this behaviour is vlanhwtag. Various
> > permutations of the interface flags make no difference to this
> > behaviour as long as hardware tagging is enabled.
> >
>
> Disabling VLAN HW tagging also turns off checksum offload on the
> vlan(4) interface.
>

This reminds me that there are several places in the system where h/w
checksum offload needs to be specially handled but instead is disabled
as a workaround. In particular I'm thinking of the bridge, where txcsum
is muted on devices while they are plumbed. But this can be a big loss,
and the better approach (IMO) is to fill in the missing capability in
s/w. Not sure what components there are besides bridge and vlan; maybe
lagg? netgraph? Note there are other capabilities besides checksum
offload; TSO, for example, can be done in s/w with good effect.

	Sam
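
To make "fill in the missing capability in s/w" concrete: the core of what
txcsum offload computes is the 16-bit one's-complement Internet checksum
(RFC 1071). Below is a minimal userland sketch of that computation over a
flat buffer. It is not the kernel's in_cksum() and not code from this
thread; the name sw_in_cksum is hypothetical, and a real fallback would
have to walk an mbuf chain and cover the TCP/UDP pseudo-header as well.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a FreeBSD kernel API): compute the 16-bit
 * one's-complement Internet checksum (RFC 1071) over a flat buffer.
 * The result is returned in host order, as if the buffer had been
 * read as big-endian 16-bit words.
 */
uint16_t
sw_in_cksum(const uint8_t *data, size_t len)
{
	uint64_t sum = 0;

	assert(data != NULL || len == 0);

	/* Add up the buffer as 16-bit big-endian words. */
	while (len > 1) {
		sum += ((uint32_t)data[0] << 8) | data[1];
		data += 2;
		len -= 2;
	}

	/* A trailing odd byte is padded with a zero byte on the right. */
	if (len == 1)
		sum += (uint32_t)data[0] << 8;

	/* Fold the carries back into the low 16 bits. */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);

	/* The checksum is the one's complement of the folded sum. */
	return (uint16_t)~sum;
}
```

A sender would run this over the header with the checksum field zeroed
and store the result in that field; a receiver running it over the
header with the checksum in place gets 0 for an intact header.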