Date: Fri, 16 Jan 2004 17:28:30 -0800 (PST)
From: Kelly Yancey
To: net@freebsd.org
In-Reply-To: <20040113130721.U15761-102000@alicia.nttmcl.com>
Message-ID: <20040116163017.G1217-100000@alicia.nttmcl.com>
Subject: Re: bge data corruption bug (was: 1168 octets payload and bad TCP checksums)

On Tue, 13 Jan 2004, Kelly Yancey wrote:

>
> On Fri, 2 Jan 2004, Kelly Yancey wrote:
>
> >
> > We've got Broadcom BCM5701 cards configured for vlan tagging on a
> > FreeBSD 4.7 based router; a vlan switch then terminates the trunked
> > segment and splits it into separate physical subnets. It turns out that
> > hosts on those segments cannot receive TCP packets with precisely 1168
> > octets of payload (ethernet frame size 1222 octets) as the checksum is
> > always incorrect.
> > We've manually backported all of the bge driver updates
> > from 4-stable, but to no avail.
> > What is particularly odd is that the checksums are always wrong by the
> > same amount: 0xAC48 (the dump below only shows retries of the same
> > packet, but the difference is the same even for other packets).
> > Furthermore, it appears only TCP packets with 1168 octets of data are
> > affected. I cannot easily create an environment without the vlans to
> > determine whether or not tagging is related. Note also that the IP
> > checksum is correct.
> >
>
> First, one slight clarification to my original posting: the received
> frame, after vlan untagging, is 1222 octets; the sent frame includes a tag
> so it is 1226 octets.
>
> Anyway, it appears that the cause of the bad checksums is that the last
> dword of the transmitted frame is getting corrupted in hardware.
>
[ .. snip .. ]
> So far, we have only been able to reproduce the problem with TCP packets
> with 1168 octets of payload, using vlan tagging on the bge interface.

[ .. snip .. ]

  Final update, just for the record: it turns out that, after adjusting for
the difference in header sizes, the bug is easily reproducible using ping
with 1177 to 1180 bytes of payload. So, it isn't just TCP, and it isn't just
1222 byte (1226 with vlan tag) ethernet frames. It is a definite 4-byte
window of 1219 to 1222 byte frames (sizes before the vlan tag is added).
  Furthermore, the corruption is caused by the hardware apparently copying
the dword 3rd from the end of the packet into the last dword of the frame.
You can see this in the dumps in my previous posting, but using ping makes
the problem really stand out. For example, the server sends an ICMP echo
request which ends with:

# tcpdump -Xx -s 4000 -pni vlan9 icmp
[ snip ]
0x04a0   8485 8687 8889 8a8b 8c8d 8e8f 9091 9293        ................
0x04b0   9495 9697 9899 9a9b                            ........

  Then the client receives:

# tcpdump -Xx -s 4000 -pni an0 icmp
[ snip ]
0x04a0   8485 8687 8889 8a8b 8c8d 8e8f 9091 9293        ................
0x04b0   9495 9697 9091 9293                            ........

  I've verified this with different clients, running both FreeBSD and
Windows, and using different NICs on the client side. Swapping out the bge
interface for one supported by the sk or em driver solves the problem.

  The workaround that we have found for the bge interface is to simply set
the LINK0 flag on the vlan interfaces. I guess something about letting the
hardware add the vlan tag keeps it from mangling our packets. This means
that the bug only affects -stable, as sam's 1.44 delta avoids the issue on
FreeBSD 5.0 and higher.

  In any event, we have our solution; if anyone else out there is using a
bge card as a vlan parent interface on a 4.x box, consider yourself warned:
enable LINK0 or face seemingly random data corruption.

  Kelly

--
Kelly Yancey  -  kbyanc@{posi.net,FreeBSD.org}  -  kelly@nttmcl.com
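
  For anyone who wants to check the numbers in this report, here is a
minimal Python sketch of the frame-size arithmetic and of the corruption
pattern seen in the dumps. It is not part of the original analysis. It
assumes a 14-byte ethernet header (no FCS counted), a 20-byte IPv4 header
without options, an 8-byte ICMP header and a 4-byte 802.1Q tag. The
function names frame_sizes and corrupt_tail are made up for illustration,
and corrupt_tail only models what the dumps suggest, not what the BCM5701
does internally.

```python
ETH_HDR = 14   # assumed: ethernet header, FCS not counted
VLAN_TAG = 4   # assumed: 802.1Q tag inserted on the trunk
IP_HDR = 20    # assumed: IPv4 header without options
ICMP_HDR = 8   # ICMP echo header


def frame_sizes(icmp_payload):
    """Return (untagged, tagged) ethernet frame sizes for a ping payload."""
    untagged = ETH_HDR + IP_HDR + ICMP_HDR + icmp_payload
    return untagged, untagged + VLAN_TAG


def corrupt_tail(data):
    """Model of the observed corruption (hypothetical helper): the last
    dword is overwritten with the dword third from the end."""
    if len(data) < 12:
        raise ValueError("need at least three dwords")
    return data[:-4] + data[-12:-8]


# Affected ping payload sizes from the report and the resulting frame sizes.
for payload in range(1177, 1181):
    untagged, tagged = frame_sizes(payload)
    print(payload, untagged, tagged)

# Last 24 bytes of the echo request as sent (0x84 .. 0x9b in the dump).
sent_tail = bytes(range(0x84, 0x9C))
recv_tail = corrupt_tail(sent_tail)
print(sent_tail[-8:].hex(), "->", recv_tail[-8:].hex())
```

  With these assumptions, payloads of 1177 to 1180 bytes map onto the 1219
to 1222 byte window (1223 to 1226 with the tag), and the modeled tail
matches the last row of the client-side dump.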