Date:      Wed, 24 Jan 2007 19:02:03 +0000 (GMT)
From:      wpaul@FreeBSD.ORG (Bill Paul)
To:        r.c.ladan@gmail.com (Rene Ladan)
Cc:        pyunyh@gmail.com, freebsd-current@freebsd.org
Subject:   Re: Call for re(4) checksum offload testers.
Message-ID:  <20070124190203.E80FB16A403@hub.freebsd.org>
In-Reply-To: <45B736DE.1000100@gmail.com> from Rene Ladan at "Jan 24, 2007 11:37:18 am"

--ELM829511406-21919-0_
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit

> Bill Paul wrote:
> [...]
> 
> > I'm very confused as to why the chip botches the TX checksumming in
> > this case. Unfortunately, most of this confusion stems from the fact
> > that you didn't specify exactly which chip rev the user with this
> > problem has, or give a test case to trip the bug.
> > 
> I am that user, using this card, found in Asus A6JE laptops.  From pciconf:
> 
> card:	class=0x020000 card=0x11f51043 chip=0x816810ec rev=0x01 hdr=0x00
> 	vendor=Realtek Semiconductor
> 	device=RTL8168/8111 PCI-E Gigabit Ethernet NIC
> 
> > I'm assuming this is yet another problem with small IP fragments being
> > mangled. That being the case, it should be possible to trip the bug
> > with "ping -s 1473 <somehost>." (1473 is 1 byte too large to fit into
> > a 1500 byte frame, which will cause a 1 byte fragment to be sent.)
> > I thought I tested this with my sample PCIe cards though, and didn't
> > see a problem. I'll have to try it again tomorrow.
> > 
> ping -s 1473 <NAT box> succeeds both with and without the patch (i.e.
> ping gives timings), I've included two tcpdumps for further analysis.

Unfortunately, these packet dumps don't help me: I need a packet dump
that shows the failure, and these don't.
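
For reference, the arithmetic behind the "ping -s 1473" test quoted above
can be sketched as follows. This is only an illustration, assuming a
20 byte IPv4 header with no options, an 8 byte ICMP echo header and a
1500 byte MTU; the function name and constants are made up for the
example and do not come from the driver.

```c
#include <assert.h>
#include <stddef.h>

enum {
	SK_IP_HDR_LEN   = 20,	/* assumed: IPv4 header without options */
	SK_ICMP_HDR_LEN = 8	/* assumed: ICMP echo header */
};

/*
 * Bytes of IP payload carried by the last fragment when an ICMP echo
 * with ping_size data bytes is sent over a link with the given MTU.
 * Returns 0 if the datagram fits without being fragmented.
 */
static size_t
last_fragment_payload(size_t ping_size, size_t mtu)
{
	size_t ip_payload = SK_ICMP_HDR_LEN + ping_size;
	size_t per_frag, rem;

	/* A fragment must be able to hold the header plus 8 data bytes. */
	assert(mtu >= SK_IP_HDR_LEN + 8);

	if (SK_IP_HDR_LEN + ip_payload <= mtu)
		return (0);

	/* Every fragment except the last carries a multiple of 8 bytes. */
	per_frag = ((mtu - SK_IP_HDR_LEN) / 8) * 8;
	rem = ip_payload % per_frag;
	return (rem != 0 ? rem : per_frag);
}
```

With a 1500 byte MTU, last_fragment_payload(1473, 1500) is 1, i.e. the
1 byte fragment mentioned above, while a 1472 byte ping still fits in a
single datagram.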

> The bug is visible when logging in to sites such as gmail.com or
> nl.bol.com (a Dutch shopping site), or when connecting Thunderbird to
> pop.gmail.com (which uses POP3 with SSL)

Hm. Ok, apparently the TCP segments that cause the problem look like
this:

10:41:54.607019 00:03:47:a6:3f:c0 > 00:00:0c:07:ac:2e, ethertype IPv4 (0x0800),
length 54: 147.11.46.221.63693 > 216.239.57.83.80: . ack 1 win 65535

I captured this by doing 'telnet gmail.com 80' from my system at work.
I contrived a quick test where I wrote a small routine to send a packet
with exactly these contents and duplicated the problem with my sample
8111B/8168B card (the frame isn't mangled as badly as the small IP
fragment case, but the TCP checksum is wrong). The RTL8101E (10/100) PCIe
adapter also botches the checksum in the same way. The earlier PCI cards
do not.
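
To see why a bare ACK like the one above is affected by the padding
code at all, here is a rough sketch of the size arithmetic. It assumes
a 14 byte Ethernet header, IPv4 and TCP headers without options, and a
60 byte minimum frame length (which is what I believe RL_MIN_FRAMELEN
is set to; treat that value as an assumption). All names are
placeholders for the example.

```c
#include <assert.h>
#include <stddef.h>

enum {
	SK_ETHER_HDR_LEN = 14,	/* Ethernet header, no VLAN tag */
	SK_IP_HDR_LEN    = 20,	/* assumed: IPv4 header without options */
	SK_TCP_HDR_LEN   = 20,	/* assumed: TCP header without options */
	SK_MIN_FRAMELEN  = 60	/* assumed stand-in for RL_MIN_FRAMELEN */
};

/* Length of a TCP segment with no payload, as it appears on the wire. */
static size_t
bare_tcp_frame_len(void)
{
	return (SK_ETHER_HDR_LEN + SK_IP_HDR_LEN + SK_TCP_HDR_LEN);
}

/* Number of pad bytes needed to reach the minimum frame length. */
static size_t
pad_bytes_needed(size_t frame_len)
{
	/* Anything shorter than an Ethernet header is not a frame. */
	assert(frame_len >= SK_ETHER_HDR_LEN);

	return (frame_len < SK_MIN_FRAMELEN ?
	    SK_MIN_FRAMELEN - frame_len : 0);
}
```

Under these assumptions bare_tcp_frame_len() is 54, matching the
"length 54" in the tcpdump line, so such a segment is 6 bytes short of
the minimum and takes the software padding path.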

Based on testing with my sample adapters, I think the right thing to do
is skip the software padding in the TCP case. It appears that even
the older 8169 adapters that botch the small IP fragment case will correctly
handle this small TCP segment case. I'm attaching a patch which should
fix the problem without breaking the workaround for other NICs. If you
verify that this patch also fixes your problem, then this patch should
be checked in instead of the other one.
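
As a stand-alone illustration of the decision the patch changes, the
old and new conditions can be modelled roughly like this. The flag
values and the 60 byte minimum are stand-ins chosen for the example
(the real definitions live in the kernel headers), and the function
names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in values for illustration only, not the kernel's. */
#define SK_CSUM_IP	0x0001
#define SK_CSUM_TCP	0x0002
#define SK_CSUM_UDP	0x0004
#define SK_CSUM_ALL	(SK_CSUM_IP | SK_CSUM_TCP | SK_CSUM_UDP)
#define SK_MIN_FRAMELEN	60

/* Old test: pad any short frame that requests checksum offload. */
static bool
needs_sw_pad_old(int csum_flags, size_t pkt_len)
{
	assert((csum_flags & ~SK_CSUM_ALL) == 0);

	return (csum_flags != 0 && pkt_len < SK_MIN_FRAMELEN);
}

/* New test: same as before, except TCP frames are left alone. */
static bool
needs_sw_pad_new(int csum_flags, size_t pkt_len)
{
	assert((csum_flags & ~SK_CSUM_ALL) == 0);

	return (csum_flags != 0 && !(csum_flags & SK_CSUM_TCP) &&
	    pkt_len < SK_MIN_FRAMELEN);
}
```

In this model a 54 byte TCP ACK with IP and TCP offload requested is
padded by the old test but not by the new one, while a short IP
fragment that only requests IP checksum offload is still padded by
both.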

-Bill

--
=============================================================================
-Bill Paul            (510) 749-2329 | Senior Engineer, Master of Unix-Fu
                 wpaul@windriver.com | Wind River Systems
=============================================================================
              <adamw> you're just BEGGING to face the moose
=============================================================================

--ELM829511406-21919-0_
Content-Type: text/plain; charset=US-ASCII
Content-Disposition: attachment; filename=re.patch
Content-Description: re.patch
Content-Transfer-Encoding: 7bit

--- if_re.c.orig	Wed Jan 24 10:38:14 2007
+++ if_re.c	Wed Jan 24 10:40:06 2007
@@ -2075,9 +2075,13 @@
 	 * the mbuf chain has too many fragments so the coalescing code
 	 * below can assemble the packet into a single buffer that's
 	 * padded out to the mininum frame size.
+	 *
+	 * Note: this appears unnecessary for TCP, and doing it for TCP
+	 * with PCIe adapters seems to result in bad checksums.
 	 */
 
-	if (arg.rl_flags && (*m_head)->m_pkthdr.len < RL_MIN_FRAMELEN)
+	if (arg.rl_flags && !(arg.rl_flags & CSUM_TCP) &&
+            (*m_head)->m_pkthdr.len < RL_MIN_FRAMELEN)
 		error = EFBIG;
 	else
 		error = bus_dmamap_load_mbuf(sc->rl_ldata.rl_mtag, map,

--ELM829511406-21919-0_--