Date:      Wed, 3 Jul 2002 08:54:56 -0700 (PDT)
From:      John Polstra <jdp@polstra.com>
To:        net@freebsd.org
Cc:        lmckenna@lodgenet.com
Subject:   Re: bge problem under 4.6-stable
Message-ID:  <200207031554.g63Fsuge013706@vashon.polstra.com>
In-Reply-To: <3EA88113DE92D211807300805FA7994209149EE8@chaplin.lodgenet.com>
References:  <3EA88113DE92D211807300805FA7994209149EE8@chaplin.lodgenet.com>

In article <3EA88113DE92D211807300805FA7994209149EE8@chaplin.lodgenet.com>,
McKenna, Lee <lmckenna@lodgenet.com> wrote:
> I have two machines with new 3Com 3C996B-T adapters using the bge0 driver
> and I am having a problem with nfs.  I can mount the server from the client,
> and I can cd into the mounted directory, but as soon as I do an 'ls'
> command, the client appears to hang.  Strange-looking packets occasionally
> show up in tcpdump with what appear to be huge, invalid port numbers, and
> the packets appear to be fragments.
> 
> I can ftp to/from the server just fine.
> 
> I tried changing ETHER_ALIGN to 0, but same results.  Tried media 100baseTX
> to force both cards to 100Mbps, same results.  I removed the gigabit switch
> and used a crossover cable between the 2 machines, same results.
> 
> I removed the bge0 cards and put in good ol' reliable fxp0 cards using same
> crossover cable and everything works fine.

Thanks for reporting this.  I had a moment to look into it last night,
and I could easily reproduce the problem.

Something is wrong with the hardware checksum offloading for
transmitted IP fragments.  If you do the NFS mount with read and write
sizes of 1024 so that no fragmentation occurs, the problem goes away.
I suspect that the default read/write sizes of 8K would work if you
enabled jumbo frames by setting the MTU to 9000 on each end.  (That's
assuming your switch supports jumbo frames.)
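
To make the fragmentation arithmetic concrete, here is a rough sketch
in C.  It assumes NFS over UDP, an IPv4 header with no options, and a
made-up figure of 128 bytes for the RPC/NFS header overhead (the real
overhead varies by request).  The function name and constants are
illustrative only and do not come from the kernel sources.

```c
#include <assert.h>

#define IPV4_HEADER_LEN      20u   /* IPv4 header without options */
#define UDP_HEADER_LEN        8u
#define ASSUMED_RPC_OVERHEAD 128u  /* assumption: placeholder for RPC/NFS headers */

/*
 * Number of IP packets needed to carry one UDP datagram with the given
 * payload length over a link with the given MTU.  A result of 1 means
 * the datagram is sent unfragmented.
 */
unsigned
ip_packet_count(unsigned udp_payload_len, unsigned mtu)
{
	/* Need room for the IP header plus at least one 8-byte unit. */
	assert(mtu >= IPV4_HEADER_LEN + 8u);

	unsigned ip_payload = udp_payload_len + UDP_HEADER_LEN;
	unsigned max_unfragmented = mtu - IPV4_HEADER_LEN;

	if (ip_payload <= max_unfragmented)
		return 1;

	/* Every fragment except the last carries a multiple of 8 bytes. */
	unsigned per_fragment = max_unfragmented & ~7u;

	return (ip_payload + per_fragment - 1) / per_fragment;
}
```

Under those assumptions, ip_packet_count(1024 + ASSUMED_RPC_OVERHEAD, 1500)
is 1, ip_packet_count(8192 + ASSUMED_RPC_OVERHEAD, 1500) is 6, and
ip_packet_count(8192 + ASSUMED_RPC_OVERHEAD, 9000) is 1 again.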

This quick hack to the bge driver (in -stable) also makes it work.  It
disables all support for HW checksums:

Index: if_bge.c
===================================================================
RCS file: /home/ncvs/src/sys/dev/bge/if_bge.c,v
retrieving revision 1.3.2.12
diff -u -r1.3.2.12 if_bge.c
--- if_bge.c	30 Jun 2002 17:46:35 -0000	1.3.2.12
+++ if_bge.c	3 Jul 2002 15:40:58 -0000
@@ -1689,9 +1689,11 @@
 	ifp->if_init = bge_init;
 	ifp->if_mtu = ETHERMTU;
 	ifp->if_snd.ifq_maxlen = BGE_TX_RING_CNT - 1;
+#if 0
 	ifp->if_hwassist = BGE_CSUM_FEATURES;
 	ifp->if_capabilities = IFCAP_HWCSUM;
 	ifp->if_capenable = ifp->if_capabilities;
+#endif
 
 	/* Save ASIC rev. */
 

You could probably get the same effect without modifying the driver by
adding "-txcsum" to the ifconfig arguments, but I didn't think to try
that last night.

This fix can be refined either to fix checksum offloading of fragments
or to disable the offloading for fragments without disabling it for
other packets.  I ran out of time last night, but I'll work on that
Real Soon Now.
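
For illustration only, the second option (keep offloading for whole
packets, skip it for fragments) comes down to a test on the IPv4
flags/fragment-offset field.  The sketch below is standalone C (C11),
not the actual driver change; the function names are hypothetical, and
the constants are the standard IPv4 bit values from RFC 791.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* IPv4 "flags + fragment offset" field, in host byte order (RFC 791). */
#define IPV4_MF      0x2000u  /* "more fragments" flag */
#define IPV4_OFFMASK 0x1fffu  /* fragment offset, in 8-byte units */

static_assert((IPV4_MF & IPV4_OFFMASK) == 0,
    "flag bit and offset bits must not overlap");

/*
 * True if the packet is any piece of a fragmented datagram: either more
 * fragments follow it, or it starts at a nonzero offset.
 */
bool
ipv4_is_fragment(uint16_t frag_field)
{
	return (frag_field & (IPV4_MF | IPV4_OFFMASK)) != 0;
}

/*
 * Hypothetical policy: request hardware transmit checksumming only when
 * it is enabled on the interface and the packet is not a fragment.
 */
bool
want_hw_tx_csum(bool hw_csum_enabled, uint16_t frag_field)
{
	return hw_csum_enabled && !ipv4_is_fragment(frag_field);
}
```

A packet with only the "don't fragment" bit set (0x4000) still
qualifies for offloading under this policy, while the first, middle,
and last fragments of a datagram do not.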

Note, the bug may not be in the driver itself.  I think the bge driver
is the only one that even attempts to do checksum offloading for
fragments.  So the bug could easily be elsewhere in the system.

John
-- 
  John Polstra
  John D. Polstra & Co., Inc.                        Seattle, Washington USA
  "Disappointment is a good sign of basic intelligence."  -- Chögyam Trungpa





