Date: Mon, 29 Jun 2009 14:23:30 +0900
From: Pyun YongHyeon <pyunyh@gmail.com>
To: current@freebsd.org
Subject: Re: ale(4): Problems with tso, rxcsum and/or txcsum
Message-ID: <20090629052330.GC1268@michelle.cdnetworks.co.kr>
In-Reply-To: <20090627171110.GP31709@acme.spoerlein.net>
References: <20090615121623.GA1479@roadrunner.spoerlein.net>
 <20090615125154.GG78415@michelle.cdnetworks.co.kr>
 <20090616093334.GB31709@acme.spoerlein.net>
 <20090616101740.GI78415@michelle.cdnetworks.co.kr>
 <20090627171110.GP31709@acme.spoerlein.net>

On Sat, Jun 27, 2009 at 07:11:11PM +0200, Ulrich Spörlein wrote:
> Sorry for the long delay, I only now got around to testing this more
> thoroughly.
>
> On Tue, 16.06.2009 at 19:17:40 +0900, Pyun YongHyeon wrote:
> > On Tue, Jun 16, 2009 at 11:33:34AM +0200, Ulrich Spörlein wrote:
> > > On Mon, 15.06.2009 at 21:51:54 +0900, Pyun YongHyeon wrote:
> > > > On Mon, Jun 15, 2009 at 02:16:23PM +0200, Ulrich Spörlein wrote:
> > > > > Hello Pyun,
> > > > >
> > > > > I have connection problems with the onboard GigE of an Asus P5Q
> > > > > board, using a recent 8-CURRENT
> > > > >
> > > > > ale0: <Atheros AR8121/AR8113/AR8114 PCIe Ethernet> port 0xdc00-0xdc7f mem 0xfe9c0000-0xfe9fffff irq 17 at device 0.0 on pci2
> > > > > ale0: 960 Tx FIFO, 1024 Rx FIFO
> > > > > ale0: Using 1 MSI messages.
> > > > > ale0: 4GB boundary crossed, switching to 32bit DMA addressing mode.
> > > > > miibus0: <MII bus> on ale0
> > > > > ale0: Ethernet address: 00:24:8c:36:3e:10
> > > > > ale0: [FILTER]
> > > > > ale0: link state changed to UP
> > > > >
> > > > > ale0@pci0:2:0:0: class=0x020000 card=0x82261043 chip=0x10261969 rev=0xb0 hdr=0x00
> > > > >     vendor = 'Attansic (Now owned by Atheros)'
> > > > >     device = 'PCI-E ETHERNET CONTROLLER (AR8121/AR8113 )'
> > > > >     class = network
> > > > >     subclass = ethernet
> > > > >
> > > > > ale0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
> > > > >     options=311b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,TSO4,WOL_MCAST,WOL_MAGIC>
> > > > >     ether 00:24:8c:36:3e:10
> > > > >     inet 192.168.0.146 netmask 0xffffff00 broadcast 192.168.0.255
> > > > >     media: Ethernet autoselect (100baseTX <full-duplex>)
> > > > >     status: active
> > > > >
> > > > > When transferring data to the machine at ~10MB/s (100Mbit network
> > > > > only) the ssh connection will die after a couple of minutes with
> > > > >
> > > > > Disconnecting: Bad packet length 1592360521.
> > > > >
> > > > > After disabling tso, txcsum and rxcsum the connection seems to be
> > > > > stable, though. I fail to figure out a pattern, though. Do I need to
> > > >
> > > > Hmm, I think this is the second report that could be related to
> > > > Rx checksum offloading. If disabling Rx checksum fixes the issue, I
> > > > have to disable it by default until I understand what's going on.
> > >
> > > I really need to disable tso, rxcsum *and* txcsum to make this card work
> > > stable. :/
> >
> > Hmm, let's see which offload was broken. Disabling all offloads
> > makes it hard to find the broken one.
>
> Ok, disabling -rxcsum will make the connection stable. But when I enable
> rxcsum again, it is also stable! It looks like it is not turned on
> again. To sum it up:
>
> 1. doing nothing: ssh connection drops after a couple of minutes
> 2. ifconfig ale0 -rxcsum: ssh runs stable for dozens of minutes
> 3. ifconfig ale0 rxcsum: ssh runs stable for dozens of minutes (wtf?)
>
> > > There is one other weirdness, though, regarding tso. I have been using a
> > > netcat-blast test, where I "upload" /dev/zero to another machine, and
> > > "download" it from the same machine.
>
> Scrap all my previous findings regarding this issue. I re-ran the test
> with three machines. So ale0 would download from machine A and upload to
> machine B. No matter how hard I try, I can always saturate the 100MBit
> Ethernet in full duplex. Don't know how the previous numbers came about.
>

Yeah, I still can't reproduce the issue you've mentioned, but I think
it's better to disable Rx checksum offload at this time. If I manage
to find the root cause of the issue, I will enable it again with proper
workarounds.

> Thanks for your patience, but it looks like the rxcsum is indeed fishy
> on this chip revision.
>

Committed to HEAD (r195153). You can still enable Rx checksum offload
with ifconfig(8), but it is disabled by default.

Thanks for reporting!
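
To check whether a given `ifconfig ale0 rxcsum` or `ifconfig ale0 -rxcsum` actually took effect (the open question in item 3 of the quoted list), one can look at the capability names on the `options=` line of the ifconfig output. The following is only a rough Python sketch, assuming the output has the same shape as the ifconfig listing quoted in this thread; the helper name `parse_ifconfig_options` and the `sample` text are made up for illustration and are not part of any FreeBSD tool.

```python
import re


def parse_ifconfig_options(ifconfig_output):
    """Return the set of capability names found on the options=...<...> line.

    Returns an empty set if no options= line is present.
    """
    match = re.search(r"options=[0-9a-fA-F]+<([^>]*)>", ifconfig_output)
    if match is None:
        return set()
    return {name for name in match.group(1).split(",") if name}


# Sample text copied from the ifconfig output quoted in this thread.
sample = (
    "ale0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500\n"
    "\toptions=311b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,TSO4,WOL_MCAST,WOL_MAGIC>\n"
)

caps = parse_ifconfig_options(sample)
for offload in ("RXCSUM", "TXCSUM", "TSO4"):
    state = "enabled" if offload in caps else "disabled"
    print(offload, state)
```

Running the same check on the output captured after each ifconfig change would show whether RXCSUM really reappears in the list once it has been re-enabled.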