Date:      Mon, 29 Jun 2009 14:23:30 +0900
From:      Pyun YongHyeon <pyunyh@gmail.com>
To:        current@freebsd.org
Subject:   Re: ale(4): Problems with tso, rxcsum and/or txcsum
Message-ID:  <20090629052330.GC1268@michelle.cdnetworks.co.kr>
In-Reply-To: <20090627171110.GP31709@acme.spoerlein.net>
References:  <20090615121623.GA1479@roadrunner.spoerlein.net> <20090615125154.GG78415@michelle.cdnetworks.co.kr> <20090616093334.GB31709@acme.spoerlein.net> <20090616101740.GI78415@michelle.cdnetworks.co.kr> <20090627171110.GP31709@acme.spoerlein.net>


On Sat, Jun 27, 2009 at 07:11:11PM +0200, Ulrich Spörlein wrote:
> Sorry for the long delay, I only now got around testing this more
> thoroughly.
> 
> On Tue, 16.06.2009 at 19:17:40 +0900, Pyun YongHyeon wrote:
> > On Tue, Jun 16, 2009 at 11:33:34AM +0200, Ulrich Spörlein wrote:
> > > On Mon, 15.06.2009 at 21:51:54 +0900, Pyun YongHyeon wrote:
> > > > On Mon, Jun 15, 2009 at 02:16:23PM +0200, Ulrich Spörlein wrote:
> > > > > Hello Pyun,
> > > > > 
> > > > > I have connection problems with the onboard GigE of an Asus P5Q board, using a recent 8-CURRENT:
> > > > > 
> > > > > ale0: <Atheros AR8121/AR8113/AR8114 PCIe Ethernet> port 0xdc00-0xdc7f mem 0xfe9c0000-0xfe9fffff irq 17 at device 0.0 on pci2
> > > > > ale0: 960 Tx FIFO, 1024 Rx FIFO
> > > > > ale0: Using 1 MSI messages.
> > > > > ale0: 4GB boundary crossed, switching to 32bit DMA addressing mode.
> > > > > miibus0: <MII bus> on ale0
> > > > > ale0: Ethernet address: 00:24:8c:36:3e:10
> > > > > ale0: [FILTER]
> > > > > ale0: link state changed to UP
> > > > > 
> > > > > ale0@pci0:2:0:0:        class=0x020000 card=0x82261043 chip=0x10261969 rev=0xb0 hdr=0x00
> > > > >     vendor     = 'Attansic (Now owned by Atheros)'
> > > > >     device     = 'PCI-E ETHERNET CONTROLLER  (AR8121/AR8113 )'
> > > > >     class      = network
> > > > >     subclass   = ethernet
> > > > > 
> > > > > ale0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
> > > > >         options=311b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,TSO4,WOL_MCAST,WOL_MAGIC>
> > > > >         ether 00:24:8c:36:3e:10
> > > > >         inet 192.168.0.146 netmask 0xffffff00 broadcast 192.168.0.255
> > > > >         media: Ethernet autoselect (100baseTX <full-duplex>)
> > > > >         status: active
> > > > > 
> > > > > When transferring data to the machine at ~10MB/s (100Mbit network only) the ssh
> > > > > connection will die after a couple of minutes with
> > > > > 
> > > > > Disconnecting: Bad packet length 1592360521.
> > > > > 
> > > > > After disabling tso, txcsum and rxcsum the connection seems to be
> > > > > stable, though. I fail to figure out a pattern, though. Do I need to
> > > > 
> > > > Hmm, I think this is the second report that could be related to
> > > > Rx checksum offloading. If disabling Rx checksum fixes the issue, I
> > > > have to disable it by default until I understand what's going on.
> > > 
> > > I really need to disable tso, rxcsum *and* txcsum to make this card work
> > > stable. :/
> > 
> > Hmm, let's see which offload is broken. Disabling all offloads
> > makes it hard to find the broken one.
> 
> Ok, disabling rxcsum (-rxcsum) will make the connection stable. But when I enable
> rxcsum again, it is also stable! It looks like it is not turned on
> again. To sum it up:
> 
> 1. doing nothing: ssh connection drops after a couple of minutes
> 2. ifconfig ale0 -rxcsum: ssh runs stable for dozens of minutes
> 3. ifconfig ale0 rxcsum: ssh runs stable for dozens of minutes (wtf?)
> 
> > > There is one other weirdness, though, regarding tso. I have been using a
> > > netcat-blast test, where I "upload" /dev/zero to another machine, and
> > > "download" it from the same machine.
> 
> Scrap all my previous findings regarding this issue. I re-ran the test
> with three machines. So ale0 would download from machine A and upload to
> machine B. No matter how hard I try, I can always saturate the 100MBit
> Ethernet in full duplex. Don't know how the previous numbers came about.
> 

Yeah, I still can't reproduce the issue you've mentioned, but I
think it's better to disable Rx checksum offload at this time. If
I manage to find the root cause of the issue, I will enable it again
with proper workarounds.

> Thanks for your patience, but it looks like the rxcsum is indeed fishy
> on this chip revision.
> 

Committed to HEAD (r195153). You can still enable Rx checksum
offload with ifconfig(8), but it is now disabled by default.
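
Since re-enabling rxcsum in your test looked like it might not have
taken effect, it may help to check the options= line of ifconfig(8)
after each toggle. Below is a rough sketch in Python that reads that
line and reports whether RXCSUM is set. The helper names
(parse_options, decode_mask, rx_checksum_enabled) are made up for
illustration, and the bit values are the IFCAP_* constants as I recall
them from <net/if.h>, so treat them as an assumption and check your
tree. Only the bits seen in this thread are listed.

```python
import re

# Assumed IFCAP_* bit values (recalled from FreeBSD's <net/if.h>);
# limited to the capabilities that show up in this thread.
IFCAP_BITS = {
    0x0001: "RXCSUM",
    0x0002: "TXCSUM",
    0x0008: "VLAN_MTU",
    0x0010: "VLAN_HWTAGGING",
    0x0100: "TSO4",
    0x1000: "WOL_MCAST",
    0x2000: "WOL_MAGIC",
}


def parse_options(ifconfig_output):
    """Return (mask, names) taken from the options= line of ifconfig output."""
    m = re.search(r"options=([0-9a-fA-F]+)<([^>]*)>", ifconfig_output)
    if m is None:
        raise ValueError("no options= line found")
    mask = int(m.group(1), 16)
    names = {name for name in m.group(2).split(",") if name}
    return mask, names


def decode_mask(mask):
    """Translate a capability mask into names, using only the bits listed above."""
    return {name for bit, name in IFCAP_BITS.items() if mask & bit}


def rx_checksum_enabled(ifconfig_output):
    """True if the options= line lists RXCSUM."""
    _, names = parse_options(ifconfig_output)
    return "RXCSUM" in names


if __name__ == "__main__":
    # Output copied from the original report in this thread.
    sample = (
        "ale0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500\n"
        "        options=311b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,TSO4,WOL_MCAST,WOL_MAGIC>\n"
    )
    mask, names = parse_options(sample)
    print(hex(mask), sorted(names))
    print("rxcsum enabled:", rx_checksum_enabled(sample))
```

Feeding it the output of "ifconfig ale0" before and after
"ifconfig ale0 rxcsum" should show whether the flag really came back.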

Thanks for reporting!


