Date:      Tue, 17 Aug 2010 12:35:31 -0700
From:      Pyun YongHyeon <pyunyh@gmail.com>
To:        Jack Vogel <jfvogel@gmail.com>
Cc:        freebsd-stable@freebsd.org
Subject:   Re: RELENG_7 em problems (and RELENG_8)
Message-ID:  <20100817193531.GC6482@michelle.cdnetworks.com>
In-Reply-To: <AANLkTiniEm=jV2HwJebaGHjQSCb5WrtMOJGQMdO%2BZW20@mail.gmail.com>
References:  <201006102031.o5AKVCH2016467@lava.sentex.ca> <201007021739.o62HdMOU092319@lava.sentex.ca> <20100702193654.GD10862@michelle.cdnetworks.com> <201008162107.o7GL76pA080191@lava.sentex.ca> <20100817185208.GA6482@michelle.cdnetworks.com> <AANLkTiniEm=jV2HwJebaGHjQSCb5WrtMOJGQMdO%2BZW20@mail.gmail.com>

On Tue, Aug 17, 2010 at 12:05:56PM -0700, Jack Vogel wrote:
> Hmmm, interesting, I'll have to have some testing done, maybe for the 574 it
> should automagically disable CSUM?
> 

I don't have an 82574 controller to test with, but it may depend on
how pipelined Tx data DMA works. If the 82574 can still pipeline Tx
data DMA after a new context is written, it would be better to keep
checksum offloading enabled. If em(4) uses a single Tx queue, we can
safely enable checksum offloading, I guess.
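
To make the single-queue versus multi-queue point concrete, here is a
rough sketch of the decision. This is a simplified model and not code
from em(4): the names (csum_ctx, tx_model, need_new_ctx,
count_ctx_writes) and the fields are hypothetical, and it assumes that
a cached context can be trusted only when a single Tx queue is in use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model, not em(4) code: what a checksum context covers. */
struct csum_ctx {
    uint8_t ip_hlen;    /* IP header length in bytes */
    uint8_t l4_proto;   /* 6 = TCP, 17 = UDP */
    bool    valid;      /* false until a context has been written */
};

struct tx_model {
    int             num_queues;   /* Tx queues in use */
    struct csum_ctx last;         /* last context written by this queue */
    unsigned        ctx_written;  /* context descriptors emitted so far */
};

/*
 * Returns true when a new context descriptor has to be written before
 * the data descriptors of this frame. With more than one queue the
 * cached context is never trusted, because another queue may have
 * written a different context in the meantime.
 */
bool need_new_ctx(struct tx_model *tx, uint8_t ip_hlen, uint8_t l4_proto)
{
    bool reuse = tx->num_queues == 1 && tx->last.valid &&
        tx->last.ip_hlen == ip_hlen && tx->last.l4_proto == l4_proto;

    if (reuse)
        return false;

    tx->last = (struct csum_ctx){ ip_hlen, l4_proto, true };
    tx->ctx_written++;
    return true;
}

/* Send `frames` identical TCP/IPv4 frames; return contexts written. */
unsigned count_ctx_writes(int num_queues, int frames)
{
    struct tx_model tx = { .num_queues = num_queues };

    assert(frames >= 0);
    for (int i = 0; i < frames; i++)
        (void)need_new_ctx(&tx, 20, 6);
    return tx.ctx_written;
}
```

In this model, 100 identical frames on one queue cost a single context
write, while the same 100 frames with two queues cost 100 context
writes, one per frame.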

> Jack
> 
> 
> On Tue, Aug 17, 2010 at 11:52 AM, Pyun YongHyeon <pyunyh@gmail.com> wrote:
> 
> > On Mon, Aug 16, 2010 at 05:07:11PM -0400, Mike Tancsa wrote:
> > > Hi Jack,
> > >         FYI, I am still seeing this same problem on RELENG_8 (code
> > > as of today).  Unfortunately I can't try Pyun's patch since the
> > > underlying code has changed since then.
> > >
> > > em4@pci0:3:0:0: class=0x020000 card=0x34ec8086 chip=0x10d38086
> > > rev=0x00 hdr=0x00
> > >     vendor     = 'Intel Corporation'
> > >     device     = 'Intel 82574L Gigabit Ethernet Controller (82574L)'
> > >     class      = network
> > >     subclass   = ethernet
> > >     cap 01[c8] = powerspec 2  supports D0 D3  current D0
> > >     cap 05[d0] = MSI supports 1 message, 64 bit enabled with 1 message
> > >     cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link x1(x1)
> > >     cap 11[a0] = MSI-X supports 5 messages in map 0x1c
> > >
> > > pci3: <ACPI PCI bus> on pcib3
> > > em4: <Intel(R) PRO/1000 Network Connection 7.0.5> port 0x1000-0x101f
> > > mem 0xb1900000-0xb191ffff,0xb1920000-0xb1923fff irq 16 at device 0.0 on pci3
> > > em4: Using MSI interrupt
> > > em4: [FILTER]
> > > em4: Ethernet address: 00:15:17:ed:3e:c4
> > >
> >
> > Here is updated patch for HEAD and stable/8.
> > http://people.freebsd.org/~yongari/em.csum_tso.20100817.patch
> >
> > It seems to work as expected in my limited test environment. If
> > you're using multiple Tx queues with em(4), it would be better to
> > disable Tx checksum offloading, as the driver always has to create
> > a new checksum context for each frame. This effectively disables
> > pipelined Tx data DMA, which in turn greatly slows down Tx
> > performance for small frames. The reason the driver has to create
> > a new checksum context when it uses multiple Tx queues is a
> > hardware limitation: the controller tracks only the last context
> > descriptor that was written, so the driver does not know the state
> > of the checksum context configured in another Tx queue.
> > Hope this helps.
> >
> > >
> > >
> > >         ---Mike
> > >
> > >
> > > At 03:36 PM 7/2/2010, Pyun YongHyeon wrote:
> > > >On Fri, Jul 02, 2010 at 01:39:22PM -0400, Mike Tancsa wrote:
> > > >> Hi Jack,
> > > >>         Just a followup to the email below. I now saw what appears
> > > >> to be the same problem on RELENG_8, but on a different nic and with
> > > >> VLANs.  So not sure if this is a general em problem, a problem
> > > >> specific to some em NICs, or a TSO problem in general.  The issue
> > > >> seemed to be triggered when I added a new vlan based on
> > > >>
> > > >> em3@pci0:14:0:0:        class=0x020000 card=0x109a15d9
> > > >> chip=0x109a8086 rev=0x00 hdr=0x00
> > > >>     vendor     = 'Intel Corporation'
> > > >>     device     = 'Intel PRO/1000 PL Network Adaptor (82573L)'
> > > >>     class      = network
> > > >>     subclass   = ethernet
> > > >>     cap 01[c8] = powerspec 2  supports D0 D3  current D0
> > > >>     cap 05[d0] = MSI supports 1 message, 64 bit enabled with 1 message
> > > >>     cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link x1(x1)
> > > >>
> > > >> pci14: <ACPI PCI bus> on pcib5
> > > >> em3: <Intel(R) PRO/1000 Network Connection 7.0.5> port 0x6000-0x601f
> > > >> mem 0xe8300000-0xe831ffff irq 17 at device 0.0 on pci14
> > > >> em3: Using MSI interrupt
> > > >> em3: [FILTER]
> > > >> em3: Ethernet address: 00:30:48:9f:eb:81
> > > >>
> > > >> em3: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST>
> > > >> metric 0 mtu 1500
> > > >>         options=2098<VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,WOL_MAGIC>
> > > >>         ether 00:30:48:9f:eb:81
> > > >>         inet 10.255.255.254 netmask 0xfffffffc broadcast 10.255.255.255
> > > >>         media: Ethernet autoselect (1000baseT <full-duplex>)
> > > >>         status: active
> > > >>
> > > >> I had to disable tso, rxcsum and txcsum in order to see the devices on
> > > >> the other side of the two vlans trunked off em3.  Unfortunately, the
> > > >> other sides were switches 100km and 500km away so I didn't have any
> > > >> tcpdump capabilities to diagnose the issue.  I had already created
> > > >> one vlan off this NIC and all was fine.  A few weeks later, I added a
> > > >> new one and I could no longer telnet into the remote switches from
> > > >> the local machine.... But, I could telnet into the switches from
> > > >> machines not on the problem box. Hence, it would appear to be a
> > > >> general TSO issue, no? I disabled tso on the nic (I didn't disable
> > > >> net.inet.tcp.tso as I forgot about that). Still nothing. I could
> > > >> always ping the remote devices, but no tcp services.  I then
> > > >> remembered this issue from before, so I tried disabling tso on the
> > > >> NIC. Still nothing. Then I disabled rxcsum and txcsum and I could
> > > >> then telnet into the remote devices.
> > > >>
> > > >> This newly observed issue was from a buildworld on Mon Jun 14
> > > >> 11:29:12 EDT 2010.
> > > >>
> > > >> I will try and recreate the issue locally again to see if I can
> > > >> trigger the problem on demand.  Any thoughts on what it might be ?
> > > >> Perhaps an issue specific to certain em nics ?
> > > >>
> > > >
> > > >http://www.freebsd.org/cgi/query-pr.cgi?pr=141843
> > > >I'm not sure whether you're seeing the same issue though.
> > > >I didn't have chance to try latest em(4) on stable/7.
> > >
> > > --------------------------------------------------------------------
> > > Mike Tancsa,                                      tel +1 519 651 3400
> > > Sentex Communications,                            mike@sentex.net
> > > Providing Internet since 1994                    www.sentex.net
> > > Cambridge, Ontario Canada                         www.sentex.net/mike
> > >
> >


