From owner-freebsd-stable@FreeBSD.ORG Fri Sep 24 22:36:11 2010
Date: Fri, 24 Sep 2010 15:36:07 -0700
From: Jack Vogel
To: Mike Tancsa
Cc: pyunyh@gmail.com, freebsd-stable@freebsd.org
Subject: Re: RELENG_7 em problems (and RELENG_8)

There is a new revision of the em driver coming next week. It's going
through some stress pounding over the weekend; if no issues show up I'll
put it into HEAD.

Yongari's changes in TX context handling, which affect checksum and TSO,
are added. I've also decided that multiple queues on the 82574 are just a
source of problems without a lot of benefit, so it still uses MSI-X but
with only 3 vectors, meaning it separates TX and RX but has a single queue.

It's looking very stable; I hope it fixes everyone's issues.

Jack

On Tue, Sep 14, 2010 at 10:59 AM, Mike Tancsa wrote:

> Hi Jack,
>         Any plans to commit the patch below ?  I have been running it on a
> number of boxes and it works as expected with no side effects.
>
>         ---Mike
>
>
>
> At 04:00 PM 8/17/2010, Pyun YongHyeon wrote:
>
>> On Tue, Aug 17, 2010 at 03:55:12PM -0400, Mike Tancsa wrote:
>> > At 02:52 PM 8/17/2010, Pyun YongHyeon wrote:
>> >
>> > >Here is updated patch for HEAD and stable/8.
>> > >http://people.freebsd.org/~yongari/em.csum_tso.20100817.patch
>> > >
>> > >It seems to work as expected under my limited environments. If
>> >
>> > Thanks! The patch applies cleanly and all works as expected now! I am
>> > no longer able to trigger the bug. I just use the stock unmodified
>> > driver normally, so no multi queues
>> >
>>
>> Glad to hear that. Thanks for testing!
>>
>> > # vmstat -i
>> > interrupt                          total       rate
>> > irq256: em0                          149          0
>> > irq257: em1                            3          0
>> > irq259: em3                          971          2
>> > irq260: ahci0                       1520          3
>> >
>> >
>> >
>> > em3: flags=8843 metric 0 mtu 1500
>> >         options=219b
>> >         ether 00:15:17:xx:xx:xx
>> >         inet6 fe80::215:17ff:fexx:xxxx%em3 prefixlen 64 scopeid 0x4
>> >         inet 192.168.xx.xx netmask 0xffffff00 broadcast 192.168.xx.xx
>> >         nd6 options=3
>> >         media: Ethernet autoselect (100baseTX )
>> >         status: active
>> >
>> >
>> > em3@pci0:3:0:0: class=0x020000 card=0x34ec8086 chip=0x10d38086
>> > rev=0x00 hdr=0x00
>> >     vendor     = 'Intel Corporation'
>> >     device     = 'Intel 82574L Gigabit Ethernet Controller (82574L)'
>> >     class      = network
>> >     subclass   = ethernet
>> >     cap 01[c8] = powerspec 2  supports D0 D3  current D0
>> >     cap 05[d0] = MSI supports 1 message, 64 bit enabled with 1 message
>> >     cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link x1(x1)
>> >     cap 11[a0] = MSI-X supports 5 messages in map 0x1c
>> >
>> >
>> >
>> > patch < em.csum_tso.20100817.patch
>> > Hmm...  Looks like a unified diff to me...
>> > The text leading up to this was:
>> > --------------------------
>> > |Index: sys/dev/e1000/if_em.c
>> > |===================================================================
>> > |--- sys/dev/e1000/if_em.c      (revision 211398)
>> > |+++ sys/dev/e1000/if_em.c      (working copy)
>> > --------------------------
>> > Patching file sys/dev/e1000/if_em.c using Plan A...
>> > Hunk #1 succeeded at 237.
>> > Hunk #2 succeeded at 1730.
>> > Hunk #3 succeeded at 1759.
>> > Hunk #4 succeeded at 1930.
>> > Hunk #5 succeeded at 3148.
>> > Hunk #6 succeeded at 3351.
>> > Hunk #7 succeeded at 3533.
>> > Hunk #8 succeeded at 3590.
>> > Hunk #9 succeeded at 3603.
>> > Hmm...  The next patch looks like a unified diff to me...
>> > The text leading up to this was:
>> > --------------------------
>> > |Index: sys/dev/e1000/if_em.h
>> > |===================================================================
>> > |--- sys/dev/e1000/if_em.h      (revision 211398)
>> > |+++ sys/dev/e1000/if_em.h      (working copy)
>> > --------------------------
>> > Patching file sys/dev/e1000/if_em.h using Plan A...
>> > Hunk #1 succeeded at 284.
>> > done
>> >
>> >         ---Mike
>> >
>> >
>> > >you're using multiple Tx queues with em(4) it would be better to
>> > >disable Tx checksum offloading as driver always have to create a
>> > >new checksum context for each frame. This will effectively disable
>> > >pipelined Tx data DMA which in turn greatly slows down Tx
>> > >performance for small sized frames. The reason driver have to
>> > >create a new checksum context when it uses multiple Tx queues comes
>> > >from hardware limitation. The controller tracks only for the last
>> > >context descriptor that was written such that driver does not know
>> > >the state of checksum context configured in other Tx queue.
>> > >Hope this helps.
>> > >
>> > >>
>> > >>
>> > >>         ---Mike
>> > >>
>> > >>
>> > >> At 03:36 PM 7/2/2010, Pyun YongHyeon wrote:
>> > >> >On Fri, Jul 02, 2010 at 01:39:22PM -0400, Mike Tancsa wrote:
>> > >> >> Hi Jack,
>> > >> >>         Just a followup to the email below.  I now saw what appears
>> > >> >> to be the same problem on RELENG_8, but on a different nic and with
>> > >> >> VLANs.  So not sure if this is a general em problem, a problem
>> > >> >> specific to some em NICs, or a TSO problem in general.
>> > >> >> The issue seemed to be triggered when I added a new vlan based on
>> > >> >>
>> > >> >> em3@pci0:14:0:0:        class=0x020000 card=0x109a15d9
>> > >> >> chip=0x109a8086 rev=0x00 hdr=0x00
>> > >> >>     vendor     = 'Intel Corporation'
>> > >> >>     device     = 'Intel PRO/1000 PL Network Adaptor (82573L)'
>> > >> >>     class      = network
>> > >> >>     subclass   = ethernet
>> > >> >>     cap 01[c8] = powerspec 2  supports D0 D3  current D0
>> > >> >>     cap 05[d0] = MSI supports 1 message, 64 bit enabled with 1 message
>> > >> >>     cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link x1(x1)
>> > >> >>
>> > >> >> pci14: on pcib5
>> > >> >> em3: port 0x6000-0x601f
>> > >> >> mem 0xe8300000-0xe831ffff irq 17 at device 0.0 on pci14
>> > >> >> em3: Using MSI interrupt
>> > >> >> em3: [FILTER]
>> > >> >> em3: Ethernet address: 00:30:48:9f:eb:81
>> > >> >>
>> > >> >> em3: flags=8943
>> > >> >> metric 0 mtu 1500
>> > >> >>         options=2098
>> > >> >>         ether 00:30:48:9f:eb:81
>> > >> >>         inet 10.255.255.254 netmask 0xfffffffc broadcast 10.255.255.255
>> > >> >>         media: Ethernet autoselect (1000baseT )
>> > >> >>         status: active
>> > >> >>
>> > >> >> I had to disable tso, rxcsum and txcsum in order to see the devices on
>> > >> >> the other side of the two vlans trunked off em3.  Unfortunately, the
>> > >> >> other sides were switches 100km and 500km away so I didn't have any
>> > >> >> tcpdump capabilities to diagnose the issue.  I had already created
>> > >> >> one vlan off this NIC and all was fine.  A few weeks later, I added a
>> > >> >> new one and I could no longer telnet into the remote switches from
>> > >> >> the local machine.... But, I could telnet into the switches from
>> > >> >> machines not on the problem box.  Hence, it would appear to be a
>> > >> >> general TSO issue no ?  I disabled tso on the nic (I didn't disable
>> > >> >> net.inet.tcp.tso as I forgot about that).. Still nothing.  I could
>> > >> >> always ping the remote devices, but no tcp services.
>> > >> >> I then remembered this issue from before, so I tried disabling tso on the
>> > >> >> NIC.  Still nothing.  Then I disabled rxcsum and txcsum and I could
>> > >> >> then telnet into the remote devices.
>> > >> >>
>> > >> >> This newly observed issue was from a buildworld on Mon Jun 14
>> > >> >> 11:29:12 EDT 2010.
>> > >> >>
>> > >> >> I will try and recreate the issue locally again to see if I can
>> > >> >> trigger the problem on demand.  Any thoughts on what it might be ?
>> > >> >> Perhaps an issue specific to certain em nics ?
>> > >> >>
>> > >> >
>> > >> >http://www.freebsd.org/cgi/query-pr.cgi?pr=141843
>> > >> >I'm not sure whether you're seeing the same issue though.
>> > >> >I didn't have chance to try latest em(4) on stable/7.
>> > >>
>> > >> --------------------------------------------------------------------
>> > >> Mike Tancsa,                                      tel +1 519 651 3400
>> > >> Sentex Communications,                            mike@sentex.net
>> > >> Providing Internet since 1994                     www.sentex.net
>> > >> Cambridge, Ontario Canada                         www.sentex.net/mike
>> > >>
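
A note on the quoted explanation that the controller "tracks only for the last context descriptor that was written": the effect on Tx checksum offload can be illustrated with a toy model. The sketch below is plain Python, not em(4) driver code, and every name in it (CsumContext, ToyController, ToyTxQueue, count_context_writes) is hypothetical. It assumes a driver that, with a single Tx queue, may skip the context descriptor when the offload parameters match the previous frame, and that with several queues must write a context for every frame because no queue can know what another queue last programmed.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class CsumContext:
    """Offload parameters a context descriptor might carry (illustrative fields)."""
    ip_hdr_len: int
    l4_proto: str  # "tcp" or "udp"


class ToyController:
    """Models hardware that remembers only the last context written, across all queues."""

    def __init__(self) -> None:
        self.last_context: Optional[CsumContext] = None
        self.context_writes = 0

    def write_context(self, ctx: CsumContext) -> None:
        self.last_context = ctx
        self.context_writes += 1


class ToyTxQueue:
    """Models one Tx queue of a driver that caches the context it last programmed."""

    def __init__(self, hw: ToyController, may_reuse: bool) -> None:
        self.hw = hw
        self.may_reuse = may_reuse
        self.cached: Optional[CsumContext] = None

    def send(self, ctx: CsumContext) -> None:
        # Reuse is only safe when this queue is the sole writer of contexts.
        if self.may_reuse and self.cached == ctx:
            return  # only a data descriptor would be queued
        self.hw.write_context(ctx)
        self.cached = ctx


def count_context_writes(num_queues: int, frames: List[CsumContext]) -> int:
    """Spread frames round-robin over the queues and count context descriptors."""
    hw = ToyController()
    may_reuse = num_queues == 1
    queues = [ToyTxQueue(hw, may_reuse) for _ in range(num_queues)]
    for i, ctx in enumerate(frames):
        queues[i % num_queues].send(ctx)
    return hw.context_writes


if __name__ == "__main__":
    frames = [CsumContext(ip_hdr_len=20, l4_proto="tcp")] * 8
    print("single queue:", count_context_writes(1, frames))
    print("two queues:  ", count_context_writes(2, frames))
```

In this model a stream of identical frames needs one context write on a single queue but one per frame on two queues, which is the per-frame overhead the quoted message attributes to multi-queue Tx with checksum offload enabled.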