Date: Sun, 14 Sep 2014 16:23:02 -0400
From: Mike Tancsa
Organization: Sentex Communications
To: Eric Joyner
Cc: Glen Barber, Rick Macklem, freebsd-stable, Jack Vogel
Subject: Re: svn commit: r267935 - head/sys/dev/e1000 (with work around?)
Message-ID: <5415F926.80902@sentex.net>
References: <1737288805.35881978.1410642408202.JavaMail.root@uoguelph.ca> <5414DEAA.1060009@sentex.net>

On 9/14/2014 4:08 PM, Eric Joyner wrote:
> I'll try to, but I can't promise anything soon -- there's a lot of
> 10gig/40gig stuff to do.

Thanks Eric.
At first, I thought it was just a certain variant of the em, but I have
found at least two that get wedged. It's pretty easy to reproduce. One
other thing I noticed is that the README states, "TSO is not supported on
82547 and 82544-based adapters, as well as older adapters." Yet TSO is
enabled by default in the driver. Perhaps the driver could check for this
and automatically disable TSO on NICs that do not support it?

The other NIC I can recreate the problem with is

root@backup3:/usr/home/mdtancsa # pciconf -lvcb em0
em0@pci0:0:25:0: class=0x020000 card=0x34ec8086 chip=0x10ef8086 rev=0x05 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = '82578DM Gigabit Network Connection'
    class      = network
    subclass   = ethernet
    bar   [10] = type Memory, range 32, base 0xb1a00000, size 131072, enabled
    bar   [14] = type Memory, range 32, base 0xb1a25000, size 4096, enabled
    bar   [18] = type I/O Port, range 32, base 0x2040, size 32, enabled
    cap 01[c8] = powerspec 2  supports D0 D3  current D0
    cap 05[d0] = MSI supports 1 message, 64 bit enabled with 1 message
    cap 13[e0] = PCI Advanced Features: FLR TP
root@backup3:/usr/home/mdtancsa #

The odd thing, however, is that everything works fine with the previous
rev that was in the tree.

	---Mike

> ---
> - Eric Joyner
>
> On Sat, Sep 13, 2014 at 5:17 PM, Mike Tancsa wrote:
>> Hi Eric,
>> Any chance you can look at this em driver bug in Jack's absence?
>>
>> ---Mike
>>
>> On 9/13/2014 5:06 PM, Rick Macklem wrote:
>>> Mike Tancsa wrote:
>>>> On 9/12/2014 7:33 PM, Rick Macklem wrote:
>>>>> I wrote:
>>>>>> The patches are in 10.1. I thought his report said 10.0 in the
>>>>>> message.
>>>>>>
>>>>>> If Mike is running a recent stable/10 or releng/10.1, then it has
>>>>>> been patched for this and NFS should work with TSO enabled. If it
>>>>>> doesn't, then something else is broken.
>>>>>
>>>>> Oops, I looked and I see Mike was testing r270560 (which would have
>>>>> both the patches). I don't have an explanation why TSO and 64K
>>>>> rsize, wsize would cause a hang, but it does appear the hang will
>>>>> exist in 10.1 unless it gets resolved.
>>>>>
>>>>> Mike, one difference is that, even with the patches, the driver
>>>>> will be copying the transmit mbuf list via m_defrag() to 32
>>>>> MCLBYTES clusters when using 64K rsize, wsize.
>>>>> If you can reproduce the hang, you might want to look at how many
>>>>> mbuf clusters are allocated. If you've hit the limit, then I think
>>>>> that would explain it.
>>>>
>>>> I have been running the test for a few hrs now with no lockups of
>>>> the nic, so doing the nfs mount with -orsize=32768,wsize=32768
>>>> certainly seems to work around the lockup. How do I check the mbuf
>>>> clusters?
>>>
>>> Btw, in the past, when reducing the rsize,wsize has fixed a problem
>>> that isn't fixed by disabling TSO, it has been a problem w.r.t.
>>> receiving a burst of ethernet packets.
>>> I believe this may be a problem with either the receive ring size or
>>> interrupt latency (testers have reported cases where changing the way
>>> the device driver uses interrupts has fixed the problem so that it
>>> worked with 64K rsize, wsize).
>>>
>>> I have no familiarity with this hardware/driver, so I can't suggest
>>> anything specific to try except maybe how interrupts are handled,
>>> if the driver has a sysctl for that.
>>>
>>> rick

--
-------------------
Mike Tancsa, tel +1 519 651 3400
Sentex Communications, mike@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada   http://www.tancsa.com/
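
On the "32 MCLBYTES clusters" figure in the quoted discussion: the arithmetic can be sketched as below. This is only a rough illustration, assuming the stock FreeBSD cluster size of 2048 bytes (MCLBYTES) and counting NFS payload only, with no allowance for RPC or TCP/IP headers. The function name clusters_needed is made up for this example and is not part of the driver or the kernel. As for checking usage on a live system, "netstat -m" reports mbuf clusters in use and "sysctl kern.ipc.nmbclusters" shows the configured limit.

```python
# Rough arithmetic only: how many mbuf clusters one NFS read/write of a
# given size occupies once the chain is copied into MCLBYTES clusters.
# Assumption: MCLBYTES is the stock FreeBSD value of 2048 bytes.
MCLBYTES = 2048


def clusters_needed(io_size, cluster_size=MCLBYTES):
    """Hypothetical helper: ceiling of io_size / cluster_size."""
    return -(-io_size // cluster_size)


if __name__ == "__main__":
    # Compare the 64K default against the 32K work-around mount option.
    for size in (65536, 32768):
        print(size, "bytes ->", clusters_needed(size), "clusters")
```

With these assumptions, a 64K rsize/wsize comes to 32 clusters per request and the 32K work-around halves that to 16, which is consistent with the figure Rick quotes.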