Date: Sat, 13 Sep 2014 17:06:48 -0400 (EDT)
From: Rick Macklem
To: Mike Tancsa
Cc: Glen Barber, freebsd-stable, Jack Vogel
Subject: Re: svn commit: r267935 - head/sys/dev/e1000 (with work around?)
Message-ID: <1737288805.35881978.1410642408202.JavaMail.root@uoguelph.ca>

Mike Tancsa wrote:
> On 9/12/2014 7:33 PM, Rick Macklem wrote:
> > I wrote:
> >> The patches are in 10.1.
> >> I thought his report said 10.0 in the message.
> >>
> >> If Mike is running a recent stable/10 or releng/10.1, then it has been
> >> patched for this and NFS should work with TSO enabled. If it doesn't,
> >> then something else is broken.
> > Oops, I looked and I see Mike was testing r270560 (which would have both
> > the patches). I don't have an explanation why TSO and 64K rsize, wsize
> > would cause a hang, but it does appear it will exist in 10.1 unless it
> > gets resolved.
> >
> > Mike, one difference is that, even with the patches, the driver will be
> > copying the transmit mbuf list via m_defrag() to 32 MCLBYTE clusters
> > when using 64K rsize, wsize.
> > If you can reproduce the hang, you might want to look at how many mbuf
> > clusters are allocated. If you've hit the limit, then I think that
> > would explain it.
>
> I have been running the test for a few hrs now and no lockups of the
> nic, so doing the nfs mount with -orsize=32768,wsize=32768 certainly
> seems to work around the lockup. How do I check the mbuf clusters ?

Btw, in the past, when reducing the rsize, wsize has fixed a problem that
isn't fixed by disabling TSO, it has been a problem w.r.t. receiving a
burst of ethernet packets. I believe this may be a problem with either
the receive ring size or interrupt latency (testers have reported cases
where changing the way the device driver uses interrupts has fixed the
problem so that it worked with 64K rsize, wsize).

I have no familiarity with this hardware/driver, so I can't suggest
anything specific to try except maybe how interrupts are handled, if the
driver has a sysctl for that.

rick
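
On the mbuf cluster question quoted above: on FreeBSD, netstat -m reports
mbuf cluster usage (current/cache/total/max), and sysctl kern.ipc.nmbclusters
shows the configured limit. The arithmetic behind the 32-cluster figure can
be sketched roughly as below. This is only an illustration under stated
assumptions: MCLBYTES is taken to be 2048 bytes (the usual value, not
confirmed for this system), only the NFS payload is counted (RPC and TCP/IP
header bytes are ignored), and the helper name clusters_needed is made up
for this example.

```python
# Rough sketch: how many MCLBYTES-sized clusters one NFS read/write payload
# occupies after being copied into clusters (as m_defrag() does).
# ASSUMPTION: MCLBYTES = 2048, the usual FreeBSD mbuf cluster size.
MCLBYTES = 2048


def clusters_needed(io_size, cluster_size=MCLBYTES):
    """Return the number of clusters needed to hold io_size bytes (rounded up)."""
    return (io_size + cluster_size - 1) // cluster_size


# Compare the default 64K rsize/wsize with the 32K workaround from the thread.
for size in (65536, 32768):
    print(f"rsize/wsize={size}: {clusters_needed(size)} clusters per request")
```

Under these assumptions a 64K request needs 32 clusters and a 32K request
needs 16, so halving rsize/wsize halves the cluster demand per request.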