From owner-freebsd-stable@FreeBSD.ORG Sat Sep 13 12:08:37 2014
Date: Sat, 13 Sep 2014 08:08:28 -0400 (EDT)
From: Rick Macklem
To: Mike Tancsa
Message-ID: <1472615261.35797393.1410610108085.JavaMail.root@uoguelph.ca>
In-Reply-To: <54139566.7050202@sentex.net>
Subject: Re: svn commit: r267935 - head/sys/dev/e1000 (with work around?)
Cc: Glen Barber, freebsd-stable, Jack Vogel
List-Id: Production branch of FreeBSD source code

Mike Tancsa wrote:
> On 9/12/2014 7:33 PM, Rick Macklem wrote:
> > I wrote:
> >> The patches are in 10.1. I thought his report said 10.0 in the
> >> message.
> >>
> >> If Mike is running a recent stable/10 or releng/10.1, then it has
> >> been patched for this and NFS should work with TSO enabled. If it
> >> doesn't, then something else is broken.
> > Oops, I looked and I see Mike was testing r270560 (which would have
> > both the patches). I don't have an explanation for why TSO with 64K
> > rsize, wsize would cause a hang, but it does appear the problem will
> > exist in 10.1 unless it gets resolved.
> >
> > Mike, one difference is that, even with the patches, the driver will
> > be copying the transmit mbuf list via m_defrag() to 32 MCLBYTE
> > clusters when using 64K rsize, wsize.
> > If you can reproduce the hang, you might want to look at how many
> > mbuf clusters are allocated. If you've hit the limit, then I think
> > that would explain it.
>
> I have been running the test for a few hrs now and no lockups of the
> nic, so doing the nfs mount with -orsize=32768,wsize=32768 certainly
> seems to work around the lockup. How do I check the mbuf clusters?
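[For readers following along, the workaround Mike describes can be spelled out as an explicit mount invocation. This is a sketch based on the thread; "server:/export" and "/mnt" are placeholder names, not paths from the thread.]

```shell
# Workaround sketch from the thread: cap the NFS transfer size at 32K so
# the em(4) TSO path never has to handle a 64K I/O.
# "server:/export" and "/mnt" are placeholders, not from the thread.
mount -t nfs -o rsize=32768,wsize=32768 server:/export /mnt

# Equivalent /etc/fstab entry:
# server:/export  /mnt  nfs  rw,rsize=32768,wsize=32768  0  0

# Alternatively, since the hang is only seen with TSO enabled, TSO can be
# turned off on the interface itself:
# ifconfig em1 -tso
```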
> root@backup3:/usr/home/mdtancsa # vmstat -z | grep -i clu
> mbuf_cluster:   2048, 760054,    4444,     370,  3088708,   0,   0
> root@backup3:/usr/home/mdtancsa #
> root@backup3:/usr/home/mdtancsa # netstat -m
> 3322/4028/7350 mbufs in use (current/cache/total)
> 2826/1988/4814/760054 mbuf clusters in use (current/cache/total/max)

This was all I was thinking of. It certainly doesn't look like a problem.
If the 64K rsize, wsize test is about the same, I'd say you aren't running
out of mbuf clusters.

> 2430/1618 mbuf+clusters out of packet secondary zone in use (current/cache)
> 0/4/4/380026 4k (page size) jumbo clusters in use (current/cache/total/max)
> 0/0/0/112600 9k jumbo clusters in use (current/cache/total/max)
> 0/0/0/63337 16k jumbo clusters in use (current/cache/total/max)
> 6482K/4999K/11481K bytes allocated to network (current/cache/total)
> 0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
> 0/0/0 requests for mbufs delayed (mbufs/clusters/mbuf+clusters)
> 0/0/0 requests for jumbo clusters delayed (4k/9k/16k)
> 0/0/0 requests for jumbo clusters denied (4k/9k/16k)
> 0 requests for sfbufs denied
> 0 requests for sfbufs delayed
> 0 requests for I/O initiated by sendfile
> root@backup3:/usr/home/mdtancsa #
>
> Interface is RUNNING and ACTIVE
> em1: hw tdh = 343, hw tdt = 838
> em1: hw rdh = 512, hw rdt = 511
> em1: Tx Queue Status = 1
> em1: TX descriptors avail = 516
> em1: Tx Descriptors avail failure = 1

I don't know anything about the hardware, but this looks suspicious to me?
Hopefully someone familiar with the hardware can help, rick

> em1: RX discarded packets = 0
> em1: RX Next to Check = 512
> em1: RX Next to Refresh = 511
>
> I just tested on the other em nic and I can wedge it as well, so it's
> not limited to one particular type of em nic.
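[For readers wondering what "hitting the limit" would look like in these numbers: the current count can be compared against the max from the "mbuf clusters in use" line. The script below is an illustrative helper, not something from the thread; the sample line is copied from Mike's output, and on a live system you would feed it `netstat -m | grep 'mbuf clusters'` instead.]

```shell
#!/bin/sh
# Illustrative helper (not from the thread): compare current mbuf cluster
# usage against the configured max, using the "clusters in use" line that
# netstat -m prints. The sample line below is Mike's actual output.
line="2826/1988/4814/760054 mbuf clusters in use (current/cache/total/max)"

# The leading token is slash-separated: current/cache/total/max.
current=$(printf '%s' "$line" | awk -F'[/ ]' '{print $1}')
max=$(printf '%s' "$line" | awk -F'[/ ]' '{print $4}')
pct=$((current * 100 / max))
echo "mbuf clusters: $current of $max in use (${pct}%)"
# With Mike's numbers this prints:
# mbuf clusters: 2826 of 760054 in use (0%)
```

At under half a percent of the limit, this matches Rick's reading that cluster exhaustion is not the cause here.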
> em0: Watchdog timeout -- resetting
> em0: Queue(0) tdh = 349, hw tdt = 176
> em0: TX(0) desc avail = 173, Next TX to Clean = 349
> em0: link state changed to DOWN
> em0: link state changed to UP
>
> so it does not seem limited to just certain em nics
>
> em0@pci0:0:25:0: class=0x020000 card=0x34ec8086 chip=0x10ef8086 rev=0x05 hdr=0x00
>     vendor     = 'Intel Corporation'
>     device     = '82578DM Gigabit Network Connection'
>     class      = network
>     subclass   = ethernet
>     bar   [10] = type Memory, range 32, base 0xb1a00000, size 131072, enabled
>     bar   [14] = type Memory, range 32, base 0xb1a25000, size 4096, enabled
>     bar   [18] = type I/O Port, range 32, base 0x2040, size 32, enabled
>     cap 01[c8] = powerspec 2  supports D0 D3  current D0
>     cap 05[d0] = MSI supports 1 message, 64 bit enabled with 1 message
>     cap 13[e0] = PCI Advanced Features: FLR TP
>
> I can lock things up fairly quickly by running these 2 scripts across
> an nfs mount.
>
> #!/bin/sh
>
> while true
> do
>         dd if=/dev/urandom ibs=64k count=1000 | pbzip2 -c -p3 > /mnt/test.bz2
>         dd if=/dev/urandom ibs=63k count=1000 | pbzip2 -c -p3 > /mnt/test.bz2
>         dd if=/dev/urandom ibs=66k count=1000 | pbzip2 -c -p3 > /mnt/test.bz2
> done
>
> root@backup3:/usr/home/mdtancsa # cat i3
> #!/bin/sh
>
> while true
> do
>         dd if=/dev/zero of=/mnt/test2 bs=128k count=2000
>         sleep 10
> done
>
> ---Mike
>
> --
> -------------------
> Mike Tancsa, tel +1 519 651 3400
> Sentex Communications, mike@sentex.net
> Providing Internet services since 1994 www.sentex.net
> Cambridge, Ontario Canada http://www.tancsa.com/
> _______________________________________________
> freebsd-stable@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
> To unsubscribe, send any mail to
> "freebsd-stable-unsubscribe@freebsd.org"