From owner-svn-src-user@FreeBSD.ORG Mon Dec  3 09:38:02 2012
Date: Mon, 3 Dec 2012 09:38:00 +0000 (GMT)
From: Robert Watson
To: Maxim Sobolev
Cc: Alfred Perlstein, Andre Oppermann, src-committers@freebsd.org,
    svn-src-user@freebsd.org
Subject: Re: svn commit: r242910 - in user/andre/tcp_workqueue/sys: kern sys
In-Reply-To: <50BC6EF9.4040706@FreeBSD.org>
References: <201211120847.qAC8lEAM086331@svn.freebsd.org>
    <50A0D420.4030106@freebsd.org> <0039CD42-C909-41D0-B0A7-7DFBC5B8D839@mu.org>
    <50A1206B.1000200@freebsd.org> <3D373186-09E2-48BC-8451-E4439F99B29D@mu.org>
    <50BC4EF6.8040902@FreeBSD.org> <50BC61A1.9040604@freebsd.org>
    <50BC6EF9.4040706@FreeBSD.org>
User-Agent: Alpine 2.00 (BSF 1167 2008-08-23)

On Mon, 3 Dec 2012, Maxim Sobolev wrote:

>>> We are also in a quite mbuf-hungry environment; it's not 10GigE, but we
>>> are dealing with forwarding voice traffic, which consists predominantly
>>> of very small packets (20-40 bytes).  So we have a lot of small packets
>>> in flight, which uses a lot of mbufs.
>>>
>>> What happens, however, is that the network stack consistently locks up
>>> after we put more than 16-18MB/sec onto it, which corresponds to about
>>> 350-400 Kpps.
>>
>> Can you drop into kdb?  Do you have any backtrace to see where or how it
>> locks up?
>
> Unfortunately that's hardly an option in production, unless we can
> reproduce the issue on a test machine.  It is not locking up per se, but
> all network-related activity ceases.  We can still get in through the KVM
> console.

Could you share the results of vmstat -z and netstat -m for the box?

(FYI, if you do find yourself in DDB, "show uma" is essentially the same as
"vmstat -z".)
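If it's easier to grab everything in one go, something along these lines
should capture the interesting counters (a rough sketch -- exact zone names
vary a bit between releases, so adjust the grep pattern as needed):

  # mbuf and cluster zones as seen by UMA
  vmstat -z | egrep -i 'mbuf|cluster'
  # mbuf usage plus denied/delayed allocation counts
  netstat -m
  # configured limits
  sysctl kern.ipc.nmbclusters kern.maxusers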
Robert

>
>>> This is way lower than any nmbclusters/maxusers limits we have
>>> (1.5m/1500).
>>>
>>> With half of that critical load right now we see something along these
>>> lines:
>>>
>>> 66365/71953/138318/1597440 mbuf clusters in use (current/cache/total/max)
>>> 149617K/187910K/337528K bytes allocated to network (current/cache/total)
>>>
>>> The machine has 24GB of RAM.
>>>
>>> vm.kmem_map_free: 24886267904
>>> vm.kmem_map_size: 70615040
>>> vm.kmem_size_scale: 1
>>> vm.kmem_size_max: 329853485875
>>> vm.kmem_size_min: 0
>>> vm.kmem_size: 24956903424
>>>
>>> So my question is whether there are some other limits that can cause mbuf
>>> starvation if the number of allocated clusters grows to more than
>>> 200-250k.  I am also curious how this works in a dynamic system: since no
>>> memory is pre-allocated for mbufs, what happens if the network load
>>> increases gradually while the system is running?  Is it possible to get
>>> ENOMEM eventually, with all memory already taken by other pools?
>>
>> Yes, mbuf allocation is not guaranteed and can fail before the limit is
>> reached.  What may happen is that an RX DMA ring refill fails and the
>> driver wedges.  That would be a driver bug.
>>
>> Can you give more information on the NICs and drivers you use?
>
> All of them use various incarnations of Intel GigE chips, mostly igb(4),
> but we've seen the same behaviour with em(4) as well.
>
> Both 8.2 and 8.3 are affected.  We have not been able to confirm whether
> 9.1 has the same issue.
>
> igb1: port 0xec00-0xec1f mem
> 0xfbee0000-0xfbefffff,0xfbec0000-0xfbedffff,0xfbe9c000-0xfbe9ffff irq 40 at
> device 0.1 on pci10
> igb1: Using MSIX interrupts with 9 vectors
> igb1: Ethernet address: 00:30:48:cf:bb:1d
> igb1: [ITHREAD]
> igb1: Bound queue 0 to cpu 8
> igb1: [ITHREAD]
> igb1: Bound queue 1 to cpu 9
> igb1: [ITHREAD]
> igb1: Bound queue 2 to cpu 10
> igb1: [ITHREAD]
> igb1: Bound queue 3 to cpu 11
> igb1: [ITHREAD]
> igb1: Bound queue 4 to cpu 12
> igb1: [ITHREAD]
> igb1: Bound queue 5 to cpu 13
> igb1: [ITHREAD]
> igb1: Bound queue 6 to cpu 14
> igb1: [ITHREAD]
> igb1: Bound queue 7 to cpu 15
> igb1: [ITHREAD]
>
> igb1@pci0:10:0:1:  class=0x020000 card=0x10c915d9 chip=0x10c98086
> rev=0x01 hdr=0x00
>     vendor   = 'Intel Corporation'
>     class    = network
>     subclass = ethernet
>
> -Maxim
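(To make the failure mode Andre describes above a bit more concrete: a
"wedging" driver is roughly the sketch below.  This is only an illustration
with hypothetical names -- rxr_refill(), rxr_post_buffer(),
rxr_schedule_refill() and struct rx_ring are made up for the example and are
not the actual igb(4) code.)

/*
 * Sketch of a defensive RX ring refill path.  m_getcl() called with
 * M_NOWAIT can return NULL well before kern.ipc.nmbclusters is reached;
 * a driver that simply gives up at that point leaves the ring
 * unreplenished -- i.e. it "wedges".
 */
#include <sys/param.h>
#include <sys/mbuf.h>

struct rx_ring;                                 /* driver-private ring state */
void rxr_post_buffer(struct rx_ring *, int, struct mbuf *);
void rxr_schedule_refill(struct rx_ring *);     /* e.g. via a callout */

static int
rxr_refill(struct rx_ring *rxr, int first, int count)
{
        struct mbuf *m;
        int i, filled = 0;

        for (i = 0; i < count; i++) {
                /* Non-blocking allocation: may fail under memory pressure. */
                m = m_getcl(M_NOWAIT, MT_DATA, M_PKTHDR);
                if (m == NULL)
                        break;          /* don't spin in the RX path */
                m->m_len = m->m_pkthdr.len = MCLBYTES;
                rxr_post_buffer(rxr, first + i, m);
                filled++;
        }

        /*
         * The important part: if the ring could not be filled completely,
         * arrange to retry later instead of leaving it starved forever.
         */
        if (filled < count)
                rxr_schedule_refill(rxr);
        return (filled);
}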