From owner-freebsd-stable@FreeBSD.ORG Fri Dec 30 03:51:59 2011
Message-ID: <4EFD353D.1060900@bit0.com>
Date: Thu, 29 Dec 2011 22:51:25 -0500
From: Mike Andrews
To: freebsd-stable@freebsd.org, pyunyh@gmail.com
References:
 <4ED154B6.2030304@bit0.com> <20111128013931.GC1830@michelle.cdnetworks.com>
 <4ED40D58.1030107@bit0.com> <20111128234212.GC1655@michelle.cdnetworks.com>
In-Reply-To: <20111128234212.GC1655@michelle.cdnetworks.com>
Subject: Re: 9.0-RC2 re(4) "no memory for jumbo buffers" issue
List-Id: Production branch of FreeBSD source code

On 11/28/2011 6:42 PM, YongHyeon PYUN wrote:
> On Mon, Nov 28, 2011 at 05:38:16PM -0500, Mike Andrews wrote:
>> On 11/27/11 8:39 PM, YongHyeon PYUN wrote:
>>> On Sat, Nov 26, 2011 at 04:05:58PM -0500, Mike Andrews wrote:
>>>> I have a Supermicro 5015A-H (Intel Atom 330) server with two Realtek
>>>> RTL8111C-GR gigabit NICs on it. As far as I can tell, these support
>>>> jumbo frames up to 7422 bytes. When running them at an MTU of 5000 on
>>> Actually the maximum size is 6KB for RTL8111C, not 7422.
>>> RTL8111C and newer PCIe based gigabit controllers no longer support
>>> scattering a jumbo frame into multiple RX buffers so a single RX
>>> buffer has to receive an entire jumbo frame. This adds more burden
>>> to system because it has to allocate a jumbo frame even when it
>>> receives a pure TCP ACK.
>> OK, that makes sense.
>>
>>>> FreeBSD 9.0-RC2, after a week or so of update, with fairly light network
>>>> activity, the interfaces die with "no memory for jumbo buffers" errors
>>>> on the console. Unloading and reloading the driver (via serial console)
>>>> doesn't help; only rebooting seems to clear it up.
>>>>
>>> The jumbo code path is the same as normal MTU sized one so I think
>>> possibility of leaking mbufs in driver is very low.
>>> And the message "no memory for jumbo RX buffers" can only happen either
>>> when you up the interface again or interface restart triggered by
>>> watchdog timeout handler. I don't think you're seeing watchdog
>>> timeouts though.
>> I'm fairly certain the interface isn't changing state when this happens
>> -- it just kinda spontaneously happens after a week or two, with no
>> interface up/down transitions. I don't see any watchdog messages when
>> this happens.
> There is another code path that causes controller reinitialization.
> If you change MTU or offloading configuration(TSO, VLAN tagging,
> checksum offloading etc) it will reinitialize the controller. So do
> you happen to trigger one of these code path during a week or two?
>
>>> When you see "no memory for jumbo RX buffers" message, did you
>>> check available mbuf pool?
>> Not yet, that's why I asked for debugging tips -- I'll do that the next
>> time this happens.
>>
>>>> What's the best way to go about debugging this... which sysctl's should
>>>> I be looking at first? I have already tried raising kern.ipc.nmbjumbo9
>>>> to 16384 and it doesn't seem to help things... maybe prolonging it
>>>> slightly, but not by much. The problem is it takes a week or so to
>>>> reproduce the problem each time...
>>>>
>>> I vaguely guess it could be related with other subsystem which
>>> leaks mbufs such that driver was not able to get more jumbo RX
>>> buffers from system. For instance, r228016 would be worth to try on
>>> your box. I can't clearly explain why em(4) does not suffer from
>>> the issue though.
>> I've just this morning built a kernel with that fix, so we'll see how
>> that goes.
> Ok.

OK, this just happened again with a 9.0-RC3 kernel rev r228247.

whitedog# ifconfig re0 down;ifconfig re0 up;ifconfig re1 down;ifconfig re1 up
re0: no memory for jumbo RX buffers
re1: no memory for jumbo RX buffers
whitedog# netstat -m
526/1829/2355 mbufs in use (current/cache/total)
0/1278/1278/25600 mbuf clusters in use (current/cache/total/max)
0/356 mbuf+clusters out of packet secondary zone in use (current/cache)
0/336/336/12800 4k (page size) jumbo clusters in use (current/cache/total/max)
512/385/897/6400 9k jumbo clusters in use (current/cache/total/max)
0/0/0/3200 16k jumbo clusters in use (current/cache/total/max)
4739K/7822K/12561K bytes allocated to network (current/cache/total)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/4560/0 requests for jumbo clusters denied (4k/9k/16k)
0/0/0 sfbufs in use (current/peak/max)
0 requests for sfbufs denied
0 requests for sfbufs delayed
0 requests for I/O initiated by sendfile
0 calls to protocol drain routines
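
Since the problem takes a week or more to reproduce, it may help to log the 9k jumbo counters periodically. What follows is only a minimal sketch, not something from the thread: the function name `parse_netstat_m` and the `SAMPLE` text are illustrative assumptions, and the regular expressions assume the `netstat -m` line layout shown in the output above.

```python
import re


def parse_netstat_m(text):
    """Extract the 9k jumbo cluster counters from `netstat -m` text.

    Hypothetical helper: it assumes the line layout seen in the output
    above and returns an empty dict if neither line is present.
    """
    stats = {}
    # e.g. "512/385/897/6400 9k jumbo clusters in use (current/cache/total/max)"
    m = re.search(r"(\d+)/(\d+)/(\d+)/(\d+) 9k jumbo clusters in use", text)
    if m:
        keys = ("current", "cache", "total", "max")
        stats["jumbo9"] = dict(zip(keys, map(int, m.groups())))
    # e.g. "0/4560/0 requests for jumbo clusters denied (4k/9k/16k)"
    m = re.search(r"(\d+)/(\d+)/(\d+) requests for jumbo clusters denied", text)
    if m:
        stats["denied"] = dict(zip(("4k", "9k", "16k"), map(int, m.groups())))
    return stats


# Two lines copied from the netstat -m output above.
SAMPLE = """\
512/385/897/6400 9k jumbo clusters in use (current/cache/total/max)
0/4560/0 requests for jumbo clusters denied (4k/9k/16k)
"""

stats = parse_netstat_m(SAMPLE)
j9 = stats["jumbo9"]
print(f"9k clusters: {j9['total']}/{j9['max']} allocated, "
      f"{stats['denied']['9k']} requests denied")
```

Run against the output above, this reports 897 of 6400 9k clusters allocated alongside 4560 denied requests, so the denials were recorded while the zone was well below its configured maximum.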