Date: Thu, 29 Dec 2011 22:51:25 -0500
From: Mike Andrews <mandrews@bit0.com>
To: freebsd-stable@freebsd.org, pyunyh@gmail.com
Subject: Re: 9.0-RC2 re(4) "no memory for jumbo buffers" issue
Message-ID: <4EFD353D.1060900@bit0.com>
In-Reply-To: <20111128234212.GC1655@michelle.cdnetworks.com>
References: <4ED154B6.2030304@bit0.com> <20111128013931.GC1830@michelle.cdnetworks.com> <4ED40D58.1030107@bit0.com> <20111128234212.GC1655@michelle.cdnetworks.com>
On 11/28/2011 6:42 PM, YongHyeon PYUN wrote:
> On Mon, Nov 28, 2011 at 05:38:16PM -0500, Mike Andrews wrote:
>> On 11/27/11 8:39 PM, YongHyeon PYUN wrote:
>>> On Sat, Nov 26, 2011 at 04:05:58PM -0500, Mike Andrews wrote:
>>>> I have a Supermicro 5015A-H (Intel Atom 330) server with two Realtek
>>>> RTL8111C-GR gigabit NICs on it.  As far as I can tell, these support
>>>> jumbo frames up to 7422 bytes.  When running them at an MTU of 5000 on
>>> Actually the maximum size is 6KB for RTL8111C, not 7422.
>>> RTL8111C and newer PCIe based gigabit controllers no longer support
>>> scattering a jumbo frame into multiple RX buffers, so a single RX
>>> buffer has to receive an entire jumbo frame.  This adds more burden
>>> to the system because it has to allocate a jumbo frame even when it
>>> receives a pure TCP ACK.
>> OK, that makes sense.
>>
>>>> FreeBSD 9.0-RC2, after a week or so of uptime, with fairly light network
>>>> activity, the interfaces die with "no memory for jumbo buffers" errors
>>>> on the console.  Unloading and reloading the driver (via serial console)
>>>> doesn't help; only rebooting seems to clear it up.
>>>>
>>> The jumbo code path is the same as the normal MTU sized one, so I think
>>> the possibility of leaking mbufs in the driver is very low.  And the
>>> message "no memory for jumbo RX buffers" can only happen either
>>> when you up the interface again or on an interface restart triggered by
>>> the watchdog timeout handler.  I don't think you're seeing watchdog
>>> timeouts though.
>> I'm fairly certain the interface isn't changing state when this happens
>> -- it just kinda spontaneously happens after a week or two, with no
>> interface up/down transitions.  I don't see any watchdog messages when
>> this happens.
> There is another code path that causes controller reinitialization.
> If you change the MTU or offloading configuration (TSO, VLAN tagging,
> checksum offloading, etc.) it will reinitialize the controller.  So do
> you happen to trigger one of these code paths during a week or two?
>
>>> When you see the "no memory for jumbo RX buffers" message, did you
>>> check the available mbuf pool?
>> Not yet, that's why I asked for debugging tips -- I'll do that the next
>> time this happens.
>>
>>>> What's the best way to go about debugging this... which sysctls should
>>>> I be looking at first?  I have already tried raising kern.ipc.nmbjumbo9
>>>> to 16384 and it doesn't seem to help things... maybe prolonging it
>>>> slightly, but not by much.  The problem is it takes a week or so to
>>>> reproduce the problem each time...
>>>>
>>> I vaguely guess it could be related to another subsystem which
>>> leaks mbufs, such that the driver was not able to get more jumbo RX
>>> buffers from the system.  For instance, r228016 would be worth trying on
>>> your box.  I can't clearly explain why em(4) does not suffer from
>>> the issue though.
>> I've just this morning built a kernel with that fix, so we'll see how
>> that goes.
> Ok.

OK, this just happened again with a 9.0-RC3 kernel, rev r228247.

whitedog# ifconfig re0 down; ifconfig re0 up; ifconfig re1 down; ifconfig re1 up
re0: no memory for jumbo RX buffers
re1: no memory for jumbo RX buffers
whitedog# netstat -m
526/1829/2355 mbufs in use (current/cache/total)
0/1278/1278/25600 mbuf clusters in use (current/cache/total/max)
0/356 mbuf+clusters out of packet secondary zone in use (current/cache)
0/336/336/12800 4k (page size) jumbo clusters in use (current/cache/total/max)
512/385/897/6400 9k jumbo clusters in use (current/cache/total/max)
0/0/0/3200 16k jumbo clusters in use (current/cache/total/max)
4739K/7822K/12561K bytes allocated to network (current/cache/total)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/4560/0 requests for jumbo clusters denied (4k/9k/16k)
0/0/0 sfbufs in use (current/peak/max)
0 requests for sfbufs denied
0 requests for sfbufs delayed
0 requests for I/O initiated by sendfile
0 calls to protocol drain routines
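[Editor's note: a minimal sketch of the kind of check discussed above -- pulling the "requests for jumbo clusters denied" counters out of `netstat -m` to watch for 9k cluster exhaustion over time. The script name and warning text are hypothetical; the sample input is the output captured in this message. On a live FreeBSD box you would pipe real `netstat -m` output in instead of the here-string, and possibly also watch `vmstat -z` for the mbuf zones.]

```shell
#!/bin/sh
# check_jumbo_denials.sh (hypothetical helper): scan `netstat -m` style
# output for denied jumbo cluster requests.  Nonzero denial counts mean
# the network stack could not allocate jumbo clusters -- consistent with
# a leak elsewhere starving the driver, as discussed in this thread.

# Sample input taken verbatim from the netstat -m output above.
netstat_m_sample='0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/4560/0 requests for jumbo clusters denied (4k/9k/16k)'

echo "$netstat_m_sample" | awk '
/requests for jumbo clusters denied/ {
    # First field is "4k/9k/16k" denial counts, e.g. 0/4560/0
    split($1, n, "/")
    if (n[1] > 0) printf "WARNING: %d denied 4k jumbo cluster requests\n", n[1]
    if (n[2] > 0) printf "WARNING: %d denied 9k jumbo cluster requests\n", n[2]
    if (n[3] > 0) printf "WARNING: %d denied 16k jumbo cluster requests\n", n[3]
}'
```

Run periodically (e.g. from cron), a nonzero 9k count appearing while the interface is idle would corroborate the leak theory before the "no memory for jumbo RX buffers" message ever fires.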