From owner-freebsd-net@FreeBSD.ORG Fri Mar  8 08:27:44 2013
From: Jack Vogel <jfvogel@gmail.com>
To: pyunyh@gmail.com
Cc: jfv@freebsd.org, freebsd-net@freebsd.org, Garrett Wollman
Date: Fri, 8 Mar 2013 00:27:37 -0800
Subject: Re: Limits on jumbo mbuf cluster allocation
In-Reply-To: <20130308075458.GA1442@michelle.cdnetworks.com>
References: <20793.36593.774795.720959@hergotha.csail.mit.edu>
 <20130308075458.GA1442@michelle.cdnetworks.com>
List-Id: Networking and TCP/IP with FreeBSD

On Thu, Mar 7, 2013 at 11:54 PM, YongHyeon PYUN wrote:

> On Fri, Mar 08, 2013 at 02:10:41AM -0500, Garrett Wollman wrote:
> > I have a machine (actually six of them) with an Intel dual-10G NIC
> > on the motherboard.  Two of them (so far) are connected to a
> > network using jumbo frames, with an MTU a little under 9k, so the
> > ixgbe driver allocates 32,000 9k clusters for its receive rings.
> > I have noticed, on the machine that is an active NFS server, that
> > it can get into a state where allocating more 9k clusters fails
> > (as reflected in the mbuf failure counters) at a utilization far
> > lower than the configured limits -- in fact, quite close to the
> > number allocated by the driver for its rx ring.  Eventually,
> > network traffic grinds completely to a halt, and if one of the
> > interfaces is administratively downed, it cannot be brought back
> > up again.  There's generally plenty of physical memory free (at
> > least two or three GB).
> >
> > There are no console messages generated to indicate what is going
> > on, and overall UMA usage doesn't look extreme.  I'm guessing that
> > this is a result of kernel memory fragmentation, although I'm a
> > little bit unclear as to how this actually comes about.
> > I am assuming that this hardware has only limited scatter-gather
> > capability and can't receive a single packet into multiple buffers
> > of a smaller size, which would reduce the requirement for
> > two-and-a-quarter consecutive pages of KVA for each packet.  In
> > actual usage, most of our clients aren't on a jumbo network, so
> > most of the time all the packets will fit into a normal 2k
> > cluster, and we've never observed this issue when the *server* is
> > on a non-jumbo network.
>
> AFAIK, all Intel controllers generate jumbo frames by concatenating
> multiple mbufs on the RX side, so there is no physically contiguous
> 9KB allocation.  I vaguely suspect there could be an mbuf leak when
> jumbo frames are enabled.  I would check how the driver handles
> mbuf shortages or frame errors while mbuf concatenation for a jumbo
> frame is in progress.

No, that is not true: when a 9K MTU is in use, the driver actually
allocates from the larger (9k) mbuf cluster pool.  The code has been
this way for a little while now.

Jack

> > Does anyone have suggestions for dealing with this issue?  Will
> > increasing the amount of KVA (to, say, twice physical memory) help
> > things?  It seems to me like a bug that these large packets don't
> > have their own submap to ensure that allocation is always possible
> > when sufficient physical pages are available.
> >
> > -GAWollman
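
To make the two receive-buffer strategies under discussion concrete,
here is a minimal sketch against the stock FreeBSD mbuf(9) API.  This
is not the actual ixgbe receive path; the function names
rx_alloc_jumbo9k() and rx_alloc_chain() are illustrative only.

#include <sys/param.h>
#include <sys/systm.h>
#include <sys/mbuf.h>

/*
 * Strategy A (what Jack describes for current ixgbe): receive each
 * jumbo frame into a single 9k cluster.  m_getjcl() draws from the
 * separate 9k UMA zone, so every buffer needs two and a quarter
 * physically contiguous pages and is exposed to the fragmentation
 * problem Garrett describes.
 */
static struct mbuf *
rx_alloc_jumbo9k(void)
{

	return (m_getjcl(M_NOWAIT, MT_DATA, M_PKTHDR, MJUM9BYTES));
}

/*
 * Strategy B (what Pyun describes): chain ordinary 2k clusters and
 * let the controller scatter one frame across several RX descriptors.
 * Nothing larger than a page is ever allocated, so this path cannot
 * fail due to contiguous-memory fragmentation, at the cost of more
 * descriptors and mbufs per frame.
 */
static struct mbuf *
rx_alloc_chain(int frame_len)
{
	struct mbuf *m, *n, *top;
	int resid;

	top = m = m_getcl(M_NOWAIT, MT_DATA, M_PKTHDR);
	if (top == NULL)
		return (NULL);
	for (resid = frame_len - MCLBYTES; resid > 0; resid -= MCLBYTES) {
		n = m_getcl(M_NOWAIT, MT_DATA, 0);
		if (n == NULL) {
			m_freem(top);	/* free the partial chain */
			return (NULL);
		}
		m->m_next = n;
		m = n;
	}
	return (top);
}

Which pool a running system is exhausting can be watched with
netstat -m, whose per-size jumbo cluster statistics (including denied
allocation requests) correspond to the mbuf failure counters Garrett
mentions above.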