From owner-svn-src-head@FreeBSD.ORG Tue Mar 12 15:01:02 2013 Return-Path: Delivered-To: svn-src-head@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 8AD178F7; Tue, 12 Mar 2013 15:01:02 +0000 (UTC) (envelope-from glebius@FreeBSD.org) Received: from cell.glebius.int.ru (glebius.int.ru [81.19.69.10]) by mx1.freebsd.org (Postfix) with ESMTP id 09BD1E53; Tue, 12 Mar 2013 15:01:01 +0000 (UTC) Received: from cell.glebius.int.ru (localhost [127.0.0.1]) by cell.glebius.int.ru (8.14.6/8.14.6) with ESMTP id r2CF0rZv096936; Tue, 12 Mar 2013 19:00:53 +0400 (MSK) (envelope-from glebius@FreeBSD.org) Received: (from glebius@localhost) by cell.glebius.int.ru (8.14.6/8.14.6/Submit) id r2CF0rm7096935; Tue, 12 Mar 2013 19:00:53 +0400 (MSK) (envelope-from glebius@FreeBSD.org) X-Authentication-Warning: cell.glebius.int.ru: glebius set sender to glebius@FreeBSD.org using -f Date: Tue, 12 Mar 2013 19:00:53 +0400 From: Gleb Smirnoff To: Andre Oppermann Subject: Re: svn commit: r248196 - head/sys/nfs Message-ID: <20130312150053.GI48089@FreeBSD.org> References: <201303121219.r2CCJN5Z069789@svn.freebsd.org> <513F3A54.3090702@freebsd.org> MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="vkEkAx9hr54EJ73W" Content-Disposition: inline In-Reply-To: <513F3A54.3090702@freebsd.org> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org X-BeenThere: svn-src-head@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: SVN commit messages for the src tree for head/-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 12 Mar 2013 15:01:02 -0000 --vkEkAx9hr54EJ73W Content-Type: text/plain; charset=koi8-r Content-Disposition: inline Andre, On Tue, Mar 12, 2013 at 03:23:16PM +0100, Andre Oppermann wrote: A> On 12.03.2013 13:19, Gleb Smirnoff wrote: A> > Author: glebius A> > Date: Tue Mar 12 12:19:23 2013 A> > New Revision: 248196 A> > URL: http://svnweb.freebsd.org/changeset/base/248196 A> > A> > Log: A> > Use m_get2() to get mbuf of appropriate size. A> A> The problem with m_get2() is that it will attempt to use jumbo mbufs A> larger than PAGE_SIZE if the request size is sufficiently large. A> A> In light of the recent issues with jumbo > PAGE_SIZE allocations on A> loaded systems I don't think it is appropriate to extend this practice A> further. A> A> Anything that needs more than PAGE_SIZE (where PAGE_SIZE == 4K) mbuf A> space should allocate an mbuf chain with m_getm2(). That is what we A> have mbuf chains for. A> A> Sufficient availability of mbufs > PAGE_SIZE cannot be guaranteed after A> some uptime and/or a loaded system due to memory fragmentation. Allocation A> can fail non-deterministically for mbufs > PAGE_SIZE long before smaller A> mbufs become unavailable due to overall (kernel) memory exhaustion. A> A> Mbufs > PAGE_SIZE are not only contiguous in kernel address space but A> also in real memory address space. They are intended jumbo frames on A> poor NIC's without scatter-capable DMA engines. A> A> When you put the m_get2() function proposal for review and comments I A> already highlighted these issues and put forward serious concerns about A> this. A> A> Please change m_get2() to limit itself to mbuf allocations <= PAGE_SIZE. I already got similar patch in my queue. We have a lot of code that could benefit from using both m_get2() and m_getm2() that are limited to not use jumbos at all. If you are concerned about using jumbos that are > PAGE_SIZE, then I can extend API in my patch. ... done. Patch attached. The NFS code itself guarantees that it won't request > than MCLBYTES, so using bare m_get2() here is safe. I can add flag there later for clarity. -- Totus tuus, Glebius. --vkEkAx9hr54EJ73W Content-Type: text/x-diff; charset=koi8-r Content-Disposition: attachment; filename="m_get2,m_getm2.diff" Index: kern/uipc_mbuf.c =================================================================== --- kern/uipc_mbuf.c (revision 248207) +++ kern/uipc_mbuf.c (working copy) @@ -93,15 +93,20 @@ m_get2(int size, int how, short type, int flags) struct mb_args args; struct mbuf *m, *n; uma_zone_t zone; + int nojumbos, pgjumbos; - args.flags = flags; + nojumbos = (flags & M_NOJUMBO) ? 1 : 0; + pgjumbos = (flags & M_PGJUMBONLY) ? 1 : 0; + + args.flags = flags & (M_PKTHDR | M_EOR); args.type = type; if (size <= MHLEN || (size <= MLEN && (flags & M_PKTHDR) == 0)) return (uma_zalloc_arg(zone_mbuf, &args, how)); if (size <= MCLBYTES) return (uma_zalloc_arg(zone_pack, &args, how)); - if (size > MJUM16BYTES) + + if (nojumbos || (pgjumbos && size > MJUMPAGESIZE) || size > MJUM16BYTES) return (NULL); m = uma_zalloc_arg(zone_mbuf, &args, how); @@ -168,9 +173,12 @@ struct mbuf * m_getm2(struct mbuf *m, int len, int how, short type, int flags) { struct mbuf *mb, *nm = NULL, *mtail = NULL; + int usejumbos; KASSERT(len >= 0, ("%s: len is < 0", __func__)); + usejumbos = (flags & M_NOJUMBO) ? 0 : 1; + /* Validate flags. */ flags &= (M_PKTHDR | M_EOR); @@ -180,7 +188,7 @@ m_getm2(struct mbuf *m, int len, int how, short ty /* Loop and append maximum sized mbufs to the chain tail. */ while (len > 0) { - if (len > MCLBYTES) + if (usejumbos && len > MCLBYTES) mb = m_getjcl(how, type, (flags & M_PKTHDR), MJUMPAGESIZE); else if (len >= MINCLSIZE) Index: sys/mbuf.h =================================================================== --- sys/mbuf.h (revision 248207) +++ sys/mbuf.h (working copy) @@ -206,6 +206,13 @@ struct mbuf { #define M_HASHTYPEBITS 0x0F000000 /* mask of bits holding flowid hash type */ /* + * Flags to pass with mbuf flags to some functions, not written in + * an mbuf itself. + */ +#define M_NOJUMBO M_PROTO1 +#define M_PGJUMBONLY M_PROTO2 + +/* * For RELENG_{6,7} steal these flags for limited multiple routing table * support. In RELENG_8 and beyond, use just one flag and a tag. */ --vkEkAx9hr54EJ73W--