From: Andre Oppermann
Date: Mon, 12 Nov 2012 13:11:40 +0100
To: alc@freebsd.org
Cc: Konstantin Belousov, "freebsd-hackers@freebsd.org", Alan Cox, "Sears, Steven"
Subject: Re: Memory reserves or lack thereof
Message-ID: <50A0E77C.6010108@freebsd.org>
References: <20121110132019.GP73505@kib.kiev.ua>

On 11.11.2012 22:40, Alan Cox wrote:
> On Sat, Nov 10, 2012 at 7:20 AM, Konstantin Belousov wrote:
>> Your analysis is right, there is nothing to add or correct.
>> This is the reason to strongly prefer M_WAITOK.
>>
>
> Agreed.  Once upon a time, before SMPng, M_NOWAIT was rarely used.  It was
> well understood that it should only be used by interrupt handlers.
>
> The trouble is that M_NOWAIT conflates two orthogonal things.  The obvious
> being that the allocation shouldn't sleep.
> The other being how far we're
> willing to deplete the cache/free page queues.
>
> When fine-grained locking got sprinkled throughout the kernel, we all too
> often found ourselves wanting to do allocations without the possibility of
> blocking.  So, M_NOWAIT became commonplace, where it wasn't before.

Yes, we have many places where we don't want to sleep, for example in the
network code.  There we simply want to be told that we've run out of memory
and handle the failure.  It's expected to happen from time to time.  We
don't need or want to dig deep or into reserves.  Packets are expected to
get lost from time to time and upper layer protocols will handle
retransmits just fine.  What we *don't* want normally is to get blocked on
a failing memory allocation.  We'd rather drop this one and go on with the
next packet to avoid the head-of-line blocking problem where everything
cascades to a total halt.

As a side note, we don't do many, if any, true interrupt time allocations
anymore.  Usually the interrupt is just acknowledged in interrupt context
and a taskqueue or ithread is scheduled to do all the hard work.  Neither
runs in interrupt context.

> This had the unintended consequence of introducing a lot of memory
> allocations in the top-half of the kernel, i.e., non-interrupt handling
> code, that were digging deep into the cache/free page queues.
>
> Also, ironically, in today's kernel an "M_NOWAIT | M_USE_RESERVE"
> allocation is less likely to succeed than an "M_NOWAIT" allocation.
> However, prior to FreeBSD 7.x, M_NOWAIT couldn't allocate a cached page; it
> could only allocate a free page.  M_USE_RESERVE said that it was ok to
> allocate a cached page even though M_NOWAIT was specified.  Consequently,
> the system wouldn't dig as far into the free page queue if M_USE_RESERVE
> was specified, because it was allowed to reclaim a cached page.
>
> In conclusion, I think it's time that we change M_NOWAIT so that it doesn't
> dig any deeper into the cache/free page queues than M_WAITOK does and
> reintroduce a M_USE_RESERVE-like flag that says dig deep into the
> cache/free page queues.  The trouble is that we then need to identify all
> of those places that are implicitly depending on the current behavior of
> M_NOWAIT also digging deep into the cache/free page queues so that we can
> add an explicit M_USE_RESERVE.

I don't think many places depend on M_NOWAIT digging deep.  I'm perfectly
happy with having M_NOWAIT give up on the first try.  Only together with
M_TRY_REALLY_HARD would it dig into reserves.

PS: We have a really nasty namespace collision with the mbuf flags, which
use the M_* prefix as well.

-- 
Andre
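
The semantics discussed above (a non-sleeping allocation gives up as soon as
only the reserve is left, and digs deeper only when an extra flag is passed)
can be sketched in userland C.  This is a toy model under loud assumptions:
every sk_* / SK_* name is hypothetical and invented for illustration, the
"pool" is just a page counter, and none of this is the real malloc(9) or VM
page allocator code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flags for this sketch; NOT the real malloc(9) M_* values. */
enum {
	SK_NOWAIT      = 1 << 0,	/* never sleep; fail instead */
	SK_USE_RESERVE = 1 << 1		/* may dig below the reserve mark */
};

/* Toy page pool: a count of free pages plus a low-water reserve mark. */
struct sk_pool {
	size_t free_pages;
	size_t reserve;
};

/*
 * Try to take one page without sleeping.
 * Plain SK_NOWAIT gives up once only the reserve is left;
 * SK_NOWAIT | SK_USE_RESERVE may consume the reserve as well.
 */
static bool
sk_alloc_page(struct sk_pool *pool, int flags)
{
	size_t limit;

	assert(pool != NULL);
	assert(flags & SK_NOWAIT);	/* sketch models non-sleeping callers only */

	limit = (flags & SK_USE_RESERVE) ? 0 : pool->reserve;
	if (pool->free_pages <= limit)
		return false;		/* report failure, do not block */
	pool->free_pages--;
	return true;
}

/*
 * Network-style consumer: one page per packet.  On allocation failure the
 * packet is dropped and processing continues with the next one, so a
 * failed allocation never stalls the queue.  Returns the number of drops.
 */
static size_t
sk_process_packets(struct sk_pool *pool, size_t npackets)
{
	size_t dropped = 0;

	for (size_t i = 0; i < npackets; i++) {
		if (!sk_alloc_page(pool, SK_NOWAIT)) {
			dropped++;	/* upper layers are expected to retransmit */
			continue;
		}
		/* The packet would be handled here; the page is kept for simplicity. */
	}
	return dropped;
}
```

With a pool of 5 free pages and a reserve of 2, processing 4 packets
allocates 3 pages and drops the 4th; only a caller that also passes
SK_USE_RESERVE can take the remaining 2 pages.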