Date: Fri, 18 Jul 2003 13:25:03 +0000
From: Bosko Milekic
To: harti@freebsd.org
Cc: hackers@freebsd.org
Subject: Re: SMP problem with uma_zalloc
Message-ID: <20030718132503.GB29449@technokratis.com>
In-Reply-To: <20030718185122.N14232@beagle.fokus.fraunhofer.de>

On Fri, Jul 18, 2003 at 07:05:58PM +0200, Harti Brandt wrote:
>
> Hi all,
>
> it seems there is a problem with the zone allocator in SMP systems.
>
> I have a zone, that has an upper limit on items that resolves to an
> upper limit of pages of 1. It turns out, that allocations from this
> zone get stuck from time to time.
> It seems to me, that the following
> happens:
>
> - on the first call to uma_zalloc a page is allocated and all the free
>   items are put into the cache of the CPU. uz_free of the zone is 0 and
>   uz_cachefree holds all the free items.
>
> - when the next call to uma_zalloc occurs on the same CPU, everything is
>   fine. uma_zalloc just gets the next item from the cache.
>
> - when the call happens on another CPU, the code finds uz_free to be 0 and
>   checks the page limit (uma_core.c:1492). It finds the limit already
>   reached and puts the process to sleep (uma_zalloc was called with
>   M_WAITOK).
>
> - the process may sleep there forever (depending on circumstances).
>
> If M_WAITOK is not set, the code will falsely return NULL while there
> are still free items (albeit in the cache of another CPU).
>
> I wonder whether this is intended behaviour. If yes, this should be
> definitely documented. uma_zone_set_max() seems to be documented only in
> the header file and it does not mention, that free items may not actually
> be allocatable because they happen to sit in another CPU's cache.
>
> If it is not intended (I would prefer this), I wonder how one can get the
> items out of another's CPU cache. I'm not too familiar with this code.
> I suppose this should be done somewhere around uma_core.c:1485?
>
> Regards,
> harti
> --
> harti brandt,
> http://www.fokus.fraunhofer.de/research/cc/cats/employees/hartmut.brandt/private
> brandt@fokus.fraunhofer.de, harti@freebsd.org

If the per-cpu caches are relatively small (which they ought to be,
especially when you've hit a maximum number of allocations from the
zone), then this is actually not bad behavior. I spoke to Jeff about
this and it seemed to me that he was leaning toward keeping the
behavior this way and, in fact, perhaps _not_ even doing an internal
free to the zone when UMA_ZFLAG_FULL is in effect but we still have
space in the pcpu cache.
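
For anyone following along, the scenario in the quoted report can be
sketched with a toy userland model. This is not UMA code: the struct,
the function names, NCPU and the items-per-page figure are all made up
for illustration, and only the control flow follows the description
quoted above.

```c
#include <assert.h>

#define NCPU            2
#define ITEMS_PER_PAGE  4   /* hypothetical; real zones vary */

/* Toy zone: field names only loosely echo uz_free / uz_cachefree. */
struct toy_zone {
    int pages;              /* pages currently backing the zone */
    int max_pages;          /* page limit (cf. uma_zone_set_max()) */
    int zone_free;          /* free items held by the zone itself */
    int cache_free[NCPU];   /* free items held in each CPU's cache */
};

enum toy_result { TOY_OK, TOY_WOULD_BLOCK };

/* Allocate one item on behalf of the given CPU. */
static enum toy_result
toy_alloc(struct toy_zone *z, int cpu)
{
    assert(cpu >= 0 && cpu < NCPU);

    /* Fast path: take an item from this CPU's own cache. */
    if (z->cache_free[cpu] > 0) {
        z->cache_free[cpu]--;
        return (TOY_OK);
    }
    /* Next, try the zone's own free items. */
    if (z->zone_free > 0) {
        z->zone_free--;
        return (TOY_OK);
    }
    /*
     * Zone is empty: grow by a page if the limit allows.  One item is
     * handed out, the rest land in the calling CPU's cache.
     */
    if (z->pages < z->max_pages) {
        z->pages++;
        z->cache_free[cpu] += ITEMS_PER_PAGE - 1;
        return (TOY_OK);
    }
    /*
     * Limit reached.  Other CPUs' caches are never consulted, so the
     * caller would sleep (M_WAITOK) or get NULL, even if those caches
     * still hold free items.
     */
    return (TOY_WOULD_BLOCK);
}

/* Count free items across the zone and all per-CPU caches. */
static int
toy_total_free(const struct toy_zone *z)
{
    int i, n = z->zone_free;

    for (i = 0; i < NCPU; i++)
        n += z->cache_free[i];
    return (n);
}
```

Starting from an empty zone with max_pages set to 1, a toy_alloc() on
CPU 0 succeeds and leaves the rest of the page in CPU 0's cache. A
following toy_alloc() on CPU 1 reports TOY_WOULD_BLOCK although
toy_total_free() is still nonzero, which is the situation Harti
describes.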

While I'm not sure if going that far is a good idea, I _don't_ really
think that the current behavior is a bad idea. As mentioned, when you
have a zone that is mostly starved, all future frees will go back to
the zone and not the per-cpu caches, but if you have some free items in
another per-cpu cache, you're not likely to hit a starvation situation
unless something is horribly wrong.

And having the free code actually drain the per-cpu caches in a
zone-full situation may lead to bad behavior under heavy load. Think
about what happens under heavy load... your zone is starved and if you
then flush all the pcpu caches and the load is still heavy, you're
likely to have other threads try to allocate regardless, so they'll end
up having to dip into the zone anyway; therefore, there doesn't seem to
be much of a reason to push the cached objects back into the zone (if
they're going to leave it again soon anyway).

-- 
Bosko Milekic * bmilekic@technokratis.com * bmilekic@FreeBSD.org
TECHNOkRATIS Consulting Services * http://www.technokratis.com/