From owner-freebsd-hackers@FreeBSD.ORG  Fri Jul 18 10:20:58 2003
Return-Path: <owner-freebsd-hackers@FreeBSD.ORG>
Delivered-To: freebsd-hackers@freebsd.org
Date: Fri, 18 Jul 2003 13:25:03 +0000
From: Bosko Milekic <bmilekic@technokratis.com>
To: harti@freebsd.org
Message-ID: <20030718132503.GB29449@technokratis.com>
References: <20030718185122.N14232@beagle.fokus.fraunhofer.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20030718185122.N14232@beagle.fokus.fraunhofer.de>
User-Agent: Mutt/1.4.1i
cc: hackers@freebsd.org
Subject: Re: SMP problem with uma_zalloc


On Fri, Jul 18, 2003 at 07:05:58PM +0200, Harti Brandt wrote:
> 
> Hi all,
> 
> it seems there is a problem with the zone allocator in SMP systems.
> 
> I have a zone that has an upper limit on items that resolves to an
> upper limit of 1 page. It turns out that allocations from this
> zone get stuck from time to time. It seems to me that the following
> happens:
> 
> - on the first call to uma_zalloc a page is allocated and all the free
> items are put into the cache of the CPU. uz_free of the zone is 0 and
> uz_cachefree holds all the free items.
> 
> - when the next call to uma_zalloc occurs on the same CPU, everything is
> fine. uma_zalloc just gets the next item from the cache.
> 
> - when the call happens on another CPU, the code finds uz_free to be 0 and
> checks the page limit (uma_core.c:1492). It finds the limit already
> reached and puts the process to sleep (uma_zalloc was called with
> M_WAITOK).
> 
> - the process may sleep there forever (depending on circumstances).
> 
> If M_WAITOK is not set, the code will falsely return NULL while there
> are still free items (albeit in the cache of another CPU).
> 
> I wonder whether this is intended behaviour. If yes, this should
> definitely be documented. uma_zone_set_max() seems to be documented only in
> the header file and it does not mention that free items may not actually
> be allocatable because they happen to sit in another CPU's cache.
> 
> If it is not intended (I would prefer this), I wonder how one can get the
> items out of another CPU's cache. I'm not too familiar with this code.
> I suppose this should be done somewhere around uma_core.c:1485?
> 
> Regards,
> harti
> -- 
> harti brandt,
> http://www.fokus.fraunhofer.de/research/cc/cats/employees/hartmut.brandt/private
> brandt@fokus.fraunhofer.de, harti@freebsd.org
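
  The scenario quoted above can be reproduced in a toy userland model.
  This is only a sketch and not the uma_core.c code: the struct, the
  toy_* names, the two-CPU setup and the eight-items-per-page figure
  are all assumptions made for illustration.  Only the ideas of a page
  limit, a zone-level free count (like uz_free) and per-CPU cached free
  items (like uz_cachefree) come from the description.

```c
#include <assert.h>

#define TOY_NCPU            2   /* assumed CPU count for the model */
#define TOY_ITEMS_PER_PAGE  8   /* assumed; real zones depend on item size */

/* Hypothetical stand-in for a zone; not the real struct uma_zone. */
struct toy_zone {
    int pages;                  /* pages currently backing the zone */
    int max_pages;              /* limit, as uma_zone_set_max() would imply */
    int zone_free;              /* free items held by the zone (cf. uz_free) */
    int cache_free[TOY_NCPU];   /* free items in each CPU's cache */
};

/*
 * Returns 1 if an item could be handed out, 0 at the point where the
 * described code would either sleep (M_WAITOK) or return NULL.
 */
static int
toy_zalloc(struct toy_zone *z, int cpu)
{
    assert(cpu >= 0 && cpu < TOY_NCPU);

    /* Fast path: take an item from this CPU's own cache. */
    if (z->cache_free[cpu] > 0) {
        z->cache_free[cpu]--;
        return 1;
    }

    /* The zone itself is empty: grow it only if the page limit allows. */
    if (z->zone_free == 0) {
        if (z->pages >= z->max_pages)
            return 0;           /* limit reached; other caches are not consulted */
        z->pages++;
        z->zone_free += TOY_ITEMS_PER_PAGE;
    }

    /* Move every free item of the zone into this CPU's cache, then allocate. */
    z->cache_free[cpu] += z->zone_free;
    z->zone_free = 0;
    z->cache_free[cpu]--;
    return 1;
}

/* Total free items, wherever they happen to sit. */
static int
toy_total_free(const struct toy_zone *z)
{
    int total = z->zone_free;

    for (int i = 0; i < TOY_NCPU; i++)
        total += z->cache_free[i];
    return total;
}
```

  With max_pages set to 1, two calls to toy_zalloc() on CPU 0 succeed,
  and a following call on CPU 1 fails even though toy_total_free()
  still reports free items, all of them in CPU 0's cache.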

  If the per-cpu caches are relatively small (which they ought to be,
  especially when you've hit a maximum number of allocations from the
  zone), then this is actually not such bad behavior.

  I spoke to Jeff about this and it seemed to me that he was leaning
  toward keeping the behavior this way and, in fact, also perhaps _not_
  even doing an internal free to the zone when UMA_ZFLAG_FULL is in
  effect but we still have space in the pcpu cache.  While I'm not sure
  if going that far is a good idea, I _don't_ really think that the
  current behavior is a bad idea.  As mentioned, when you have a zone
  that is mostly starved, all future frees will go back to the zone and
  not the per-cpu caches, but if you have some free items in another
  per-cpu cache, you're not likely to hit a starvation situation unless
  something is horribly wrong.  Having the free code actually drain
  the per-cpu caches in a zone-full situation may also lead to bad
  behavior under heavy load.  Think about what happens then: your zone
  is starved, and if you flush all the pcpu caches while the load is
  still heavy, other threads will most likely try to allocate right
  away and end up dipping into the zone.  So there doesn't seem to be
  much of a reason to push the cached objects back into the zone if
  they're going to leave it again soon anyway.
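
  The drain argument can be illustrated with another toy model.  Again
  this is a sketch under stated assumptions, not UMA itself: the sim_*
  names are hypothetical, there are two CPUs, and a CPU with an empty
  cache is assumed to pull every free item of the zone into its own
  cache in one go.

```c
#include <assert.h>

#define SIM_NCPU 2              /* assumed CPU count for the model */

/* Hypothetical stand-in for a zone that is already at its page limit. */
struct sim_zone {
    int zone_free;              /* free items held by the zone itself */
    int cache_free[SIM_NCPU];   /* free items in each CPU's cache */
    int refills;                /* times a CPU had to dip into the zone */
};

/* Push every cached free item back into the zone. */
static void
sim_drain(struct sim_zone *z)
{
    for (int i = 0; i < SIM_NCPU; i++) {
        z->zone_free += z->cache_free[i];
        z->cache_free[i] = 0;
    }
}

/* Returns 1 on success, 0 if neither this CPU's cache nor the zone has items. */
static int
sim_alloc(struct sim_zone *z, int cpu)
{
    assert(cpu >= 0 && cpu < SIM_NCPU);

    if (z->cache_free[cpu] == 0) {
        if (z->zone_free == 0)
            return 0;           /* starved from this CPU's point of view */
        /* Refill: the whole zone free list moves into this CPU's cache. */
        z->cache_free[cpu] = z->zone_free;
        z->zone_free = 0;
        z->refills++;
    }
    z->cache_free[cpu]--;
    return 1;
}
```

  Starting with six free items in CPU 0's cache, sim_alloc() on CPU 1
  fails; after sim_drain() it succeeds, but the drained items leave
  the zone again at once and now sit in CPU 1's cache, so a following
  sim_alloc() on CPU 0 fails in turn.  In this model the drain only
  moves the problem from one CPU to the other.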

-- 
Bosko Milekic  *  bmilekic@technokratis.com  *  bmilekic@FreeBSD.org
TECHNOkRATIS Consulting Services  *  http://www.technokratis.com/