From: Matthew Dillon <dillon@apollo.backplane.com>
Date: Wed, 27 Feb 2002 11:26:12 -0800 (PST)
Message-Id: <200202271926.g1RJQCm29905@apollo.backplane.com>
To: Jeff Roberson
Cc: arch@FreeBSD.ORG
Subject: Re: Slab allocator
References: <20020227005915.C17591-100000@mail.chesapeake.net>

:...
:
:There are also per-cpu queues of items, with a per-cpu lock.  This
:allows for very efficient allocation, and it also provides near-linear
:performance as the number of cpus increases.  I do still depend on
:Giant to talk to the back-end page supplier (kmem_alloc, etc.).  Once
:the VM is locked the allocator will not require Giant at all.
:...
:
:Since you've read this far, I'll let you know where the patch is. ;-)
:
:http://www.chesapeake.net/~jroberson/uma.tar
:...
:Any feedback is appreciated.  I'd like to know what people expect from
:this before it is committable.
:
:Jeff
:
:PS Sorry for the long-winded email. :-)

    Well, one thing I've noticed right off the bat is that the code is
    trying to take advantage of per-cpu queues but is still having to
    obtain a per-cpu mutex to lock the per-cpu queue.

    Another thing I noticed is that the code appears to assume that
    PCPU_GET(cpuid) is stable in certain places, and I don't think that
    condition necessarily holds unless you explicitly enter a critical
    section (critical_enter() and critical_exit()).  There are some
    cases where you obtain the per-cpu cache and lock it, which would be
    safe even if the cpu changed out from under you, and other cases,
    such as in uma_zalloc_internal(), where you assume that the cpuid is
    stable when it isn't.

    I also noticed that cache_drain() appears to be the only place where
    you iterate through the per-cpu mutexes.  All the other places
    appear to use the current cpu's mutex.

    I would recommend the following:

    * That you review your code with special attention to the lack of
      stability of PCPU_GET(cpuid) when you are not in a critical
      section.

    * That you consider getting rid of the per-cpu locks and instead use
      critical_enter() and critical_exit() to obtain a stable cpuid, in
      order to allocate or free from the current cpu's cache without
      having to obtain any mutexes whatsoever (see the first sketch
      below).  Theoretically this would allow most calls that allocate
      and free small amounts of memory to run as fast as a simple
      procedure call would run (akin to what the kernel malloc() in
      -stable is able to accomplish).

    * That you consider an alternative method for draining the per-cpu
      caches.  For example, have the per-cpu code use a global, shared
      SX lock along with the critical section to access its per-cpu
      cache, and then have the cache_drain code obtain an exclusive SX
      lock in order to get full access to all of the per-cpu caches (see
      the second sketch below).

    * Documentation.  I.e., comment the code more, especially areas
      where you have to special-case things, for example when you unlock
      a cpu cache in order to call uma_zfree_internal().
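    To illustrate the second recommendation, here is roughly the fast
    path I have in mind.  This is only a sketch, not tested code: the
    structure layout and the names UMA_CACHE_SIZE, uz_cpu[], uc_free[],
    uc_freecnt, and uma_zalloc_slow() are made up for illustration, not
    taken from the patch.

	/* needs sys/param.h, sys/systm.h, sys/pcpu.h for PCPU_GET() */

	struct uma_cache {
		void	*uc_free[UMA_CACHE_SIZE];   /* cached free items */
		int	 uc_freecnt;		    /* number of items */
	};

	void *
	uma_zalloc_sketch(uma_zone_t zone, int flags)
	{
		struct uma_cache *cache;
		void *item;

		critical_enter();	/* no preemption or migration now */
		cache = &zone->uz_cpu[PCPU_GET(cpuid)]; /* cpuid stable */
		if (cache->uc_freecnt > 0) {
			/* the cache is private to us here, so no mutex */
			item = cache->uc_free[--cache->uc_freecnt];
			critical_exit();
			return (item);
		}
		critical_exit();
		/* slow path: refill from the zone under the zone lock */
		return (uma_zalloc_slow(zone, flags));
	}

    The common case then costs a critical section and an array index,
    which is about as close to a plain procedure call as you can get.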
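    And here is the drain scheme from the third recommendation, again
    with made-up names (uma_drain_lock, cache_drain_sketch(); the
    uma_zfree_internal() signature is a guess).  The allocation and free
    paths take the global sx lock shared, so they do not block one
    another; cache_drain() takes it exclusive, which keeps every fast
    path out while it walks all of the per-cpu caches:

	/* needs sys/lock.h, sys/sx.h for sx(9), sys/smp.h for mp_ncpus */

	struct sx uma_drain_lock; /* sx_init(&uma_drain_lock, "umadrain") */

	void *
	uma_zalloc_drainsafe(uma_zone_t zone, int flags)
	{
		void *item;

		sx_slock(&uma_drain_lock);	/* shared: allocators on
						   different cpus coexist */
		item = uma_zalloc_sketch(zone, flags); /* fast path above */
		sx_sunlock(&uma_drain_lock);
		return (item);
	}

	void
	cache_drain_sketch(uma_zone_t zone)
	{
		struct uma_cache *cache;
		int cpu;

		sx_xlock(&uma_drain_lock);  /* exclusive: fast paths idle */
		for (cpu = 0; cpu < mp_ncpus; cpu++) {
			cache = &zone->uz_cpu[cpu];
			while (cache->uc_freecnt > 0)
				uma_zfree_internal(zone,
				    cache->uc_free[--cache->uc_freecnt]);
		}
		sx_xunlock(&uma_drain_lock);
	}

    The shared sx acquisition does add a little cost to the fast path,
    but an uncontested shared lock is cheap, and the drain path no
    longer needs any per-cpu mutexes at all.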
						-Matt
						Matthew Dillon