From owner-freebsd-arch  Wed Feb 27 14:49:39 2002
Date: Wed, 27 Feb 2002 17:49:24 -0500 (EST)
From: Jeff Roberson
To: Matthew Dillon
Cc: arch@FreeBSD.ORG
Subject: Re: Slab allocator
In-Reply-To: <200202271926.g1RJQCm29905@apollo.backplane.com>
Message-ID: <20020227172755.W59764-100000@mail.chesapeake.net>

See comments below:

On Wed, 27 Feb 2002, Matthew Dillon wrote:

> Well, one thing I've noticed right off the bat is that the code
> is trying to take advantage of per-cpu queues but is still
> having to obtain a per-cpu mutex to lock the per-cpu queue.
>
> Another thing I noticed is that the code appears to assume
> that PCPU_GET(cpuid) is stable in certain places, and I don't
> think that condition necessarily holds unless you explicitly
> enter a critical section (critical_enter() and critical_exit()).
> There are some cases where you obtain the per-cpu cache and lock
> it, which would be safe even if the cpu changed out from under
> you, and other cases such as in uma_zalloc_internal() where you
> assume that the cpuid is stable when it isn't.

Ok, I did make a PCPU_GET mistake.  If uma_zalloc_internal() is called
from the fast path, it needs to have a cache handed down to it.  The
point of the per-cpu locks is so that you don't have to have a
critical section around the entire allocator.
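(For readers following along: the per-cpu-mutex scheme being discussed
can be sketched in user space roughly as below.  pthread mutexes stand
in for kernel mutexes, and the names pcpu_cache/cache_alloc/cache_free/
cache_drain, the bucket size, and the malloc() slow path are all
illustrative stand-ins, not the real UMA interfaces.)

```c
#include <pthread.h>
#include <stdlib.h>

#define NCPU        4
#define BUCKET_SIZE 8

struct pcpu_cache {
    pthread_mutex_t lock;       /* per-cpu lock: uncontended in the common case */
    void *bucket[BUCKET_SIZE];  /* cached free items */
    int count;                  /* items currently in the bucket */
};

static struct pcpu_cache caches[NCPU] = {
    { .lock = PTHREAD_MUTEX_INITIALIZER },
    { .lock = PTHREAD_MUTEX_INITIALIZER },
    { .lock = PTHREAD_MUTEX_INITIALIZER },
    { .lock = PTHREAD_MUTEX_INITIALIZER },
};

/*
 * Fast path: lock only this cpu's cache.  Because the cache itself is
 * locked, this stays correct even if the thread migrates after 'cpu'
 * was read -- it just briefly touches another cpu's cache.
 */
void *
cache_alloc(int cpu)
{
    struct pcpu_cache *cc = &caches[cpu];
    void *item = NULL;

    pthread_mutex_lock(&cc->lock);
    if (cc->count > 0)
        item = cc->bucket[--cc->count];
    pthread_mutex_unlock(&cc->lock);

    if (item == NULL)
        item = malloc(64);      /* slow path: stand-in for the zone allocator */
    return (item);
}

void
cache_free(int cpu, void *item)
{
    struct pcpu_cache *cc = &caches[cpu];

    pthread_mutex_lock(&cc->lock);
    if (cc->count < BUCKET_SIZE) {
        cc->bucket[cc->count++] = item;
        item = NULL;
    }
    pthread_mutex_unlock(&cc->lock);

    if (item != NULL)
        free(item);             /* bucket full: hand back to the zone */
}

/*
 * Drain: walk every cpu's cache, taking each per-cpu lock in turn.
 * This remote access is what per-cpu locks buy you; a scheme built
 * only on critical sections has no handle on another cpu's cache.
 */
void
cache_drain(void)
{
    for (int cpu = 0; cpu < NCPU; cpu++) {
        struct pcpu_cache *cc = &caches[cpu];

        pthread_mutex_lock(&cc->lock);
        while (cc->count > 0)
            free(cc->bucket[--cc->count]);
        pthread_mutex_unlock(&cc->lock);
    }
}
```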
The locks really should be fast, because each one should only ever
live in one cpu's cache.  This also makes the caches easier to drain.
I think the preemption and migration case is going to be rare enough
that it's ok to block another cpu for a little while when it happens.
As long as I pass a cpu number around, it shouldn't matter if I get
preempted.

> I also noticed that cache_drain() appears to be the only
> place where you iterate through the per-cpu mutexes.  All
> the other places appear to use the current cpu's mutex.
>
> I would recommend the following:
>
> * That you review your code with special attention to
>   the lack of stability of PCPU_GET(cpuid) when you
>   are not in a critical section.

Yes, thanks, I just noticed that bug.

> * That you consider getting rid of the per-cpu locks
>   and instead use critical_enter() and critical_exit()
>   to obtain a stable cpuid in order to allocate or
>   free from the current cpu's cache without having to
>   obtain any mutexes whatsoever.

I'm not sure how I could properly and efficiently handle draining the
per-cpu queues in this scheme.  One alternative to the current
strategy is to wrap the fast path in a critical section, and only when
you are out of per-cpu buckets, or your buckets are empty, drop the
critical section and take the zone mutex.  Then you could go about
allocating buckets and re-enter the critical section to see if you
still need them.  If cpu migration happens very often, you could
potentially get several cpus following the 'fill the bucket' path,
which would cause you to ask for much more memory than you really
need.

> Theoretically this would allow most calls to allocate
> and free small amounts of memory to run as fast as
> a simple procedure call would run (akin to what
> the kernel malloc() in -stable is able to accomplish).
>
> * That you consider an alternative method for draining
>   the per-cpu caches.
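(The "wrap the fast path in a critical section, refill outside it,
then re-enter and re-check" alternative can be sketched roughly as
below.  critical_enter()/critical_exit()/curcpu() are stubbed out for
user space, and the bucket layout, sizes, and zalloc/bucket_fill names
are made up for illustration -- none of this is UMA's real code.)

```c
#include <stdlib.h>

#define NCPU        2
#define BUCKET_SIZE 4

struct bucket {
    void *item[BUCKET_SIZE];
    int count;
};

static struct bucket buckets[NCPU];

/* Stubs: in the kernel these would pin the thread and report the real cpu. */
static int cur_cpu = 0;
static void critical_enter(void) { }
static void critical_exit(void) { }
static int  curcpu(void) { return (cur_cpu); }

/* Zone slow path; in the kernel this would run under the zone mutex. */
static void
bucket_fill(struct bucket *b)
{
    while (b->count < BUCKET_SIZE)
        b->item[b->count++] = malloc(64);
}

void *
zalloc(void)
{
    for (;;) {
        critical_enter();       /* cpuid is stable here, no mutex needed */
        struct bucket *b = &buckets[curcpu()];
        if (b->count > 0) {
            void *item = b->item[--b->count];
            critical_exit();
            return (item);
        }
        critical_exit();        /* we may migrate from here on */

        /*
         * Bucket empty: refill outside the critical section, then
         * loop and re-check.  If we migrated meanwhile, several cpus
         * can end up in this fill path at once -- the over-allocation
         * risk mentioned above.
         */
        bucket_fill(&buckets[curcpu()]);
    }
}
```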
> For example, by having the
> per-cpu code use a global, shared SX lock along
> with the critical section to access their per-cpu
> caches and then have the cache_drain code obtain
> an exclusive SX lock in order to have full access
> to all of the per-cpu caches.

A global SX lock would cause cache thrashing across all cpus.  As it
is, I'd like some way to align specific things to the cache line size
for maximum SMP performance.  I've been toying with the idea of adding
a new MD API to get cache parameters.  This would be necessary if I
want to keep the UMA_ALIGN_CACHE flag.

> * Documentation.  i.e. comment the code more, especially
>   areas where you have to special-case things like for
>   example when you unlock a cpu cache in order to
>   call uma_zfree_internal().

Yes, I agree.

>                    -Matt
>                    Matthew Dillon

Jeff

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-arch" in the body of the message