Date: Mon, 17 Feb 2003 21:54:38 -0800
From: David Schultz
To: Matthew Dillon
Cc: Bosko Milekic, Andrew Gallatin, freebsd-arch@FreeBSD.ORG
Subject: Re: mb_alloc cache balancer / garbage collector
Message-ID: <20030218055438.GA10838@HAL9000.homeunix.com>
In-Reply-To: <200302171742.h1HHgSOq097182@apollo.backplane.com>

Thus spake Matthew Dillon:
> Wouldn't it be easier and more scaleable to implement the hysteresis on
> the fly? It sounds like it ought to be simple... you have a sysctl
> to set the per-cpu free cache size and hysteresis (for example, 32[8],
> aka upon reaching 32 free 32 - 8 = 24 to the global cache, keeping 8).
> Overflow goes into a global pool. Active systems do not usually
> bounce from 0 to the maximum number of mbufs and back again, over
> and over again. Instead they tend to have smaller swings and 'drift'
> towards the edges, so per-cpu hysteresis should not have to exceed
> 10% of the total available buffer space in order to reap the maximum
> locality of reference and mutex benefit. Even in a very heavily loaded
> system I would expect something like 128[64] to be sufficient. This
> sort of hysteresis could be implemented trivially in the main mbuf
> freeing code without any need for a thread and would have the same
> performance / L1 cache characteristics. Additionally, on-the-fly
> hysteresis would be able to handle extreme situations that a thread
> could not (such as extreme swings), and on-the-fly hysteresis can
> scale in severe or extreme situations while a thread cannot.

FWIW, I believe Sun's slab allocator does essentially what you
describe, including the adjustment of per-CPU caches on the fly.
However, instead of having a sysctl for the size of the per-cpu
caches, they dynamically tune the sizes within a certain range
every 15 seconds by monitoring contention of the lock on the
global cache. Apparently this tends to stabilize very quickly.

Take a look at Jeff Bonwick's magazine allocator paper. The way
they keep down the overhead of managing per-CPU caches on the fly
is quite clever.

	http://www.usenix.org/events/usenix01/bonwick.html

BTW, this is *not* the original slab allocator paper; it covers
extensions to it that add, among other things, per-CPU caches.

To give you an idea of how big Solaris' per-CPU caches are, the
ranges are described in the following table from
_Solaris_Internals_. As I mentioned, they are occasionally
adjusted within these ranges.
Keep in mind that this is for a generic memory allocator, though,
and not an mbuf allocator.

Object Size Range    Min PCPU Cache Size    Max PCPU Cache Size
0-63                 15                     143
64-127               7                      95
128-255              3                      47
256-511              1                      31
512-1023             1                      15
1024-2047            1                      7
2048-16383           1                      3
16384-               1                      1
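
For concreteness, here is a rough sketch of the on-the-fly hysteresis
quoted above (the 32[8] example: on reaching 32 free buffers, move 24 to
the global pool and keep 8). All names here (struct buf, pcpu_cache,
buf_free, and so on) are made up for illustration; this is not the
mb_alloc code and not Sun's magazine layer, and it leaves out the locking
a real kernel would need around the global pool.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types; a real mbuf carries far more than a link. */
struct buf {
	struct buf *next;
};

struct buf_list {
	struct buf *head;
	unsigned count;
};

struct pcpu_cache {
	struct buf_list free;	/* per-CPU free list */
	unsigned limit;		/* high-water mark, e.g. 32 */
	unsigned keep;		/* buffers kept locally after a spill, e.g. 8 */
};

static void
list_push(struct buf_list *l, struct buf *b)
{
	b->next = l->head;
	l->head = b;
	l->count++;
}

static struct buf *
list_pop(struct buf_list *l)
{
	struct buf *b = l->head;

	if (b != NULL) {
		l->head = b->next;
		l->count--;
	}
	return (b);
}

void
pcpu_cache_init(struct pcpu_cache *c, unsigned limit, unsigned keep)
{
	assert(keep < limit);
	c->free.head = NULL;
	c->free.count = 0;
	c->limit = limit;
	c->keep = keep;
}

/*
 * Free path: push onto the per-CPU list. Once the list reaches the
 * limit, spill (limit - keep) buffers to the global pool in one batch,
 * so the global pool is touched once per batch, not once per free.
 */
void
buf_free(struct pcpu_cache *c, struct buf_list *global, struct buf *b)
{
	list_push(&c->free, b);
	if (c->free.count >= c->limit) {
		/* A real implementation would hold the global lock here. */
		while (c->free.count > c->keep)
			list_push(global, list_pop(&c->free));
	}
}
```

With limit = 32 and keep = 8, the first 31 frees stay on the per-CPU
list, and the 32nd moves 24 buffers to the global pool in a single pass.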