Date: Mon, 17 Feb 2003 16:00:37 -0800 (PST)
From: Matthew Dillon <dillon@apollo.backplane.com>
To: Bosko Milekic
Cc: Andrew Gallatin, freebsd-arch@FreeBSD.ORG
Subject: Re: mb_alloc cache balancer / garbage collector

: What the daemon does is replenish the per-CPU caches (if necessary) in
: one shot without imposing the overhead on the allocation path.  That
: is, it'll move a bunch of buckets over to the per-CPU caches if they
: are under-populated; doing that from the main allocation path is
: theoretically possible but tends to produce high spiking in latency.
: So the daemon is basically a compromise between doing it in the
: allocation/free path on-the-fly and doing it from a parallel thread.
: Additionally, the daemon will empty part of the global cache
: ...

    Hmm.  Well, you can also replenish the per-CPU caches in bulk on
    the fly.  You simply pull in more than one buffer at a time, and you
    reap the same overhead benefits in the allocation path.  If you
    depend on a thread to do this, you can create a situation where a
    chronic buffer shortage occurs in the per-cpu cache whenever the
    thread doesn't get cpu quickly enough, resulting in non-optimal
    operation.  In other words, while it may seem you are saving latency
    in the critical path (the network trying to allocate a buffer), I
    think you might actually be creating a situation where instead of
    latency you wind up with a critical shortage.

    I don't think VM interaction is that big a deal.  The VM system has
    a notion of a 'shortage' and a 'severe shortage'.  When you are
    allocating mbufs from the global VM system into the per-cpu cache,
    you simply allocate into the cache up to its target level, or until
    the VM system gets low (but not severely low) on memory.  The
    hysteresis does not have to be much to reap the benefits and
    mitigate the overhead of the global mutex(es)... just 5 or 10 mbufs
    would mitigate global mutex overhead to the point where it becomes
    irrelevant.
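    Just to illustrate what I mean, here is a rough sketch in C.  The
    names (mb_pcpu_cache, mb_global_mtx, mb_global_alloc(),
    mb_pcpu_push()/mb_pcpu_pop(), vm_shortage()) are hypothetical
    stand-ins for illustration, not the actual mb_alloc or VM
    interfaces:

	/*
	 * Sketch only: refill the per-cpu cache in bulk, synchronously,
	 * on the allocation slow path itself.  All names illustrative.
	 */
	#define MB_REFILL_BATCH	10	/* 5-10 amortizes the global mutex */

	static struct mbuf *
	mb_alloc_slow(struct mb_pcpu_cache *pcc)
	{
		struct mbuf *m;
		int n;

		mtx_lock(&mb_global_mtx);
		/*
		 * Take the global mutex once per batch instead of once
		 * per allocation.  Back off early if the VM system
		 * reports a shortage (in FreeBSD terms, a check along
		 * the lines of vm_page_count_severe()), leaving the
		 * pageout daemon room to do its job.
		 */
		for (n = 0; n < MB_REFILL_BATCH && !vm_shortage(); n++) {
			if ((m = mb_global_alloc()) == NULL)
				break;
			mb_pcpu_push(pcc, m);
		}
		mtx_unlock(&mb_global_mtx);

		return (mb_pcpu_pop(pcc));   /* may be NULL under shortage */
	}

    The refill happens synchronously in the allocation path, so there is
    no window where the per-cpu cache can run dry while waiting for a
    balancer thread to be scheduled.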
    By creating a thread you are introducing more moving parts, and
    like a physical system these moving parts are going to interact
    with each other.  Remember, the VM system is *already* trying to
    ensure that enough free pages exist in the system.  If you have a
    second thread eating memory in large globs, it is far more likely
    that you will destabilize the pageout daemon and create an
    oscillation between the two threads (the pageout daemon and your
    balancer).  This might not turn up in benchmarks (which tend to
    focus on just one subsystem), but it could lead to some pretty
    nasty degenerate cases under heavy general loads.  I think it is
    far better to let the VM system do its job and pull the mbufs in
    on-the-fly in smaller chunks which are less likely to destabilize
    the pageout daemon.  This can be exacerbated, made even worse, if
    your balancing thread is given a high priority.  So you have the
    potential to starve the mbuf system if the balancing thread has too
    LOW a priority, and the potential to destabilize the VM system if
    it has too HIGH a priority.

    Also, it seems to me that VM overheads are better addressed in the
    UMA subsystem, not in a leaf allocation subsystem.

						-Matt