From: Robert Watson
Date: Thu, 6 Jan 2005 13:57:14 +0000 (GMT)
To: Jesper Louis Andersen
Cc: freebsd-performance@freebsd.org, Bosko Milekic, Igor Shmukler, Hubert Feyrer
Subject: Re: Benchmark: NetBSD 2.0 beats FreeBSD 5.3 in server performance
In-Reply-To: <20050106114154.GB30825@miracle.mongers.org>

On Thu, 6 Jan 2005, Jesper Louis Andersen wrote:

> But this is speculation. I would like to see performance benchmarks for
> your scenario as well.

Since the performance optimization on FreeBSD for the last few years, with features like SMPng, libpthread, and UMA, has been focussed on macro performance rather than micro performance, it's not surprising that the micro performance requires tuning.
However, there is a lot of ongoing work on this front, so I'd expect to see continued improvement within the immediate (5.4) lifetime. There are a number of optimizations in 6.x that are on the merge path for 5.x and that will directly impact the results in these measurements -- in particular, a fix for what is clearly a bug in the way mutexes are released on UP kernels, which adds almost a hundred cycles to every mutex release operation. The bug was identified in my micro-benchmarking shortly after the release of 5.3, and may play a substantial part in the posted results, especially for the very micro benchmarks that involve kernel memory allocation. Hopefully for 5.4 we'll also have a move to soft critical sections protecting UMA memory caches, which should eliminate mutex operations from common-case memory allocation and provide a further benefit. Those changes are not yet in CVS HEAD, but are in Perforce and show nice benefits in micro-benchmarks involving kernel object allocation, such as socket allocation.

> I disagree that the original post is entirely FUD. While the conclusion
> is subjective, fact is that at the particular mix of microbenchmarks
> shows NetBSD faster than FreeBSD. I am wondering if that is the price
> you pay on single-cpu boxes to gain speed at the SMP boxes. And if this
> is true the question becomes if fine-grained locking is worth the
> implementation time when most computers are still single-cpu (Yes, I
> know this can change rapidly with the newer CPU types).

I think the post Bosko is referring to is indeed almost entirely FUD, since it consists of one line of useful content ("Look, a decent-looking set of microbenchmark results") and the rest is ("I hate these specific FreeBSD developers, why do we let committers design the system!"). I.e., the posted report was simply an excuse for someone to chew out the FreeBSD developers.
That doesn't invalidate the report, but it does suggest that the interpretation presented in that post might be a little unbalanced. Obviously, we need to read and understand the report, and act on areas where we can improve.

Regarding SMP or not -- the path the FreeBSD Project has taken (and this choice was made before I was really all that involved, to be honest) was a re-architecture of the kernel to improve performance, scalability, and structure via a movement to a parallelizable, preemptible, threaded kernel. I think this is the right architecture to move to, as it not only improves performance and scalability, but also closes a lot of existing race conditions in the kernel that only became more exposed as threading and SMP became more predominant. This has had a lot of performance benefits, but comes with initial costs that aren't all immediately offset by initial benefits. Now that this model is largely adopted, we'll see a nice increase in benefits over time -- i.e., it was an investment. Obviously, people can and will disagree about the nature of the investment, the time for payoff, etc., but I think it's easy to argue there have been some very strong benefits so far. Likewise, it's possible to argue that the exact path by which it was implemented could have been better -- e.g., had the dotcom crash not happened at the wrong moment, leaving us with far fewer developers working on it than hoped. However, we've accomplished a lot, especially given the available resources.

Now that the initial cut is working, immediate and measurable performance gains from the new architecture are the primary thrust of the netperf work, so we should see some large gains in that area, with an immediate effect on both micro-benchmarks and macro-benchmarks.

More on SMP generally -- as other posts argue, I think you'll see SMP become more and more a part of "out of the box" systems over the next couple of years, especially for server class hardware.
It's quite hard to buy decent server equipment from Intel that doesn't have at least HTT today. It is very important that we re-optimize UP having adopted the SMPng approach, and there's a substantial amount of work going into that, but SMP is an increasing reality that the FreeBSD Project has been addressing head-on for several years. It required a lot of work over that time, but as a result of that investment, we will be ready for the next generation of systems where SMP is no longer optional.

So I'd look to immediate performance improvements in 5.x-STABLE for UP and SMP over the next few months leading up to 5.4. We should all bug Stephan Uphoff and John Baldwin to get the critical section stuff in the tree so I can merge in my UMA changes, and make sure that the UP mutex optimizations are also in RELENG_5 :-).

Robert N M Watson