From: Robert Watson <rwatson@FreeBSD.org>
Date: Wed, 25 Jul 2007 10:25:02 +0100 (BST)
To: Jeremie Le Hen
Cc: cvs-src@FreeBSD.org, src-committers@FreeBSD.org, Julian Elischer, cvs-all@FreeBSD.org
Subject: Re: cvs commit: src/sys/kern kern_descrip.c

On Tue, 24 Jul 2007, Jeremie Le Hen wrote:

> On Mon, Jul 23, 2007 at 05:38:35PM -0700, Julian Elischer wrote:
>> I think that is the wrong question.  The question is "why do we drop
>> off after 8?", which I'm sure Jeff is already working on.
>
> Actually, since the benchmark was run on an 8-core amd64 box and both
> Linux and FreeBSD drop off there, I thought it was natural to get the
> best performance with 8 threads...  Am I wrong?

There are probably a few things going on here, especially for the write
path:

(1) The benchmark client and server run on the same box, so if the
    benchmark's concurrency is set to 8, those 8 CPUs are split over,
    presumably, 8 client threads/processes and 8 server threads, each
    serving one client.

(2) For reliable storage without NVRAM backing, synchronous writes are
    required to ensure transactions are committed to disk.  That leaves
    significant scheduling gaps that can be used for other work (a sketch
    of the pattern follows below).

(3) For read workloads whose data working set exceeds the size of
    physical memory, synchronous reads also become an issue.  This
    applies to most workloads -- I routinely use 2 x ncpus as the
    argument to -j for buildworld/buildkernel, as I know that a lot of
    time will be spent waiting on disk I/O on my boxes (second sketch
    below).

Another important point: as algorithms and data structures change, not to
mention the hardware configuration, the "sweet spot" may (will) also
move.  For example, when comparing two scheduler configurations, it's
important to compare performance across a range of concurrency settings,
as the optimal points may differ, as may the "tail" as the configuration
becomes saturated.  And it's precisely that tail where we're analyzing
the drop-off in this benchmark.
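To make (2) concrete, the durable-write pattern is just write(2) followed
by fsync(2) (or an O_SYNC open).  A minimal sketch, not the benchmark's
actual code; the file name and record contents are made up:

/*
 * Synchronous-write sketch for point (2).  Until fsync(2) returns, the
 * "transaction" is not durable, and the calling thread sleeps on disk
 * I/O -- the scheduling gap mentioned above, during which other threads
 * can run.
 */
#include <err.h>
#include <fcntl.h>
#include <unistd.h>

int
main(void)
{
        const char record[] = "transaction payload\n";
        int fd;

        fd = open("journal.log", O_WRONLY | O_CREAT | O_APPEND, 0644);
        if (fd == -1)
                err(1, "open");
        if (write(fd, record, sizeof(record) - 1) == -1)
                err(1, "write");
        if (fsync(fd) == -1)            /* block until data is on disk */
                err(1, "fsync");
        close(fd);
        return (0);
}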
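To illustrate the 2 x ncpus rule of thumb in (3): on FreeBSD the CPU
count comes from the hw.ncpu sysctl, so in practice this is just

        make -j $((2 * $(sysctl -n hw.ncpu))) buildworld

in sh.  The same lookup in C, via sysctlbyname(3), purely as a sketch:

/*
 * Print a make invocation that oversubscribes the CPUs 2:1 so that
 * runnable jobs can hide disk I/O waits, per point (3).
 */
#include <sys/types.h>
#include <sys/sysctl.h>

#include <err.h>
#include <stdio.h>

int
main(void)
{
        int ncpu;
        size_t len = sizeof(ncpu);

        if (sysctlbyname("hw.ncpu", &ncpu, &len, NULL, 0) == -1)
                err(1, "sysctlbyname(hw.ncpu)");
        printf("make -j %d buildworld\n", 2 * ncpu);
        return (0);
}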
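Finally, a sketch of the sweep methodology from the last paragraph: drive
the benchmark at every concurrency level from 1 up to well past ncpus, so
that the whole curve, including the saturated tail, can be compared
between scheduler configurations.  "./bench -c" is a hypothetical
stand-in for whatever benchmark is actually being run:

/*
 * Run a (hypothetical) benchmark across a range of concurrency levels
 * rather than only at ncpus; the interesting differences between
 * schedulers often show up in the post-saturation tail.
 */
#include <stdio.h>
#include <stdlib.h>

int
main(void)
{
        char cmd[64];
        int c;

        for (c = 1; c <= 16; c++) {     /* e.g. 1..2*ncpus on 8 cores */
                snprintf(cmd, sizeof(cmd), "./bench -c %d", c);
                if (system(cmd) != 0)
                        fprintf(stderr, "concurrency %d failed\n", c);
        }
        return (0);
}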
There's an argument to be made that we should also explore the impact of
varying the number of available physical CPUs, not just the level of
concurrency configured in the application: from a hardware purchasing
perspective, administrators presumably care not only about how to make
the best use of the CPUs they already have, but also about whether adding
CPUs will improve their performance.

Robert N M Watson
Computer Laboratory
University of Cambridge