Date: Wed, 25 Jul 2007 10:25:02 +0100 (BST)
From: Robert Watson <rwatson@FreeBSD.org>
To: Jeremie Le Hen <jeremie@le-hen.org>
Cc: cvs-src@FreeBSD.org, src-committers@FreeBSD.org, Julian Elischer <julian@elischer.org>, cvs-all@FreeBSD.org
Subject: Re: cvs commit: src/sys/kern kern_descrip.c
Message-ID: <20070725101907.N83919@fledge.watson.org>
In-Reply-To: <20070724153608.GH96643@obiwan.tataz.chchile.org>
References: <200707032126.l63LQ7ea027929@repoman.freebsd.org> <20070703142714.F552@10.0.0.1> <20070723201642.GC96643@obiwan.tataz.chchile.org> <46A54A0B.4000109@elischer.org> <20070724153608.GH96643@obiwan.tataz.chchile.org>

On Tue, 24 Jul 2007, Jeremie Le Hen wrote:

> On Mon, Jul 23, 2007 at 05:38:35PM -0700, Julian Elischer wrote:
>> I think that is the wrong question..
>> the question is "why do we drop off after 8?"
>> which I'm sure Jeff is already working on.
>
> Actually since the workbench has been run on an 8-core amd64 and that both
> Linux and FreeBSD drop here, I thought it was natural to get the best
> performance with 8 threads... Am I wrong?

There's probably a few things going on here, especially for the write path:

(1) The benchmark client and server are running on the same box, so those 8
    cpus are split over, presumably, 8 client threads/processes and 8 server
    threads serving one each if concurrency is set to 8 on the benchmark.

(2) For reliable storage without nvram backing, synchronous writes are
    required in order to ensure transactions are committed to disk. This
    means significant scheduling gaps that can be used for other things.

(3) For read workloads with data working sets greater than the size of
    physical memory, synchronous reads also become an issue.

This applies to most workloads -- I routinely use 2xncpus as the argument to
-j for buildworld/kernel, as I know that a lot of time will be spent waiting
on disk I/O on my boxes.

Another important point: as algorithms and data structures change, not to
mention hardware configuration, the "sweet spot" may (will) also change. For
example, when comparing two scheduler configurations, it's important to
compare performance across a range of concurrency settings, as the optimal
points may differ, as may the "tail" as the configuration becomes saturated.
And it's precisely that tail where we're analyzing the drop-off in this
benchmark.
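
To make the comparison above concrete, here is a minimal Python sketch, not taken from the benchmark itself. It shows the 2xncpus rule of thumb for -j and one way to compare two configurations across a range of concurrency settings by locating each one's peak and the relative drop-off in its tail. The function names and the throughput numbers are hypothetical placeholders chosen for illustration; they are not measurements from this thread.

```python
import os


def suggested_jobs(ncpus=None, factor=2):
    """Rule of thumb from the message: use factor x ncpus parallel jobs,
    on the assumption that many jobs spend time blocked on disk I/O."""
    if ncpus is None:
        ncpus = os.cpu_count() or 1
    return factor * ncpus


def peak(results):
    """Return (concurrency, throughput) for the best-performing setting.
    `results` maps a concurrency level to a measured throughput."""
    return max(results.items(), key=lambda kv: kv[1])


def tail_dropoff(results):
    """Fraction of peak throughput lost at the highest concurrency tested."""
    _, best = peak(results)
    last = results[max(results)]
    return (best - last) / best


# Placeholder numbers only (assumed for illustration, NOT real measurements).
config_a = {1: 100, 2: 190, 4: 360, 8: 600, 16: 540, 32: 300}
config_b = {1: 90, 2: 180, 4: 350, 8: 580, 16: 610, 32: 560}

if __name__ == "__main__":
    print("make -j argument for this box:", suggested_jobs())
    for name, cfg in (("A", config_a), ("B", config_b)):
        level, tput = peak(cfg)
        print(f"config {name}: peak {tput} at concurrency {level}, "
              f"tail drop-off {tail_dropoff(cfg):.1%}")
```

With these placeholder inputs, configuration A peaks at a concurrency of 8 and configuration B at 16, which is the kind of difference a comparison at a single concurrency setting would miss.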

There's some argument to be made that we should also be exploring the impact
of varying the number of available physical CPUs, not just the level of
concurrency configured in the application, as what administrators presumably
also care about from a hardware purchase perspective is whether or not adding
additional CPUs will improve their performance, not just how to best use the
number of CPUs they currently have.

Robert N M Watson
Computer Laboratory
University of Cambridge