Date:      Wed, 25 Jul 2007 10:25:02 +0100 (BST)
From:      Robert Watson <rwatson@FreeBSD.org>
To:        Jeremie Le Hen <jeremie@le-hen.org>
Cc:        cvs-src@FreeBSD.org, src-committers@FreeBSD.org, Julian Elischer <julian@elischer.org>, cvs-all@FreeBSD.org
Subject:   Re: cvs commit: src/sys/kern kern_descrip.c
Message-ID:  <20070725101907.N83919@fledge.watson.org>
In-Reply-To: <20070724153608.GH96643@obiwan.tataz.chchile.org>
References:  <200707032126.l63LQ7ea027929@repoman.freebsd.org> <20070703142714.F552@10.0.0.1> <20070723201642.GC96643@obiwan.tataz.chchile.org> <46A54A0B.4000109@elischer.org> <20070724153608.GH96643@obiwan.tataz.chchile.org>


On Tue, 24 Jul 2007, Jeremie Le Hen wrote:

> On Mon, Jul 23, 2007 at 05:38:35PM -0700, Julian Elischer wrote:
>>  I think that is the wrong question..
>>  the question is "why do we drop off after 8?"
>>  which I'm sure Jeff is already working on.
>
> Actually, since the benchmark has been run on an 8-core amd64 and both 
> Linux and FreeBSD drop off there, I thought it was natural to get the best 
> performance with 8 threads...  Am I wrong?

There are probably a few things going on here, especially for the write path:

(1) The benchmark client and server are running on the same box, so those 8
     CPUs are split across, presumably, 8 client threads/processes and 8
     server threads (one per client) when the benchmark's concurrency is set
     to 8.

(2) For reliable storage without NVRAM backing, synchronous writes are
     required in order to ensure transactions are committed to disk.  This
     introduces significant scheduling gaps that can be used for other work
     (see the sketch just after this list).

(3) For read workloads with data working sets greater than the size of
     physical memory, synchronous reads also become an issue.
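
To make point (2) concrete, here is a minimal sketch (Python, writing to a 
hypothetical /tmp scratch file) of the cost of forcing each record to disk 
with fsync() versus letting the buffer cache absorb the writes; the time 
spent blocked in fsync() is exactly the scheduling gap described above:

    import os, time

    def timed_writes(path, count, sync):
        """Write `count` small records; if `sync`, fsync() after each one so
        the record is on disk before the next begins (the durable case)."""
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
        start = time.time()
        for _ in range(count):
            os.write(fd, b"x" * 512)
            if sync:
                os.fsync(fd)  # block until the disk acknowledges the write;
                              # the CPU is free to run other threads meanwhile
        elapsed = time.time() - start
        os.close(fd)
        return elapsed

    print("buffered:", timed_writes("/tmp/bench.dat", 1000, sync=False))
    print("fsync'd: ", timed_writes("/tmp/bench.dat", 1000, sync=True))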

This applies to most workloads -- I routinely use 2xncpus as the argument to 
-j for buildworld/kernel, as I know that a lot of time will be spent waiting 
on disk I/O on my boxes.
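
For example, a small sketch of that rule of thumb (Python; it assumes a make 
that understands -j and a configured source tree):

    import os
    import subprocess

    # 2 x ncpus, per the rule of thumb above: over-subscribe so that jobs
    # blocked on disk I/O still leave runnable work for every CPU.
    jobs = 2 * (os.cpu_count() or 1)
    subprocess.run(["make", f"-j{jobs}", "buildworld"], check=True)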

Another important point: as algorithms and data structures change, not to 
mention hardware configuration, the "sweet spot" may (will) also change.  For 
example, when comparing two scheduler configurations, it's important to 
compare performance across a range of concurrency settings: the optimal 
points may differ, and so may the "tail" as the configuration becomes 
saturated.  It is precisely that tail where the drop-off in this benchmark 
appears.
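
As a concrete illustration of such a sweep, a minimal sketch (Python; 
run_benchmark is a hypothetical stand-in for whatever workload is being 
measured, not part of any real harness):

    import concurrent.futures
    import time

    def run_benchmark(concurrency):
        """Hypothetical placeholder: run `concurrency * 100` tiny operations
        on `concurrency` worker threads and return the operation count."""
        def op(_):
            time.sleep(0.001)  # stand-in for one client request
            return 1
        pool = concurrent.futures.ThreadPoolExecutor(max_workers=concurrency)
        with pool as ex:
            return sum(ex.map(op, range(concurrency * 100)))

    # Sweep a range of concurrency settings rather than measuring a single
    # point, so both the sweet spot and the saturated tail become visible.
    for n in (1, 2, 4, 8, 16, 32, 64):
        start = time.time()
        ops = run_benchmark(n)
        rate = ops / (time.time() - start)
        print("concurrency=%3d  ops/sec=%.0f" % (n, rate))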

There's some argument to be made that we should also explore the impact of 
varying the number of available physical CPUs, not just the level of 
concurrency configured in the application: from a hardware purchase 
perspective, what administrators presumably care about is whether adding 
CPUs will improve their performance, not just how best to use the CPUs they 
currently have.
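
One hedged way to approximate that experiment on a live system is to pin the 
benchmark to progressively larger CPU subsets and repeat the sweep.  The 
sketch below uses os.sched_setaffinity(), which is Linux-only (FreeBSD has 
its own affinity interfaces), and ./benchmark is a hypothetical binary; it is 
also only an approximation, since caches, memory bandwidth, and interrupts 
are still shared with the masked-off CPUs:

    import os
    import subprocess

    # Approximate "fewer physical CPUs" by restricting CPU affinity before
    # launching the benchmark; child processes inherit the affinity mask.
    for ncpu in (1, 2, 4, 8):
        os.sched_setaffinity(0, set(range(ncpu)))  # Linux-only API
        subprocess.run(["./benchmark", "-c", str(ncpu)], check=True)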

Robert N M Watson
Computer Laboratory
University of Cambridge


