Date:      Mon, 12 Jun 2006 16:00:30 +0100 (BST)
From:      Robert Watson <rwatson@FreeBSD.org>
To:        Danial Thom <danial_thom@yahoo.com>
Cc:        freebsd-performance@freebsd.org
Subject:   Re: Initial 6.1 questions
Message-ID:  <20060612155149.S24745@fledge.watson.org>
In-Reply-To: <20060612142104.188.qmail@web33301.mail.mud.yahoo.com>
References:  <20060612142104.188.qmail@web33301.mail.mud.yahoo.com>

On Mon, 12 Jun 2006, Danial Thom wrote:

> first, why is the default for HZ now 1000? It seems that 900 extra clock 
> interrupts aren't a performance enhancement.

This is a design change that is in the process of being reconsidered.  I 
expect that HZ will not be 1000 in 7.x, but can't tell you whether it will go 
back to 100, or some middle ground.  There are a number of benefits to a 
higher HZ, not least of which is more accurate timing of some network timer events. 
Since I don't have my hands in the timer code, I can't speak to what the 
decision process here is, or when any change might happen, but I do expect to 
see some change.
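
To make the trade-off concrete, here is a rough sketch of the arithmetic. It assumes a simplified model in which a timeout can only fire on a clock tick boundary; the function names and the 25 ms example timeout are illustrative and are not taken from the FreeBSD timer code.

```python
import math

def tick_ms(hz):
    """Length of one clock tick in milliseconds for a given HZ."""
    return 1000.0 / hz

def rounded_timeout_ms(requested_ms, hz):
    """Assumed model: a timeout fires on the first tick at or after the request."""
    ticks = math.ceil(requested_ms * hz / 1000.0)
    return ticks * tick_ms(hz)

# Extra clock interrupts per second when moving from HZ=100 to HZ=1000.
extra_interrupts = 1000 - 100

for hz in (100, 1000):
    # A hypothetical 25 ms timer: coarse ticks round it up further.
    print(hz, tick_ms(hz), rounded_timeout_ms(25, hz))
```

Under this model a 25 ms timer fires at 30 ms with HZ=100 but at 25 ms with HZ=1000, at the cost of 900 additional clock interrupts per second.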

> Running a simple test with a traffic generator (firing udp packets to a 
> blackhole), the system overhead with a single processor goes up from 10% to 
> 15% when running a kernel with SMP enabled (and nothing else different). I 
> have ITR set to 6000 interrupts per second. That seems like an awful lot of 
> overhead. Is there some problem running an SMP-enabled kernel when only 1 
> processor is present, or is there really 50% extra overhead on an SMP 
> scheduler? I'll have a dual core in a few days to test with.

I don't know about the particular number, but there is a significant overhead 
to building in SMP support currently -- in particular, you pick up a lot of 
atomic instructions, which increase the cost of locking operations even 
without contention.  Some of that overhead reduces as the workload goes up, as 
there's coalescing of work under locked regions, reduced context switch rates 
as work is performed in batches, etc.  There is currently extremely active 
work in the area of reducing the overhead of scheduling and context switching, 
being driven in part by the 32-processor support in Sun4v.  I don't expect to 
see large portions of that merged to RELENG_6, but it will be in RELENG_7. 
Again, not my area of expertise, but there is work going on in this area.
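
For reference, the "50% extra overhead" in the question is the relative change between the two quoted CPU figures, not the absolute difference. A small sketch of that calculation, with a helper name chosen purely for illustration:

```python
def relative_increase_pct(before_pct, after_pct):
    """Relative growth of a percentage figure, expressed in percent."""
    return (after_pct - before_pct) / before_pct * 100.0

# Figures quoted in the question: 10% system overhead on a UP kernel,
# 15% with SMP enabled on the same single processor.
absolute_points = 15 - 10
relative = relative_increase_pct(10, 15)
print(absolute_points, relative)
```

So the SMP kernel costs 5 percentage points of CPU in that test, which is a 50% increase relative to the uniprocessor baseline.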

Finally, there is a known performance problem involving loopback network 
traffic and preemption, which results in additional context switches.  You may 
want to try disabling preemption and see if/how that impacts your numbers. 
There has been quite a bit of discussion of this problem, and I expect to 
see a solution for it in the near future.  This problem does not manifest for 
remote traffic, only loopback traffic.

Robert N M Watson
Computer Laboratory
University of Cambridge


