Date:      Thu, 29 Mar 2007 15:26:07 +0400
From:      Yar Tikhiy <yar@comp.chem.msu.su>
To:        Robert Watson <rwatson@freebsd.org>
Cc:        freebsd-hackers@freebsd.org, Duane Whitty <duane@dwlabs.ca>
Subject:   Re: Locking etc. (Long, boring, redundant, newbie questions)
Message-ID:  <20070329112606.GA72497@comp.chem.msu.su>
In-Reply-To: <20070328102027.T1185@fledge.watson.org>
References:  <20070328082620.GA1052@dwpc.dwlabs.ca> <20070328102027.T1185@fledge.watson.org>

On Wed, Mar 28, 2007 at 10:40:58AM +0100, Robert Watson wrote:
> 
> Spin locks are, FYI, slower than default mutexes.  The reason is that they 
> have to do more work: they not only perform an atomic operation/memory 
> barrier to set the cross-CPU lock state, but they also have to disable 
> interrupts to synchronize with fast interrupt handlers.  In general, you 
> are right: you should only use a spin mutex if you are running in a fast 
> handler, or synchronizing with a fast handler.  The one general exception 
> is the scheduler itself, which must protect its data structures with spin 
> locks in order to implement sleeping primitives.  As such, the scheduler 
> lock and various other low level locks (such as in turnstiles) are 
> implemented with spin locks and not default mutexes.  Since default mutexes 
> spin adaptively, the reduced overhead of contention experienced with spin 
> locks (i.e., no scheduler overhead for simple contention cases) is also 
> experienced with default mutexes.

By the way, could the following observation of mine be related to
the high cost of spin mutexes?

While doing some measurements of the performance of our vlan driver
in a router, I found that with RELENG_6 the pps rate through the
router degraded considerably (by ~100Kpps) if its physical interfaces
used hardware vlan tagging.  I attributed that to the overhead of
allocating and freeing an mbuf tag for each packet as it entered and
then left the router.  I used hwpmc and pmcstat to see which kernel
functions took the most time and found that critical_exit() made the
top 5 in the list of CPU time eaters when the network interfaces were
using hardware vlan tagging.

The machine was a UP amd64, but it ran FreeBSD/i386 with a UP
kernel.

As far as I can see from the code, critical_exit() may grab and later
release a spin mutex.  I've got a hazy recollection that our kernel
memory allocator uses critical sections to protect its per-CPU
structures.  That's why I suspect that the effect I observed may
have its roots in the behaviour of spin mutexes.  Could it be so?

Thanks!

-- 
Yar


