Date:      Fri, 30 Mar 2007 02:20:05 +0200
From:      "Attilio Rao" <attilio@freebsd.org>
To:        "Yar Tikhiy" <yar@comp.chem.msu.su>
Cc:        freebsd-hackers@freebsd.org, Robert Watson <rwatson@freebsd.org>, Duane Whitty <duane@dwlabs.ca>
Subject:   Re: Locking etc. (Long, boring, redundant, newbie questions)
Message-ID:  <3bbf2fe10703291720yd82045ei66ead6e4f251a20@mail.gmail.com>
In-Reply-To: <20070329112606.GA72497@comp.chem.msu.su>
References:  <20070328082620.GA1052@dwpc.dwlabs.ca> <20070328102027.T1185@fledge.watson.org> <20070329112606.GA72497@comp.chem.msu.su>

2007/3/29, Yar Tikhiy <yar@comp.chem.msu.su>:
> On Wed, Mar 28, 2007 at 10:40:58AM +0100, Robert Watson wrote:
> >
> > Spin locks are, FYI, slower than default mutexes.  The reason is that they
> > have to do more work: they not only perform an atomic operation/memory
> > barrier to set the cross-CPU lock state, but they also have to disable
> > interrupts to synchronize with fast interrupt handlers.  In general, you
> > are right: you should only use a spin mutex if you are running in a fast
> > handler, or synchronizing with a fast handler.  The one general exception
> > is the scheduler itself, which must protect its data structures with spin
> > locks in order to implement sleeping primitives.  As such, the scheduler
> > lock and various other low level locks (such as in turnstiles) are
> > implemented with spin locks and not default mutexes.  Since default mutexes
> > spin adaptively, the reduced overhead of contention experienced with spin
> > locks (i.e., no scheduler overhead for simple contention cases) is also
> > experienced with default mutexes.
>
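To make the distinction concrete: the flavour of a mutex is chosen at
mtx_init() time.  A minimal sketch, with hypothetical lock names (the
mtx_init()/mtx_lock()/mtx_lock_spin() calls are the real KPI):

#include <sys/param.h>
#include <sys/lock.h>
#include <sys/mutex.h>

/* Hypothetical locks, for illustration only. */
static struct mtx foo_mtx;	/* default (adaptive) mutex */
static struct mtx bar_mtx;	/* spin mutex */

static void
example_init(void)
{
	/* MTX_DEF: a contending thread may be put to sleep, but it
	 * first spins adaptively while the owner runs on another CPU. */
	mtx_init(&foo_mtx, "foo", NULL, MTX_DEF);

	/* MTX_SPIN: disables interrupts while held, so it can
	 * synchronize with fast interrupt handlers. */
	mtx_init(&bar_mtx, "bar", NULL, MTX_SPIN);
}

static void
example_use(void)
{
	mtx_lock(&foo_mtx);		/* default mutex API */
	/* ... */
	mtx_unlock(&foo_mtx);

	mtx_lock_spin(&bar_mtx);	/* spin mutex API */
	/* ... */
	mtx_unlock_spin(&bar_mtx);
}
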
> By the way, could the following observation of mine be related to
> the high cost of spin mutexes?
>
> While doing some measurements of the performance of our vlan driver
> in a router, I found that with RELENG_6 the pps rate through the
> router degraded considerably (by ~100Kpps) if its physical interfaces
> used hardware vlan tagging.  I attributed that to the overhead of
> allocating and freeing an mbuf tag for each packet as it entered and
> then left the router.  I used hwpmc and pmcstat to see which kernel
> functions took the most time and found that critical_exit() made the top 5
> in the list of CPU time eaters if the network interfaces were using
> hardware vlan tagging.
>
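The per-packet overhead mentioned above comes from the m_tag KPI: with
hardware vlan tagging, the driver attaches a tag to every received
packet.  A simplified sketch of that pattern (not the exact RELENG_6
if_vlan code; the function name is made up):

#include <sys/param.h>
#include <sys/systm.h>
#include <sys/errno.h>
#include <sys/mbuf.h>
#include <net/if_vlan_var.h>	/* MTAG_VLAN, MTAG_VLAN_TAG */

/*
 * One malloc-backed allocation per received packet; the tag is freed
 * again when it is consumed on the output side or when the mbuf
 * itself is freed.
 */
static int
vlan_input_tag_sketch(struct mbuf *m, u_int vtag)
{
	struct m_tag *mtag;

	mtag = m_tag_alloc(MTAG_VLAN, MTAG_VLAN_TAG, sizeof(u_int),
	    M_NOWAIT);
	if (mtag == NULL)
		return (ENOMEM);
	*(u_int *)(mtag + 1) = vtag;	/* payload follows the header */
	m_tag_prepend(m, mtag);
	return (0);
}
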
> The machine was a UP amd64, but it ran FreeBSD/i386, with a UP
> kernel.
>
> As far as I can see from the code, critical_exit() may grab and later
> release a spin mutex.  I've got a hazy recollection that our kernel
> memory allocator uses critical sections to protect its per-CPU
> structures.  That's why I suspect that the effect I observed may
> have its roots in the behaviour of spin mutexes.  Could it be so?
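
The pattern being recalled here is the UMA per-CPU cache fast path,
which pins itself to its CPU with a critical section.  A simplified
sketch (structure names are made up; the real code is in
sys/vm/uma_core.c):

#include <sys/param.h>
#include <sys/systm.h>
#include <sys/pcpu.h>		/* curcpu */

/* Made-up stand-ins for the real UMA structures. */
struct cache_sketch {
	void	*c_item;
};
struct zone_sketch {
	struct cache_sketch	z_cpu[MAXCPU];
};

static void *
zone_fastpath_alloc(struct zone_sketch *zone)
{
	struct cache_sketch *cache;
	void *item;

	critical_enter();	/* no preemption from here on */
	cache = &zone->z_cpu[curcpu];
	item = cache->c_item;	/* touch the per-CPU cache safely */
	cache->c_item = NULL;
	critical_exit();	/* a deferred preemption may fire here */
	return (item);
}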

This is not entirely true.
What happens is that if you enable preemption in your kernel,
critical_exit() takes sched_lock only because it needs to perform a
mi_switch(), so that another thread can be scheduled to run (and,
please note, sched_lock will be dropped by the internal functions once
the context switch has completed).  Otherwise you don't pay the
penalty of the sched_lock acquisition.
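
For reference, critical_exit() on RELENG_6 looks roughly like this
(paraphrased and trimmed from sys/kern/kern_switch.c; assertions and
tracing are elided):

#include <sys/param.h>
#include <sys/lock.h>
#include <sys/mutex.h>
#include <sys/proc.h>

void
critical_exit(void)
{
	struct thread *td = curthread;

	if (td->td_critnest == 1) {
		td->td_critnest = 0;
#ifdef PREEMPTION
		/*
		 * Only when a preemption was deferred while we were
		 * inside the critical section do we take sched_lock
		 * and context-switch away.
		 */
		if (td->td_owepreempt) {
			td->td_critnest = 1;
			mtx_lock_spin(&sched_lock);
			td->td_critnest--;
			mi_switch(SW_INVOL, NULL);
			mtx_unlock_spin(&sched_lock);
		}
#endif
	} else
		td->td_critnest--;
}

So the common path is just a counter decrement; sched_lock only
enters the picture when a preemption was deferred while the critical
section was held.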

Attilio


-- 
Peace can only be achieved by understanding - A. Einstein


