Date:      Wed, 24 Oct 2012 20:06:00 +0100
From:      Attilio Rao <attilio@freebsd.org>
To:        Jim Harris <jim.harris@gmail.com>
Cc:        svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org, John Baldwin <jhb@freebsd.org>
Subject:   Re: svn commit: r242014 - head/sys/kern
Message-ID:  <CAJ-FndDzBdq8q6J7QKqf=abi_702s_ia=pa3XbBv80rxbGb-SA@mail.gmail.com>
In-Reply-To: <CAJP=Hc9wLv02sX+WnzZtaKccSAFzqg8jT0oP13nLw1jMfwOEBQ@mail.gmail.com>
References:  <201210241836.q9OIafqo073002@svn.freebsd.org> <201210241443.25988.jhb@freebsd.org> <CAJP=Hc9wLv02sX+WnzZtaKccSAFzqg8jT0oP13nLw1jMfwOEBQ@mail.gmail.com>

On Wed, Oct 24, 2012 at 8:00 PM, Jim Harris <jim.harris@gmail.com> wrote:
> On Wed, Oct 24, 2012 at 11:43 AM, John Baldwin <jhb@freebsd.org> wrote:
>> On Wednesday, October 24, 2012 2:36:41 pm Jim Harris wrote:
>>> Author: jimharris
>>> Date: Wed Oct 24 18:36:41 2012
>>> New Revision: 242014
>>> URL: http://svn.freebsd.org/changeset/base/242014
>>>
>>> Log:
>>>   Pad tdq_lock to avoid false sharing with tdq_load and tdq_cpu_idle.
>>>
>>>   This enables CPU searches (which read tdq_load) to operate independently
>>>   of any contention on the spinlock.  Some scheduler-intensive workloads
>>>   running on an 8C single-socket SNB Xeon show considerable improvement with
>>>   this change (2-3% perf improvement, 5-6% decrease in CPU util).
>>>
>>>   Sponsored by:       Intel
>>>   Reviewed by:        jeff
>>>
>>> Modified:
>>>   head/sys/kern/sched_ule.c
>>>
>>> Modified: head/sys/kern/sched_ule.c
>>>
>>> ==============================================================================
>>> --- head/sys/kern/sched_ule.c Wed Oct 24 18:33:44 2012        (r242013)
>>> +++ head/sys/kern/sched_ule.c Wed Oct 24 18:36:41 2012        (r242014)
>>> @@ -223,8 +223,13 @@ static int sched_idlespinthresh = -1;
>>>   * locking in sched_pickcpu();
>>>   */
>>>  struct tdq {
>>> -     /* Ordered to improve efficiency of cpu_search() and switch(). */
>>> +     /*
>>> +      * Ordered to improve efficiency of cpu_search() and switch().
>>> +      * tdq_lock is padded to avoid false sharing with tdq_load and
>>> +      * tdq_cpu_idle.
>>> +      */
>>>       struct mtx      tdq_lock;               /* run queue lock. */
>>> +     char            pad[64 - sizeof(struct mtx)];
>>
>> Can this use 'tdq_lock __aligned(CACHE_LINE_SIZE)' instead?
>>
>
> No - that doesn't pad it.  I believe that only works if it's global,
> i.e. not part of a data structure.

As I've already said in another thread, __aligned() doesn't add padding
on an object declaration, so it won't pad the member whether the object
is global or part of a struct.
It is just implemented as __attribute__((aligned(X))):
http://gcc.gnu.org/onlinedocs/gcc-3.2/gcc/Type-Attributes.html

Attilio


-- 
Peace can only be achieved by understanding - A. Einstein
