Date:      Thu, 25 Oct 2012 12:49:36 -0500
From:      Alan Cox <alc@rice.edu>
To:        Andre Oppermann <andre@freebsd.org>
Cc:        Adrian Chadd <adrian@FreeBSD.org>, src-committers@FreeBSD.org, svn-src-all@FreeBSD.org, Attilio Rao <attilio@FreeBSD.org>, Bruce Evans <brde@optusnet.com.au>, svn-src-head@FreeBSD.org, Jim Harris <jim.harris@gmail.com>
Subject:   Re: svn commit: r242014 - head/sys/kern
Message-ID:  <50897BB0.7050601@rice.edu>
In-Reply-To: <5089678A.6070609@freebsd.org>
References:  <201210241836.q9OIafqo073002@svn.freebsd.org> <CAJ-VmonpdJ445hXVaoHqFgS0v7QRwqHWodQrVHm2CN9T661www@mail.gmail.com> <CAJP=Hc9XmvfW3MrDjvK15OAx1fyfjPk%2BEhqHUOzoEpChu5imtg@mail.gmail.com> <50883EA8.1010308@freebsd.org> <CAJ-FndAJwuD=oUHyVK4xHoZNrf0r%2Bq2WxZ9rUMW%2B-zzMo_8QuA@mail.gmail.com> <20121025142313.S999@besplex.bde.org> <5089678A.6070609@freebsd.org>

On 10/25/2012 11:23, Andre Oppermann wrote:
> On 25.10.2012 05:49, Bruce Evans wrote:
>> On Wed, 24 Oct 2012, Attilio Rao wrote:
>>
>>> On Wed, Oct 24, 2012 at 8:16 PM, Andre Oppermann <andre@freebsd.org> 
>>> wrote:
>>>> ...
>>>> Let's go back and see how we can do this the sanest way.  These are
>>>> the options I see at the moment:
>>>>
>>>>  1. sprinkle __aligned(CACHE_LINE_SIZE) all over the place
>>>
>>> This is wrong because it doesn't give padding.
>>
>> Unless it is sprinkled in struct declarations.
>>
>>>>  2. use a macro like MTX_ALIGN that can be SMP/UP aware and in
>>>>     the future possibly change to a different compiler dependent
>>>>     align attribute
>>>
>>> What is this macro supposed to do? I don't understand that from your
>>> description.
>>>
>>>>  3. embed __aligned(CACHE_LINE_SIZE) into struct mtx itself so it
>>>>     automatically gets aligned in all cases, even when dynamically
>>>>     allocated.
>>>
>>> This works but I think it is overkill for structures including sleep
>>> mutexes which are the vast majority. So I certainly wouldn't be in
>>> favor of such a patch.
>>
>> This doesn't work either with fully dynamic (auto) allocations.  Stack
>> alignment is generally broken (limited, and pessimized for both space
>> and time) in gcc (it works better in clang).  On amd64, it is limited
>> by the default of -mpreferred-stack-boundary=4.  Since 2**4 is smaller
>> than the cache line size and stack alignments larger than it are broken
>> in gcc, __aligned(CACHE_LINE_SIZE) never works (except accidentally,
>> 16/CACHE_LINE_SIZE of the time).  On i386, we reduce the space/time
>> pessimizations a little by overriding the default to
>> -mpreferred-stack-boundary=2.  2**2 is even smaller than the cache
>> line size.  (The pessimizations are for both space and time, since
>> time and code space is wasted for the code to keep the stack aligned,
>> and cache space and thus also time are wasted for padding.  Most
>> functions don't benefit from more than sizeof(register_t) alignment.)
>
> I'm not aware of stack allocated mutexes anywhere in the kernel.
> Even if there is a case it's very special and unique.
>
> I've verified that __aligned(CACHE_LINE_SIZE) on the definition of
> struct mtx itself (in sys/_mutex.h) correctly aligns and pads the
> global .bss resident mutexes for 64B and 128B cache line sizes.
>

Padding every mutex is going to have a non-trivial effect on the size of 
some dynamically allocated structures containing locks, like the vm 
object and the vnode.  Moreover, the effect of this padding will be the 
greatest on address space limited systems, like i386, where the size of 
a vm object is only about 130 bytes.
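
To make the size effect concrete, here is a minimal sketch in C.  It is
not the kernel's struct mtx, vm object, or vnode: the struct names, the
field layout, and the 64-byte SKETCH_CACHE_LINE_SIZE are all hypothetical
stand-ins, and the attribute spelling assumes gcc or clang.  It only
illustrates that an alignment attribute on the lock type rounds up both
the lock and any structure that embeds it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines; the real value is platform-specific. */
#define SKETCH_CACHE_LINE_SIZE 64

/* Hypothetical unpadded lock: a single pointer-sized lock word. */
struct plain_lock {
	volatile uintptr_t lock_word;
};

/* The same lock with the alignment attribute on the type itself. */
struct padded_lock {
	volatile uintptr_t lock_word;
} __attribute__((aligned(SKETCH_CACHE_LINE_SIZE)));

/* Hypothetical small object embedding a lock plus a few other fields. */
struct obj_plain {
	struct plain_lock lock;
	void *fields[4];
};

struct obj_padded {
	struct padded_lock lock;
	void *fields[4];
};

/* The attribute rounds sizeof(lock) up to a full cache line ... */
_Static_assert(sizeof(struct padded_lock) == SKETCH_CACHE_LINE_SIZE,
    "padded lock occupies one whole cache line");
/* ... and the embedding struct inherits the alignment and tail padding. */
_Static_assert(sizeof(struct obj_padded) % SKETCH_CACHE_LINE_SIZE == 0,
    "embedding struct is rounded up to a cache-line multiple");

/* Extra bytes per object paid for padding the embedded lock. */
static size_t
padding_cost(void)
{
	return (sizeof(struct obj_padded) - sizeof(struct obj_plain));
}
```

In this sketch the padded object is two cache lines long whether
pointers are 4 or 8 bytes wide, so the relative growth is larger on a
32-bit target, where the unpadded object is smaller to begin with.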

>> Dynamic allocations via malloc() get whatever alignment malloc() gives.
>> This is only required to be 4 or 8 or 16 or so (the maximum for a C
>> object declared in conforming C (no __align())), but malloc() usually
>> gives more.  If it gives CACHE_LINE_SIZE, that is wasteful for most
>> small allocations.
>
> Stand-alone mutexes are normally not malloc'ed.  They're always
> embedded into some larger structure they protect.
>
>> __builtin_alloca() is broken in gcc-3.3.3, but works in gcc-4.2.1, at
>> least on i386.  In gcc-3.3.3, it assumes that the stack is the default
>> 16-byte aligned even if -mpreferred-stack-boundary=2 is in CFLAGS to
>> say otherwise, and just subtracts from the stack pointer.  In gcc-4.2.1,
>> it does the necessary andl of the stack pointer, but only 16-byte
>> alignment.
>>
>> It is another bug that there are no extensions of malloc() or alloca().
>> Since malloc() is in the library and may give CACHE_LINE_SIZE but
>> __builtin_alloca() is in the compiler and only gives 16, these functions
>> are not even as compatible as they should be.
>>
>> I don't know of any mutexes allocated on the stack, but there are stack
>> frames with mcontexts in them that need special alignment so they cause
>> problems on i386.  They can't just be put on the stack due to the above
>> bugs. They are laboriously allocated using malloc().  Since they are
>> quite large, 1 mcontext barely fits on the kernel stack, so kib didn't
>> like my alloca() method for allocating them.
>
> You lost me here.
>



