Date: Thu, 25 Oct 2012 14:49:43 +1100 (EST)
From: Bruce Evans <brde@optusnet.com.au>
To: Attilio Rao <attilio@FreeBSD.org>
Cc: Adrian Chadd <adrian@FreeBSD.org>, src-committers@FreeBSD.org, Andre Oppermann <andre@FreeBSD.org>, svn-src-all@FreeBSD.org, svn-src-head@FreeBSD.org, Jim Harris <jim.harris@gmail.com>
Subject: Re: svn commit: r242014 - head/sys/kern
Message-ID: <20121025142313.S999@besplex.bde.org>
In-Reply-To: <CAJ-FndAJwuD=oUHyVK4xHoZNrf0r%2Bq2WxZ9rUMW%2B-zzMo_8QuA@mail.gmail.com>
References: <201210241836.q9OIafqo073002@svn.freebsd.org> <CAJ-VmonpdJ445hXVaoHqFgS0v7QRwqHWodQrVHm2CN9T661www@mail.gmail.com> <CAJP=Hc9XmvfW3MrDjvK15OAx1fyfjPk%2BEhqHUOzoEpChu5imtg@mail.gmail.com> <50883EA8.1010308@freebsd.org> <CAJ-FndAJwuD=oUHyVK4xHoZNrf0r%2Bq2WxZ9rUMW%2B-zzMo_8QuA@mail.gmail.com>

On Wed, 24 Oct 2012, Attilio Rao wrote:

> On Wed, Oct 24, 2012 at 8:16 PM, Andre Oppermann <andre@freebsd.org> wrote:
>> ...
>> Let's go back and see how we can do this the sanest way.  These are
>> the options I see at the moment:
>>
>>  1. sprinkle __aligned(CACHE_LINE_SIZE) all over the place
>
> This is wrong because it doesn't give padding.

Unless it is sprinkled in struct declarations.

>>  2. use a macro like MTX_ALIGN that can be SMP/UP aware and in
>>     the future possibly change to a different compiler dependent
>>     align attribute
>
> What is this macro supposed to do? I don't understand that from your
> description.
>
>>  3. embed __aligned(CACHE_LINE_SIZE) into struct mtx itself so it
>>     automatically gets aligned in all cases, even when dynamically
>>     allocated.
>
> This works but I think it is overkill for structures including sleep
> mutexes which are the vast majority.  So I wouldn't certainly be in
> favor of such a patch.

This doesn't work either with fully dynamic (auto) allocations.  Stack
alignment is generally broken (limited, and pessimized for both space
and time) in gcc (it works better in clang).  On amd64, it is limited
by the default of -mpreferred-stack-boundary=4.  Since 2**4 is smaller
than the cache line size and stack alignments larger than it are broken
in gcc, __aligned(CACHE_LINE_SIZE) never works (except accidentally,
16/CACHE_LINE_SIZE of the time).  On i386, we reduce the space/time
pessimizations a little by overriding the default to
-mpreferred-stack-boundary=2.  2**2 is even smaller than the cache line
size.  (The pessimizations are for both space and time, since time and
code space are wasted for the code to keep the stack aligned, and cache
space and thus also time are wasted for padding.  Most functions don't
benefit from more than sizeof(register_t) alignment.)

Dynamic allocations via malloc() get whatever alignment malloc() gives.
This is only required to be 4 or 8 or 16 or so (the maximum for a C
object declared in conforming C (no __align())), but malloc() usually
gives more.  If it gives CACHE_LINE_SIZE, that is wasteful for most
small allocations.

__builtin_alloca() is broken in gcc-3.3.3, but works in gcc-4.2.1, at
least on i386.  In gcc-3.3.3, it assumes that the stack is the default
16-byte aligned even if -mpreferred-stack-boundary=2 is in CFLAGS to
say otherwise, and just subtracts from the stack pointer.  In gcc-4.2.1,
it does the necessary andl of the stack pointer, but only to 16-byte
alignment.

It is another bug that there are no extensions of malloc() or alloca().
Since malloc() is in the library and may give CACHE_LINE_SIZE but
__builtin_alloca() is in the compiler and only gives 16, these functions
are not even as compatible as they should be.

I don't know of any mutexes allocated on the stack, but there are stack
frames with mcontexts in them that need special alignment so they cause
problems on i386.  They can't just be put on the stack due to the above
bugs.  They are laboriously allocated using malloc().  Since they are
quite large, 1 mcontext barely fits on the kernel stack, so kib didn't
like my alloca() method for allocating them.

Bruce
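
The two alignment points above (padding only comes from aligning the
struct declaration, and malloc() promises no cache-line alignment) can
be sketched in userland C roughly as follows.  This is only a sketch
under assumptions: the 64-byte CACHE_LINE_SIZE is assumed here (the
kernel takes it from machine headers), the attribute is spelled in
gcc/clang syntax rather than via __aligned(), and plain_lock,
padded_lock, is_aligned() and alloc_aligned() are hypothetical names
that do not exist in the FreeBSD tree.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumption: 64-byte cache lines. */
#define CACHE_LINE_SIZE 64

/* Hypothetical lock-like struct with only its natural alignment. */
struct plain_lock {
	volatile uintptr_t word;
};

/*
 * Aligning the struct type itself also rounds sizeof() up to the
 * alignment, so adjacent array elements do not share a cache line.
 */
struct padded_lock {
	volatile uintptr_t word;
} __attribute__((aligned(CACHE_LINE_SIZE)));

static_assert(sizeof(struct padded_lock) == CACHE_LINE_SIZE,
    "struct-level alignment pads the size");
static_assert(_Alignof(struct padded_lock) == CACHE_LINE_SIZE,
    "struct-level alignment raises the type's alignment");

/* True if p is a multiple of align; align must be a power of 2. */
static inline int
is_aligned(const void *p, size_t align)
{
	return (((uintptr_t)p & (align - 1)) == 0);
}

/*
 * Over-allocate with plain malloc() and round the address up, the
 * same masking idea as the andl on the stack pointer.  The unrounded
 * pointer is stored in *rawp because that is what must be free()d.
 */
static inline void *
alloc_aligned(size_t size, size_t align, void **rawp)
{
	uintptr_t addr;

	*rawp = malloc(size + align - 1);
	if (*rawp == NULL)
		return (NULL);
	addr = ((uintptr_t)*rawp + align - 1) & ~(uintptr_t)(align - 1);
	return ((void *)addr);
}
```

A caller would obtain a pointer with
alloc_aligned(sizeof(struct padded_lock), CACHE_LINE_SIZE, &raw) and
later release it with free(raw).  The cost is up to align - 1 wasted
bytes per allocation, which is the same kind of waste noted above for
small allocations.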