Date:      Sat, 5 Jan 2013 13:54:02 -0800
From:      Jason Evans <jasone@freebsd.org>
To:        Adrian Chadd <adrian@freebsd.org>
Cc:        freebsd-hackers@freebsd.org, Hakisho Nukama <nukama@gmail.com>
Subject:   Re: L1 cache thrashing affects performance of HIMENO benchmark
Message-ID:  <DFFCA030-3206-4EB2-88C8-262AB298FF9F@freebsd.org>
In-Reply-To: <CAJ-Vmo=O9v==Xpm8F2UUjkEAOaHbKydLMpn6Bz-5rRBt%2B2TEAg@mail.gmail.com>
References:  <CA%2Bzcas1V0KzcBKjq1u3Rwu_Nm7hVkG%2BG%2BeHpHRDQG0_NcXoOWA@mail.gmail.com> <CAJ-Vmo=O9v==Xpm8F2UUjkEAOaHbKydLMpn6Bz-5rRBt%2B2TEAg@mail.gmail.com>


On Jan 5, 2013, at 12:47 PM, Adrian Chadd <adrian@freebsd.org> wrote:
> On 5 January 2013 07:38, Hakisho Nukama <nukama@gmail.com> wrote:
>> FreeBSD (PCBSD) is slower compared to Linux and kFreeBSD in this
>> benchmark of HIMENO:
>> http://openbenchmarking.org/prospect/1202215-BY-FREEBSD9683/88ac7a01c6cb355d7e7603224b2ee1e5a4cb881d
>> Also DragonFly BSD compares worse to kFreeBSD and Linux:
>> http://www.phoronix.com/scan.php?page=article&item=dragonfly_linux_32&num=3
>> http://openbenchmarking.org/prospect/1206255-SU-DRAGONFLY55/88ac7a01c6cb355d7e7603224b2ee1e5a4cb881d
>> 
>> Matt, Venkatesh and Alex investigated this performance problem and
>> came to these results:
>> http://leaf.dragonflybsd.org/mailarchive/users/2013-01/msg00011.html
> 
> I've CC'ed jasone on this as it's an interesting side-effect of memory
> allocation logic.
> 
> Jason - any comments?

There are many variations on this class of performance problem, and the short of it is that only the application understands its data structure layout and access patterns well enough to make reliably optimal use of the cache.  However, an allocator can lay out memory in a more haphazard fashion than jemalloc, phkmalloc, etc. do, so that the application can be cache-oblivious and (usually) not suffer worst-case consequences like those seen here.  Extent-based allocators like dlmalloc often get this "for free" for a significant range of allocation sizes.  jemalloc could be modified to this end, but a full solution would necessarily increase internal fragmentation.  It might be worth experimenting with nonetheless.
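
As a rough illustration of the effect described above, here is a minimal C sketch of how identically aligned large arrays can compete for the same cache sets, and how an application might stagger them by hand.  The cache geometry (32 KiB, 8-way, 64-byte lines), the 4 KiB page size, and the helper names cache_set() and alloc_staggered() are assumptions made up for this example; none of them come from the thread or from jemalloc.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed L1 data cache geometry: 32 KiB, 8-way, 64-byte lines.
 * Illustrative numbers only; check the actual CPU. */
#define LINE_SIZE 64u
#define NUM_SETS  64u    /* 32 KiB / (8 ways * 64 B) */
#define PAGE_SIZE 4096u  /* happens to equal LINE_SIZE * NUM_SETS */

/* Cache set that an address maps to under the assumed geometry. */
unsigned cache_set(uintptr_t addr)
{
    return (unsigned)((addr / LINE_SIZE) % NUM_SETS);
}

/*
 * Hypothetical helper: allocate `bytes` on a page boundary, then shift
 * the usable base by `index` cache lines so that element 0 of each
 * array lands in a different set.  The pointer to pass to free() is
 * stored in *raw_out.  The padding is the cost of staggering.
 */
void *alloc_staggered(size_t bytes, unsigned index, void **raw_out)
{
    size_t offset = (size_t)(index % NUM_SETS) * LINE_SIZE;
    void *raw = NULL;

    assert(raw_out != NULL);
    *raw_out = NULL;
    if (posix_memalign(&raw, PAGE_SIZE, bytes + offset) != 0)
        return NULL;
    *raw_out = raw;
    return (char *)raw + offset;
}
```

Under these assumptions, every page-aligned base maps to set 0, so a loop that walks more than eight such arrays in lockstep keeps evicting its own lines.  Giving array k an offset of k cache lines spreads the bases across sets at the price of a little wasted space per allocation, which is the same trade-off against fragmentation mentioned above.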

Thanks,
Jason