From owner-svn-src-head@FreeBSD.ORG Wed Oct 24 20:43:08 2012
Message-ID: <508852CE.5030807@freebsd.org>
Date: Wed, 24 Oct 2012 22:42:54 +0200
From: Andre Oppermann
To: Alexander Motin
Cc: svn-src-head@freebsd.org, Adrian Chadd, src-committers@freebsd.org,
    Jim Harris, svn-src-all@freebsd.org
Subject: Re: svn commit: r242014 - head/sys/kern
References: <201210241836.q9OIafqo073002@svn.freebsd.org>
    <50883EA8.1010308@freebsd.org> <508841DC.7040701@FreeBSD.org>
In-Reply-To: <508841DC.7040701@FreeBSD.org>
List-Id: SVN commit messages for the src tree for head/-current

On 24.10.2012 21:30, Alexander Motin wrote:
> On 24.10.2012 22:16, Andre Oppermann wrote:
>> On 24.10.2012 20:56, Jim Harris wrote:
>>> On Wed, Oct 24, 2012 at 11:41 AM, Adrian Chadd wrote:
>>>> On 24 October 2012 11:36, Jim Harris wrote:
>>>>
>>>>> Pad tdq_lock to avoid false sharing with tdq_load and tdq_cpu_idle.
>>>>
>>>> Ok, but..
>>>>
>>>>>          struct mtx      tdq_lock;       /* run queue lock. */
>>>>> +        char            pad[64 - sizeof(struct mtx)];
>>>>
>>>> .. don't we have an existing compile time macro for the cache line
>>>> size, which can be used here?
>>>
>>> Yes, but I didn't use it for a couple of reasons:
>>>
>>> 1) struct tdq itself is currently using __aligned(64), so I wanted to
>>> keep it consistent.
>>> 2) CACHE_LINE_SIZE is currently defined as 128 on x86, due to
>>> NetBurst-based processors having 128-byte cache sectors a while back.
>>> I had planned to start a separate thread on arch@ about this today on
>>> whether this was still appropriate.
>>
>> See also the discussion on svn-src-all regarding global struct mtx
>> alignment.
>>
>> Thank you for proving my point. ;)
>>
>> Let's go back and see how we can do this the sanest way.  These are
>> the options I see at the moment:
>>
>> 1. sprinkle __aligned(CACHE_LINE_SIZE) all over the place
>> 2. use a macro like MTX_ALIGN that can be SMP/UP aware and in
>>    the future possibly change to a different compiler dependent
>>    align attribute
>> 3. embed __aligned(CACHE_LINE_SIZE) into struct mtx itself so it
>>    automatically gets aligned in all cases, even when dynamically
>>    allocated.
>>
>> Personally I'm undecided between #2 and #3.  #1 is ugly.  In favor
>> of #3 is that there possibly isn't any case where you'd actually
>> want the mutex to share a cache line with anything else, even a data
>> structure.
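To make the two styles concrete, here is a rough sketch of the explicit-pad
approach from r242014 next to what #3 could look like.  The type and
constant names (my_mtx, tdq_padded, MY_LINE_SIZE, ...) are made up for the
example, the line size is assumed to be 64 bytes, and this is not the
actual sys/_mutex.h or sched_ule.c code:

    #include <sys/cdefs.h>          /* FreeBSD's __aligned() */
    #include <stdint.h>

    #define MY_LINE_SIZE    64      /* assumed line size for the example */

    /* Simplified stand-in for struct mtx; the real one has more fields. */
    struct my_mtx {
            volatile uintptr_t      mtx_lock;
    };

    /* Explicit padding after the lock, as r242014 does for tdq_lock. */
    struct tdq_padded {
            struct my_mtx   tdq_lock;
            char            pad[MY_LINE_SIZE - sizeof(struct my_mtx)];
            volatile int    tdq_load;               /* next cache line */
    };

    /* Option #3: the alignment is part of the lock type itself, so
     * anything embedding it is padded out to a line boundary without
     * an explicit pad member. */
    struct my_mtx_aligned {
            volatile uintptr_t      mtx_lock;
    } __aligned(MY_LINE_SIZE);

    struct tdq_embedded {
            struct my_mtx_aligned   tdq_lock;       /* offset 0, size 64 */
            volatile int            tdq_load;       /* offset 64 */
    };

The nice property of #3 is that sizeof() of the aligned lock type is
already rounded up to the line size, so whatever follows it lands on the
next line and nobody has to keep an explicit pad in sync by hand.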
> I'm sorry, could you hint me with some theory?  I think I can agree
> that cache line sharing can be a problem in the case of spin locks --
> the waiting thread will constantly try to access a page modified by
> another CPU, which I guess will cause cache line writes to the RAM.
> But why is it so bad to share a lock with its respective data in the
> case of non-spin locks?  Won't the benefit of effectively prefetching
> the right data for free while grabbing the lock compensate for the
> penalties from relatively rare collisions?

Cliff Click describes it in detail:

  http://www.azulsystems.com/blog/cliff/2009-04-14-odds-ends

For a classic mutex it likely doesn't make much difference since the
cache line is exclusive anyway while the lock is held.  On LL/SC systems
there may be cache line dirtying on a failed locking attempt.  For spin
mutexes it hurts badly, as you noted.  It hurts especially with RW
mutexes because even a read lock dirties the cache line for all other
CPUs.  So the RW mutex should be on its own cache line in all cases.

-- 
Andre
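PS: A toy userland sketch of why the read side is the problem.  This is
not the kernel rw lock, or even a complete lock (no writer side, no
unlock); the names and the 64-byte line size are assumptions for the
example.  The point is only that taking a read lock is still a write to
the lock word:

    #include <sys/cdefs.h>          /* FreeBSD's __aligned() */
    #include <stdatomic.h>

    #define LINE_SIZE       64      /* assumed cache line size */

    /* Toy read lock: nothing but a reader count. */
    struct toy_rwlock {
            atomic_int      readers;
    };

    static void
    toy_rlock(struct toy_rwlock *l)
    {
            /* This store is what dirties the lock's cache line and
             * invalidates it in every other CPU's cache. */
            atomic_fetch_add_explicit(&l->readers, 1, memory_order_acquire);
    }

    /* Bad: lock and read-mostly data share a line, so every new reader
     * steals the data's line from all other CPUs. */
    struct stats_bad {
            struct toy_rwlock       lock;
            int                     mostly_read_value;
    };

    /* Better: the lock gets a line of its own; readers only ever dirty
     * that line and the data itself stays shared read-only. */
    struct stats_good {
            struct toy_rwlock       lock __aligned(LINE_SIZE);
            int                     mostly_read_value __aligned(LINE_SIZE);
    };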