Date:      Tue, 16 Jan 2007 14:19:04 -0700
From:      Scott Long <scottl@samsco.org>
To:        John Baldwin <jhb@freebsd.org>
Cc:        Pawel Jakub Dawidek <pjd@freebsd.org>, Kip Macy <kip.macy@gmail.com>, Suleiman Souhlal <ssouhlal@freebsd.org>, Attilio Rao <attilio@freebsd.org>, freebsd-current@freebsd.org, freebsd-arch@freebsd.org
Subject:   Re: [PATCH] Mantaining turnstile aligned to 128 bytes in i386 CPUs
Message-ID:  <45AD4148.90002@samsco.org>
In-Reply-To: <200701161605.22394.jhb@freebsd.org>
References:  <3bbf2fe10607250813w8ff9e34pc505bf290e71758@mail.gmail.com>	<200701161438.52481.jhb@freebsd.org>	<3bbf2fe10701161236s48e6cc16p99c8c38c1d7becde@mail.gmail.com> <200701161605.22394.jhb@freebsd.org>

John Baldwin wrote:
> On Tuesday 16 January 2007 15:36, Attilio Rao wrote:
>> 2007/1/16, John Baldwin <jhb@freebsd.org>:
>>> On Tuesday 16 January 2007 11:51, Attilio Rao wrote:
>>>> 2006/7/28, Attilio Rao <attilio@freebsd.org>:
>>>>> After some thinking, I think it's better to use init/fini methods
>>>>> (since they hide the sizeof(struct turnstile) behind the size parameter).
>>>>>
>>>>> Feedback and comments are welcome:
>>>>> http://users.gufi.org/~rookie/works/patches/uma_sync_init.diff
>>>> [CC'ed all the interested people]
>>>>
>>>> Even though a long time has passed, I ran some benchmarks with the
>>>> ebizzy tool.
>>>> This program claims to reproduce the behaviour of a real httpd server
>>>> and is used in the Linux world for benchmarks, AFAIK.
>>>> I think the results of the comparison for this patch are very
>>>> interesting, and I think it is worth a commit :)
>>>> I think the results could be even better on a Xeon machine (I had no
>>>> chance to reproduce this on one).
>>>> (The results considered were measured after several runs,
>>>> in order to minimize caching differences).
>>>>
>>>> The patch:
>>>> http://users.gufi.org/~rookie/works/patches/ts-sq/ts-sq.diff
>>> Looks good.  Some minor nits are that in subr_turnstile.c in the comment I
>>> would say "a turnstile is allocated" rather than "a turnstile is got from a
>>> specific UMA zone" as it reads a little bit clearer.  Also, I would
>>> say "Allocate a" rather than "Get a" for the two _alloc() functions.  Also,
>>> why not just use UMA_ALIGN_CACHE and make UMA_ALIGN_CACHE (128 - 1) on i386
>>> and amd64 rather than adding a new UMA_ALIGN_SYNC?
>> I was thinking that in this way anyone who wants to change the
>> synchronization primitive boundary to an appropriate value can do it.
>> I just used UMA_ALIGN_CACHE as the default value because I don't know the
>> best boundary (for synchronization primitives) on other arches.
> 
> Is there a good reason to not cache-align synch primitives?  That is, why 
> would an arch not use cache-align?  Also, is there a reason to not update 
> UMA_ALIGN_CACHE on x86?
> 

If you always cache-line-align them, that also addresses the Intel
recommendation to always keep them from sharing cache lines.

Scott
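
For readers following the alignment argument, here is a minimal
user-space sketch of what "cache-line-aligning" a synchronization
object buys. It is not the kernel patch under discussion: struct
sync_obj, SYNC_ALIGN, SYNC_ALIGN_MASK, align_up() and share_block() are
hypothetical names chosen for illustration, and the 128-byte boundary
is an assumption taken from the "(128 - 1)" value suggested above. The
mask form (alignment minus one) mirrors how that value is written in
the thread.

```c
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 128-byte boundary, as in the "(128 - 1)" suggestion. */
#define SYNC_ALIGN      128u
#define SYNC_ALIGN_MASK (SYNC_ALIGN - 1)

/* Hypothetical stand-in for a synchronization object (not struct turnstile). */
struct sync_obj {
    alignas(SYNC_ALIGN) volatile int lock;  /* forces 128-byte alignment */
    int waiters;
};

/* Alignment also pads the size, so array elements never share a block. */
_Static_assert(alignof(struct sync_obj) == SYNC_ALIGN, "alignment not applied");
_Static_assert(sizeof(struct sync_obj) % SYNC_ALIGN == 0, "size not padded");

/* Round addr up to the next boundary described by mask (alignment - 1). */
static uintptr_t align_up(uintptr_t addr, uintptr_t mask)
{
    return (addr + mask) & ~mask;
}

/* Return nonzero if a and b fall inside the same SYNC_ALIGN-sized block. */
static int share_block(const void *a, const void *b)
{
    uintptr_t m = ~(uintptr_t)SYNC_ALIGN_MASK;
    return ((uintptr_t)a & m) == ((uintptr_t)b & m);
}
```

With this layout, two adjacent objects in an array such as
"struct sync_obj objs[2]" start on different 128-byte blocks, so
share_block(&objs[0], &objs[1]) is false and a CPU spinning on one lock
does not invalidate the line holding the other.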