Date:      Fri, 18 Mar 2011 16:19:32 +1100 (EST)
From:      Bruce Evans <brde@optusnet.com.au>
To:        Jung-uk Kim <jkim@FreeBSD.org>
Cc:        src-committers@FreeBSD.org, Peter Jeremy <peterjeremy@acm.org>, Roman Divacky <rdivacky@FreeBSD.org>, Bruce Evans <brde@optusnet.com.au>, svn-src-head@FreeBSD.org, svn-src-all@FreeBSD.org, Maxim Dounin <mdounin@mdounin.ru>
Subject:   Re: svn commit: r219679 - head/sys/i386/include
Message-ID:  <20110318161019.M984@besplex.bde.org>
In-Reply-To: <201103171701.57546.jkim@FreeBSD.org>
References:  <201103152145.p2FLjAlt060256@svn.freebsd.org> <201103161634.08104.jkim@FreeBSD.org> <20110317195729.GA65858@server.vk2pj.dyndns.org> <201103171701.57546.jkim@FreeBSD.org>

On Thu, 17 Mar 2011, Jung-uk Kim wrote:

> On Thursday 17 March 2011 03:57 pm, Peter Jeremy wrote:
>> On 2011-Mar-16 16:34:04 -0400, Jung-uk Kim <jkim@FreeBSD.org> wrote:
>>> On Wednesday 16 March 2011 01:45 pm, Roman Divacky wrote:
>>>> if we drop i486 I think it makes sense to require something that
>>>> has at least SSE2, thus we can have the same expectations as on
>>>> amd64.
>>
>> I think it's still a bit early for that - especially the SSE2
>> requirement.
>>
>>> This is a proof-of-concept patch for sys/x86/isa/clock.c:
>>>
>>> http://people.freebsd.org/~jkim/clock.diff
>>>
>>> You see the complexity, just because I wanted to load 64-bit value
>>> atomically... :-(
>>
>> An alternative approach is to have _fetch_frequency() be
>>   uint64_t (*_fetch_frequency)(uint64_t *);
>> if i386 and I486 are defined (otherwise it's just the #define
>> (*(p))) then initialise it to either atomic_fetch_quad_i386 or
>> atomic_fetch_quad_i586 as part of the CPU detection process.  This
>> is the way bcopy() is/was handled on Pentium.
>>
>> Another approach would be to have cmpxchg8b instructions
>> (followed by a suitably large NOP) always inlined in the code and,
>> if one traps, patch the code to call a function that emulates it.
>
> I think the former makes more sense for atomic read/write because we
> don't need complete cmpxchg8b support but kind of movq support,
> actually.

Both require a function call.  With a function call, patching becomes
much easier since there is only 1 place to patch, so patching is almost
as easy as changing a function pointer (might need an instruction
queue flush and/or prevention of the function being called before or
while it is being patched).
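
A minimal sketch of the single-entry-point idea, assuming a userland
stand-in rather than real kernel code.  The names atomic_fetch_quad_i386
and atomic_fetch_quad_i586 come from the quoted proposal; the wrapper
atomic_fetch_quad(), the initialiser atomic_fetch_quad_init(), and the
has_cx8 flag are hypothetical, and the function bodies are plain loads
that only mark where the real instruction sequences would go:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t (*fetch_quad_fn)(const volatile uint64_t *);

/*
 * Placeholder for the i486 path.  A real version would have to make the
 * two 32-bit loads atomic some other way (e.g. by blocking interrupts);
 * that is an assumption, not something shown here.
 */
static uint64_t
atomic_fetch_quad_i386(const volatile uint64_t *p)
{
	return (*p);
}

/* Placeholder for the i586+ path, which would use cmpxchg8b. */
static uint64_t
atomic_fetch_quad_i586(const volatile uint64_t *p)
{
	return (*p);
}

/* The one place that changes after CPU detection. */
static fetch_quad_fn fetch_quad_impl = atomic_fetch_quad_i386;

/* Hypothetical hook, called once before any caller uses the wrapper. */
void
atomic_fetch_quad_init(int has_cx8)
{
	fetch_quad_impl = has_cx8 ? atomic_fetch_quad_i586 :
	    atomic_fetch_quad_i386;
}

/* Every caller goes through this single function. */
uint64_t
atomic_fetch_quad(const volatile uint64_t *p)
{
	assert(p != NULL);
	return (fetch_quad_impl(p));
}
```

Patching the body of atomic_fetch_quad() in place would replace the
indirect call with a direct one, but callers would look the same either
way, since they only ever see the one function.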

Patching the code also makes it easier to null out the lock prefix
in the !SMP case when it is presumably not needed.  The function call
to a function without a lock prefix will then be faster than inline
code with a lock prefix.  With a function pointer, you start getting a
combinatorial explosion in the number of separate functions needed: 1
without cmpxchg8b or a lock prefix (for i486), 1 with cmpxchg8b but without
a lock prefix (for !SMP i586+), and 1 with both (for SMP i586+).
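
To make the count concrete, here is a rough sketch of how the three
variants might be selected with a function pointer.  Everything in it is
hypothetical (the struct, the variant names, and the has_cx8 and smp
flags), and the bodies are plain loads standing in for the real
instruction sequences:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

struct fetch_variant {
	const char *name;
	uint64_t (*fn)(const volatile uint64_t *);
};

/* i486: no cmpxchg8b, no lock prefix (placeholder body). */
static uint64_t
fetch_plain(const volatile uint64_t *p)
{
	return (*p);
}

/* !SMP i586+: cmpxchg8b without a lock prefix (placeholder body). */
static uint64_t
fetch_cx8_nolock(const volatile uint64_t *p)
{
	return (*p);
}

/* SMP i586+: lock cmpxchg8b (placeholder body). */
static uint64_t
fetch_cx8_lock(const volatile uint64_t *p)
{
	return (*p);
}

static const struct fetch_variant fetch_variants[] = {
	{ "i486",      fetch_plain },
	{ "i586-up",   fetch_cx8_nolock },
	{ "i586-smp",  fetch_cx8_lock },
};

/* Each extra CPU feature or kernel option adds rows to this choice. */
const struct fetch_variant *
select_fetch_variant(int has_cx8, int smp)
{
	if (!has_cx8)
		return (&fetch_variants[0]);
	return (smp ? &fetch_variants[2] : &fetch_variants[1]);
}
```

With patching, the !SMP case would instead be the SMP function with its
lock prefix nulled out, so one fewer separate function would be needed.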

Bruce


