Date:      Thu, 12 Jan 2006 14:27:01 -0500
From:      John Baldwin <jhb@freebsd.org>
To:        freebsd-hackers@freebsd.org
Cc:        Peter Jeremy <PeterJeremy@optushome.com.au>
Subject:   Re: Atomic operations across multiple processors
Message-ID:  <200601121427.02185.jhb@freebsd.org>
In-Reply-To: <20060112085237.GA64401@cirb503493.alcatel.com.au>
References:  <20060112085237.GA64401@cirb503493.alcatel.com.au>

On Thursday 12 January 2006 03:52 am, Peter Jeremy wrote:
> atomic(9) states:
>  The current set of atomic operations do not necessarily guarantee atomic-
>  ity across multiple processors.  ...  On the i386 architecture, the cache
>  coherency model requires that the hardware perform this task, thus the
>  atomic operations are atomic across multiple processors.  On the ia64
>  architecture, coherency is only guaranteed for pages that are configured
>  to using a caching policy of either uncached or write back.
>
> Unfortunately, this doesn't document the behaviour for other
> architectures - this makes it difficult to write portable code.
>
> For the ia64, the statement isn't especially helpful because there's
> no indication of what caching policy is used by default and how to
> change it.  Also, it seems odd that write-back pages would be coherent
> whilst write-through pages aren't - is this a typo?  The man page is
> also inconsistent with /sys/ia64/include/atomic.h which states that
> atomic operations _are_ SMP safe.
>
> I've tried looking at the mutex code to see how the iA64 achieves
> inter-processor synchronisation on top of (supposedly) non-
> synchronised atomic(9) primitives but can't find anything.
>
> I'd appreciate comments from people familiar with non-iA32 architectures.

What it is trying to communicate is that the results of an atomic 
operation are not necessarily immediately visible on other CPUs.  That is, other 
CPUs may still have stale values in their caches, etc.  On i386 the cache coherency 
protocol doesn't really allow that for long, as lines get evicted when other CPUs 
write to them.  On sparc64, for example, writes can sit in a store buffer for a 
while before they are posted to main memory, and other CPUs in the system won't 
see the effect of the write until then.  However, if another CPU tries to do a 
cas on the same variable, it will either block or fail (not sure which; it might 
be implementation dependent).  That's all the mutex code needs, though.  Here are 
the non-hairy parts of _mtx_lock_sleep() to show how they work:

	while (!_obtain_lock(m, tid)) {
		turnstile_lock(&m->mtx_object);
		v = m->mtx_lock;

		/*
		 * Check if the lock has been released while spinning for
		 * the turnstile chain lock.
		 */
		if (v == MTX_UNOWNED) {
			turnstile_release(&m->mtx_object);
			cpu_spinwait();
			continue;
		}

		/*
		 * If the mutex isn't already contested and a failure occurs
		 * setting the contested bit, the mutex was either released
		 * or the state of the MTX_RECURSED bit changed.
		 */
		if ((v & MTX_CONTESTED) == 0 &&
		    !atomic_cmpset_ptr(&m->mtx_lock, v, v | MTX_CONTESTED)) {
			turnstile_release(&m->mtx_object);
			cpu_spinwait();
			continue;
		}

		/*
		 * Block on the turnstile.
		 */
		turnstile_wait(&m->mtx_object, mtx_owner(m));
	}

1) First we try to obtain the lock via atomic_cmpset_acq_ptr() (wrapped in the 
_obtain_lock() macro; see the sketch after step 6).  If it succeeds, all is happy 
and we return.

2) If it fails, we acquire the turnstile spin lock (really, it's one of several 
turnstile locks chosen by a hash of the lock's KVA; see the sketch after step 6).  
It's important to note that all manipulation of the MTX_CONTESTED bit happens 
while this spin lock is held.

3) We read the value of mtx_lock after acquiring the turnstile lock.

4) We check to see if the lock is now free after we acquired the turnstile 
lock.  If so, we drop the turnstile lock and try again from the top.

5) If MTX_CONTESTED is set, then we know that the owning thread is going to 
fail its simple mutex unlock and will end up in the _mtx_unlock_sleep() 
function, where it will wake us up (the _release_lock() sketch after step 6 
shows why the simple unlock fails).  So we go ahead and add ourselves to the 
thread queue on the turnstile via turnstile_wait(), which will block us and 
handle any races in the queue mechanics itself.

6) If MTX_CONTESTED is clear, we need to make sure it is set before we block 
to ensure that the owning thread will wake us up when it drops the lock.  If 
we can't set MTX_CONTESTED, then that means that the value of mtx_lock 
doesn't match what we think it is (v), so we drop the turnstile lock and try 
again.

That's the simple overview anyway.  Now suppose that an arch will fail 
atomic_cmpset_ptr() even when mtx_lock == v, because it knows that some other 
CPU has a pending write to mtx_lock whose value it can't see yet (think of 
load-linked/store-conditional as on Alpha, and available (but not currently 
used) on ia64).  In that case it doesn't hurt to just fail 
atomic_cmpset_ptr(), as we will simply spin until the other CPU's write posts.  
So, in the mutex code, I don't really care whether atomic_cmpset_ptr() failed 
because v was stale and the CPU could tell (the actual compare failed), or 
because the CPU just knows that mtx_lock is in a cache line that is dirty in 
another CPU's cache and thus can't be trusted.  I'll loop until either I see 
that somebody else succeeded in setting MTX_CONTESTED or until I set it 
successfully myself.
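
To make the LL/SC case concrete, here's a hedged sketch of a cmpset built on 
load-linked/store-conditional (ll() and sc() are hypothetical stand-ins for 
instructions like Alpha's ldl_l/stl_c, not real kernel primitives), showing why 
it can fail even when the compare would have matched:

	/*
	 * Hedged sketch, not real kernel code: ll() and sc() stand in
	 * for load-linked and store-conditional instructions.
	 */
	extern u_int	ll(volatile u_int *p);		/* load-linked */
	extern int	sc(volatile u_int *p, u_int v);	/* store-conditional */

	static int
	atomic_cmpset_llsc(volatile u_int *p, u_int old, u_int new)
	{

		if (ll(p) != old)
			return (0);	/* genuine mismatch: *p != old */
		/*
		 * sc() fails if any other CPU touched the line between
		 * the ll() and the sc() -- even if *p still equals old.
		 * Returning failure here is fine: the mutex code treats
		 * any failure as "retry", not as proof of a mismatch.
		 */
		return (sc(p, new));
	}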

-- 
John Baldwin <jhb@FreeBSD.org>  <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve"  =  http://www.FreeBSD.org


