From owner-freebsd-hackers@FreeBSD.ORG  Wed Apr 20 22:13:15 2005
From: Stephan Uphoff <ups@tree.com>
To: John Giacomoni <John.Giacomoni@colorado.edu>
In-Reply-To: <adbc63802b6da49e587f294644f41a75@colorado.edu>
References: <adbc63802b6da49e587f294644f41a75@colorado.edu>
Content-Type: text/plain
Message-Id: <1114035193.17300.307.camel@palm>
Mime-Version: 1.0
X-Mailer: Ximian Evolution 1.4.6 
Date: Wed, 20 Apr 2005 18:13:13 -0400
Content-Transfer-Encoding: 7bit
cc: freebsd-hackers@freebsd.org
Subject: Re: what goes wrong with barrier free atomic_load/store?

On Wed, 2005-04-20 at 16:39, John Giacomoni wrote:
> in reading /src/sys/i386/include/atomic.h
> 
> I found this comment and I'm having trouble understanding what the
> problem being
> referred to below is.
> 
> /*
>   * We assume that a = b will do atomic loads and stores.  However, on a
>   * PentiumPro or higher, reads may pass writes, so for that case we have
>   * to use a serializing instruction (i.e. with LOCK) to do the load in
>   * SMP kernels.  For UP kernels, however, the cache of the single
> processor
>   * is always consistent, so we don't need any memory barriers.
>   */
> 
> can someone give me an example of a situation where one needs to use
> memory barriers to ensure "correctness" when doing writes as above?

volatile int status = NOT_READY;
volatile int data = -1;

Thread 1: (CPU 0)
----------
data = 123;
status = READY;

Thread 2: (CPU 1)
---------
if (status == READY) {
	my_data = data;	
}

Read reordering by the CPUs may cause the following:

Thread 2:   out_of_order_read = data;
Thread 1:   data = 123;
Thread 1:   status = READY;
Thread 2:   if (status == READY)
Thread 2:       my_data = out_of_order_read; /* XXXX Unexpected VALUE */

Basically, volatile does not work as expected: it only constrains the
compiler and does not stop the CPU from reordering memory accesses.

> the examples I can come up with seem to boil down to requiring locks
> or accepting stale values, given that without a synchronization
> mechanism
> one shouldn't expect two processes to act in any specific order.

The problem is that writes from another CPU (or DMA device) can be
observed out of order.

> In my case I can accept reading a stale value so I'm not understanding
> the
> purpose of only having atomic_load/atomic_store wrappers with memory
> barriers.
> 
> I saw a brief discussion where someone proposed barrier free load/store
> but
> don't think I saw any resolution.

Do you mean load/store fences?

A load fence could solve the problem above by preventing the out of
order read of the data by thread 2.
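
As a rough userland sketch of that idea (not the kernel's atomic.h, and
not code from the tree), the example above can be written with C11
atomics and pthreads. The release store in thread 1 and the acquire load
in thread 2 act as the fences; run_handoff() is a hypothetical helper
added here only to drive the two threads, and the spin loop replaces the
single "if" so that the result is deterministic.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

enum { NOT_READY = 0, READY = 1 };

static atomic_int status = NOT_READY;
static int data = -1;

/* Thread 1: write the data, then publish the flag with release ordering,
 * so the store to data cannot become visible after the store to status. */
static void *producer(void *arg)
{
	(void)arg;
	data = 123;
	atomic_store_explicit(&status, READY, memory_order_release);
	return NULL;
}

/* Thread 2: read the flag with acquire ordering, so the read of data
 * cannot be performed before the read of status. */
static void *consumer(void *arg)
{
	int *my_data = arg;

	while (atomic_load_explicit(&status, memory_order_acquire) != READY)
		;	/* spin until the producer publishes */
	*my_data = data;
	return NULL;
}

/* Hypothetical helper: run one handoff and return what thread 2 saw. */
int run_handoff(void)
{
	pthread_t t1, t2;
	int my_data = -1;
	int rc;

	data = -1;
	atomic_store(&status, NOT_READY);

	rc = pthread_create(&t2, NULL, consumer, &my_data);
	assert(rc == 0);
	rc = pthread_create(&t1, NULL, producer, NULL);
	assert(rc == 0);
	(void)rc;

	pthread_join(t1, NULL);
	pthread_join(t2, NULL);
	return my_data;
}
```

With the acquire/release pair in place, thread 2 can only see data == 123
once it has seen status == READY. Depending on the platform, this may
need to be built with -pthread.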

I actually found a race condition close to the one mentioned above in
the kernel yesterday. So we may need to add fences real soon or rewrite
the code to use a spin mutex.

Stephan