From owner-freebsd-smp  Tue Jan  2 20:05:36 2001
From owner-freebsd-smp@FreeBSD.ORG  Tue Jan  2 20:05:33 2001
Return-Path: <owner-freebsd-smp@FreeBSD.ORG>
Delivered-To: freebsd-smp@freebsd.org
Received: from fast.cs.utah.edu (fast.cs.utah.edu [155.99.212.1])
	by hub.freebsd.org (Postfix) with ESMTP
	id 4AEFB37B402; Tue,  2 Jan 2001 20:05:33 -0800 (PST)
Received: (from vanmaren@localhost)
	by fast.cs.utah.edu (8.9.1/8.9.1) id VAA16941;
	Tue, 2 Jan 2001 21:05:32 -0700 (MST)
Date: Tue, 2 Jan 2001 21:05:32 -0700 (MST)
From: Kevin Van Maren <vanmaren@fast.cs.utah.edu>
Message-Id: <200101030405.VAA16941@fast.cs.utah.edu>
To: jhb@freebsd.org
Subject: Re: atomic increment?
Cc: smp@freebsd.org
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

John,

> > Note that the only atomic_load and atomic_store primitives are those that
> > include memory barriers (and I think they are broken on the x86 for that
> > matter; they need to use a lock'd cmpxchgl in the load case and a lock'd xchgl
> > in the store case I think.)

Okay, I finally got it!

These are also in the man page:
>    The atomic_load() functions always have acquire semantics.
along with an example using atomic_load() instead of atomic_load_acq().

>    The atomic_store() functions always have release semantics.

So although atomic_load() is used in an example in the man page,
there is no prototype for atomic_load()/atomic_store(), nor is there
an implementation.  You are saying that the ONLY forms are
atomic_load_acq and atomic_store_rel; plain atomic_load and
atomic_store don't exist.  I think the man page could use some
minor clarification.


My point about the acquire/release "bugs" is that incorrect code
will still work on IA32 because of its stricter memory ordering
guarantees.  Getting every use correct is non-trivial, especially if
there are a lot of them, or if other code changes around the atomic
op.  It is certainly easiest to program if all atomic ops have
acquire/release semantics, but that performs sub-optimally on IA64.
So I guess that is one more thing against the use of atomic
operations.
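
To make the acquire/release pairing concrete, here is a minimal
sketch of the usual "publish a value behind a flag" pattern.  It
uses C11 <stdatomic.h> rather than the FreeBSD atomic(9) functions,
and the names payload, ready, publish() and try_consume() are
made up for illustration.  If either ordering were weakened to
relaxed, the code would often still appear to work on IA32, which is
exactly the kind of latent bug described above.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical names, for illustration only. */
static int payload;		/* ordinary, non-atomic data */
static atomic_bool ready;	/* guards payload; starts out false */

/*
 * Producer: write the data first, then set the flag with release
 * semantics so the payload write cannot be reordered after it.
 */
static void
publish(int value)
{
	payload = value;
	atomic_store_explicit(&ready, true, memory_order_release);
}

/*
 * Consumer: read the flag with acquire semantics so the payload
 * read cannot be reordered before it.  Returns false if nothing
 * has been published yet.
 */
static bool
try_consume(int *out)
{
	if (!atomic_load_explicit(&ready, memory_order_acquire))
		return (false);
	*out = payload;
	return (true);
}
```

The release store pairs with the acquire load: a consumer that sees
ready set is guaranteed to see the payload written before it.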

The load/cmpxchg code shouldn't be too hard: you just need a scratch
register set equal to eax (ANY garbage value will do), and then you
LOCK cmpxchg it with the address.  The value read ends up in (part
of, for 8/16 bit ops) eax.  That is one register-register mov and one
atomic memory RMW cycle.  It works because Intel always does the RMW
cycle: it writes back the original value if the cmp fails, and
either way eax ends up containing the value that was in memory.
Should be a pretty efficient inline asm.
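
A rough sketch of what that inline asm could look like for the
32-bit case, in GCC extended-asm syntax, together with the xchgl
store suggested in the quoted text.  The names sketch_load_acq_32()
and sketch_store_rel_32() are hypothetical; this is not the actual
FreeBSD code.  The non-x86 branch falls back to the GCC/Clang
__atomic builtins only so the sketch compiles elsewhere.

```c
#include <stdint.h>

/* Hypothetical helper: atomic load via LOCK cmpxchgl. */
static inline uint32_t
sketch_load_acq_32(volatile uint32_t *p)
{
#if defined(__i386__) || defined(__x86_64__)
	uint32_t res = 0;	/* comparand in eax; any value works */

	/*
	 * The source register holds the same value as eax.  If *p
	 * equals eax, the same value is stored back; otherwise eax
	 * is loaded from *p.  Either way eax ends up holding *p.
	 */
	__asm__ __volatile__(
	    "lock; cmpxchgl %2, %1"
	    : "+a" (res), "+m" (*p)
	    : "r" (res)
	    : "memory", "cc");
	return (res);
#else
	return (__atomic_load_n(p, __ATOMIC_ACQUIRE));
#endif
}

/* Hypothetical helper: atomic store via xchgl. */
static inline void
sketch_store_rel_32(volatile uint32_t *p, uint32_t v)
{
#if defined(__i386__) || defined(__x86_64__)
	/* xchg with a memory operand is implicitly locked. */
	__asm__ __volatile__(
	    "xchgl %0, %1"
	    : "+r" (v), "+m" (*p)
	    :
	    : "memory");
#else
	__atomic_store_n(p, v, __ATOMIC_RELEASE);
#endif
}
```

The "memory" clobber keeps the compiler from moving other loads and
stores across the asm statement; the 8- and 16-bit variants would
use cmpxchgb/cmpxchgw with al/ax instead.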

Do you want me to send in a patch?


At least the ia64 atomic code does "the right thing" with acquire/release
semantics on load/store.

Kevin

