Date: Tue, 2 Jan 2001 21:05:32 -0700 (MST)
From: Kevin Van Maren
Message-Id: <200101030405.VAA16941@fast.cs.utah.edu>
To: jhb@freebsd.org
Cc: smp@freebsd.org
Subject: Re: atomic increment?

John,

> > Note that the only atomic_load and atomic_store primitives are those that
> > include memory barriers (and I think they are broken on the x86 for that
> > matter; they need to use a lock'd cmpxchgl in the load case and a lock'd xchgl
> > in the store case I think.)

Okay, I finally got it!  These are also in the man page:

> The atomic_load() functions always have acquire semantics.

along with an example using atomic_load() instead of atomic_load_acq().

> The atomic_store() functions always have release semantics.

So while atomic_load() is used in an example in the man page, there is
no prototype for atomic_load/atomic_store, nor is there an
implementation.  You are saying that the ONLY form is atomic_store_rel
instead of atomic_store (which doesn't exist).  I think the man page
could use some minor clarification.

My point about the acquire/release "bugs" is that incorrect code will
still work on IA32 because of its stricter memory ordering guarantees.
Getting every use correct is non-trivial, especially if there are a lot
of them, or if other code changes around the atomic op.  Programming is
certainly easiest if all atomic ops have acquire/release semantics, but
that performs sub-optimally on IA64.  So I guess that is one more thing
against the use of atomic operations.
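To make the acquire/release point concrete, here is a minimal sketch of
a flag-protected handoff.  It uses C11 <stdatomic.h> as a stand-in for
the atomic_store_rel/atomic_load_acq forms discussed above; that
substitution is an assumption, and publish(), try_consume(), payload,
ready and handoff_selfcheck() are hypothetical names, not anything from
the FreeBSD tree.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical shared state: a payload guarded by a "ready" flag. */
static int payload;
static atomic_int ready;

/* Producer: write the data first, then publish it with a release store,
 * so the payload write cannot be reordered after the flag write. */
void publish(int value)
{
    payload = value;
    atomic_store_explicit(&ready, 1, memory_order_release);
}

/* Consumer: acquire load of the flag; the payload is read only after
 * the flag has been observed as set. */
bool try_consume(int *out)
{
    if (atomic_load_explicit(&ready, memory_order_acquire) == 0)
        return false;
    *out = payload;
    return true;
}

/* Single-threaded sanity check.  It exercises the API but cannot
 * demonstrate ordering, which only matters across CPUs. */
void handoff_selfcheck(void)
{
    int v = 0;
    assert(!try_consume(&v));
    publish(42);
    assert(try_consume(&v) && v == 42);
}
```

If both operations were weakened to memory_order_relaxed, IA32 hardware
would typically still keep the two stores and the two loads in order, so
the mistake would stay hidden there (compiler reordering aside), while a
weakly ordered machine such as IA64 would be free to reorder them.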
The load/cmpxchg code shouldn't be too hard: you just need a scratch
register, set it equal to eax (which can hold ANY garbage value), and
LOCK cmpxchg it with the address.  The value read is placed in (part of,
for 8/16 bit ops) eax.  That is one register-register mov and one atomic
memory RMW cycle.  It works because Intel always does the RMW cycle: it
writes back the original value if the cmp fails, so memory is unchanged
either way and eax will always contain the value that was in memory.
Should be a pretty efficient inline asm.  Do you want me to send in a
patch?

At least the ia64 atomic code does "the right thing" with
acquire/release semantics on load/store.

Kevin
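For reference, the load/cmpxchg sequence described above might look
roughly like the following GCC-style inline asm.  This is only a sketch
under stated assumptions, not the actual patch: load_acq_cmpxchg_32()
and load_demo() are hypothetical names, the 32-bit width is picked for
illustration, and the non-x86 branch exists only so the example builds
elsewhere.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (the name is not from atomic.h): read *p using a
 * locked cmpxchg, so the load is performed as an atomic RMW cycle.
 * Because the location is always written, it must be writable memory.
 */
static inline uint32_t load_acq_cmpxchg_32(volatile uint32_t *p)
{
#if defined(__i386__) || defined(__x86_64__)
    uint32_t res = 0;        /* compare value in eax; any value works */
    uint32_t scratch = res;  /* scratch register set equal to eax */

    __asm__ __volatile__(
        "lock; cmpxchgl %2, %1"
        : "+a" (res),        /* eax: compare value in, memory value out */
          "+m" (*p)          /* rewritten with its own value either way */
        : "r" (scratch)
        : "memory", "cc");
    return res;
#else
    /* Assumed fallback for other targets (GCC/Clang builtin). */
    return __atomic_load_n(p, __ATOMIC_ACQUIRE);
#endif
}

/* Usage: both the "equal" and "not equal" compare outcomes return the
 * value in memory and leave memory unchanged. */
void load_demo(void)
{
    volatile uint32_t word = 0;      /* equals the compare value */
    assert(load_acq_cmpxchg_32(&word) == 0 && word == 0);

    word = 0xdeadbeefu;              /* differs from the compare value */
    assert(load_acq_cmpxchg_32(&word) == 0xdeadbeefu);
    assert(word == 0xdeadbeefu);
}
```

If the compare succeeds, the scratch value (equal to eax, hence equal to
memory) is stored; if it fails, eax is loaded from memory.  In both
cases the result register ends up holding the memory contents.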