Date: Thu, 6 May 2004 10:15:44 -0400 (EDT)
From: Andrew Gallatin <gallatin@cs.duke.edu>
To: Don Bowman <don@sandvine.com>
Cc: Gerrit Nagelhout <gnagelhout@sandvine.com>
Subject: RE: 4.7 vs 5.2.1 SMP/UP bridging performance
Message-ID: <16538.18576.320694.79356@grasshopper.cs.duke.edu>
In-Reply-To: <FE045D4D9F7AED4CBFF1B3B813C85337045D8CB5@mail.sandvine.com>
References: <FE045D4D9F7AED4CBFF1B3B813C85337045D8CB5@mail.sandvine.com>
Don Bowman writes:
 >
 > On the P4, there are mfence,lfence,sfence instructions to enforce
 > memory ordering. These are cheaper than "lock; andl" or "cpuid",
 > which are the traditional 'sync' instructions.

For what it's worth, using those operations yields these results
on my 2.53GHz P4 (for UP):

Mutex (atomic_store_rel_int) cycles per iteration: 208
Mutex (sfence) cycles per iteration: 85
Mutex (lfence) cycles per iteration: 63
Mutex (mfence) cycles per iteration: 169
Mutex (none) cycles per iteration: 18

lfence looks like a winner.

Drew
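For anyone who wants to try a comparison like the one above, here is a
minimal sketch of a user-space microbenchmark. It is not the test
program used for the numbers in this message. The names fence_kind,
do_fence and ns_per_iter are made up for illustration, the "locked"
case uses a C11 atomic exchange as a stand-in for atomic_store_rel_int,
and it reports nanoseconds per iteration (via timespec_get) rather than
cycles. On non-x86 builds it falls back to C11 fences, so the numbers
are only comparable on the same machine and compiler.

```c
#include <stdatomic.h>
#include <time.h>

#if defined(__x86_64__) || (defined(__i386__) && defined(__SSE2__))
#include <emmintrin.h>          /* _mm_lfence, _mm_mfence, _mm_sfence */
#define HAVE_X86_FENCES 1
#else
#define HAVE_X86_FENCES 0
#endif

/* Hypothetical labels for the variants being compared. */
enum fence_kind {
    FENCE_NONE, FENCE_SFENCE, FENCE_LFENCE, FENCE_MFENCE, FENCE_LOCKED
};

static volatile int lock_word;  /* stand-in for a mutex word */
static atomic_int locked_word;  /* target of the locked operation */

static inline void do_fence(enum fence_kind k)
{
    switch (k) {
    case FENCE_NONE:
        break;
    case FENCE_LOCKED:
        /* Assumption: approximates a locked store; not the kernel's code. */
        (void)atomic_exchange(&locked_word, 0);
        break;
#if HAVE_X86_FENCES
    case FENCE_SFENCE: _mm_sfence(); break;
    case FENCE_LFENCE: _mm_lfence(); break;
    case FENCE_MFENCE: _mm_mfence(); break;
#else
    /* Portable fallbacks; these are not the x86 instructions. */
    case FENCE_SFENCE: atomic_thread_fence(memory_order_release); break;
    case FENCE_LFENCE: atomic_thread_fence(memory_order_acquire); break;
    case FENCE_MFENCE: atomic_thread_fence(memory_order_seq_cst); break;
#endif
    }
}

/* Time iters rounds of "take lock word, fence, release lock word". */
static double ns_per_iter(enum fence_kind k, long iters)
{
    struct timespec t0, t1;

    timespec_get(&t0, TIME_UTC);
    for (long i = 0; i < iters; i++) {
        lock_word = 1;          /* volatile store keeps the loop alive */
        do_fence(k);
        lock_word = 0;
    }
    timespec_get(&t1, TIME_UTC);

    double ns = (double)(t1.tv_sec - t0.tv_sec) * 1e9
              + (double)(t1.tv_nsec - t0.tv_nsec);
    return ns / (double)iters;
}
```

Calling ns_per_iter(FENCE_LFENCE, 10000000) and the other variants from
a small main() and printing the results gives a rough per-iteration
cost. The switch inside the loop adds a small constant overhead to every
variant, so the differences between variants are more meaningful than
the absolute values.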
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?16538.18576.320694.79356>