From owner-freebsd-alpha@FreeBSD.ORG Thu Aug 7 12:02:17 2003
Date: Thu, 7 Aug 2003 15:02:14 -0400
From: "Portante, Peter"
cc: alpha@freebsd.org
Subject: RE: Atomic swap
List-Id: Porting FreeBSD to the Alpha

Dan,

> ----------
> From: Daniel Eischen
> Reply To: deischen@freebsd.org
> Sent: Thursday, August 7, 2003 1:44 PM
> To: Portante, Peter
> Cc: alpha@freebsd.org; deischen@freebsd.org
> Subject: RE: Atomic swap
>
> On Thu, 7 Aug 2003, Portante, Peter wrote:
>
> > Dan,
> >
> > I don't think you want to do the stq_c if the location already holds the
> > same value.
> > Instead, check the loaded value to see if it is the same as the
>
> The purpose of the atomic swap is to make a FIFO queueing
> list.  The values should never be the same.  It's not meant
> to be used as test_and_set.
>
Reasonable.  We had a major performance bug in our code when we assumed a
routine performed a certain way based on its name.  You might want to
change the name, because an atomic swap long could be used to implement a
mutex if one didn't know better, and then this code will tube an MP system
under contention.

> > value to be stored, and branch out of the loop returning the result if
> > they are the same.  And starting with EV56, the need to do the branch
> > forward/branch back logic has been removed.  And EV6 and later CPUs do such
> > a good job predicting the branching that it is not worth the instruction
> > stream space when that space can be used to avoid a stq_c.
> >
> > Additionally, the stq_c destroys the contents of %2, so you need to move the
> > value in %2 into another register for use in the stq_c.  I don't know how to
> > do that in the ASM, so I just used raw register names below, highlighted in
> > red.
>
> How about this?
>
Not too bad, except every time you loop you make another memory reference
to get the value.  If you load it into a register once, you can just move
it into place each time before the store without referencing memory.  For
performance, don't reference memory unless you absolutely have to.  Also,
you might want to issue a ldq, once, before the actual loop of ldq_l so
that the processor gets the cache line using the normal load instruction,
avoiding the heavier load-locked logic.

I just read Marcel's note, and his code looks pretty good.  Just add a ldq
before the "1: ldq_l" and that code will perform quite well.  If you don't
want to add the ldq to the asm, just read the destination value before
calling atomic_swap_long(); it will really help this perform well.

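As a rough illustration of the suggestions above, here is a portable C11
sketch rather than Alpha assembly.  It is not the atomic_swap_long() under
discussion: the name swap_long_sketch is made up for this example, and it
assumes the compiler turns the relaxed load into an ordinary load (the ldq)
and the compare-exchange into the ldq_l/stq_c loop.  The value to store
stays in a local, so retries need not go back to memory for it.

```c
#include <stdatomic.h>

/*
 * Hypothetical sketch only: store val into *p and return the previous
 * contents.  The name and the C11 formulation are assumptions made for
 * illustration, not code from this thread.
 */
long
swap_long_sketch(_Atomic long *p, long val)
{
	/* Ordinary read first, so the cache line arrives via a normal load. */
	long old = atomic_load_explicit(p, memory_order_relaxed);

	for (;;) {
		/* Location already holds val: return without storing. */
		if (old == val)
			return (old);

		/*
		 * Try to replace old with val.  On failure the call
		 * refreshes old with the current contents and we retry;
		 * val itself is never re-read from memory.
		 */
		if (atomic_compare_exchange_weak(p, &old, val))
			return (old);
	}
}
```

A caller that cannot change the routine can get the cache-line effect by
reading the destination once just before the call.
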
-Peter