From owner-freebsd-arch Fri Jan 4 3:19:50 2002 Delivered-To: freebsd-arch@freebsd.org Received: from srv1.cosmo-project.de (srv1.cosmo-project.de [213.83.6.106]) by hub.freebsd.org (Postfix) with ESMTP id DDF3937B405; Fri, 4 Jan 2002 03:19:43 -0800 (PST) Received: (from uucp@localhost) by srv1.cosmo-project.de (8.11.6/8.11.6) with UUCP id g04BJG452612; Fri, 4 Jan 2002 12:19:16 +0100 (CET) (envelope-from ticso@cicely8.cicely.de) Received: from mail.cicely.de (cicely20.cicely.de [10.1.1.22]) by cicely5.cicely.de (8.12.1/8.12.1) with ESMTP id g04BF6tx057433; Fri, 4 Jan 2002 12:15:07 +0100 (CET)?g (envelope-from ticso@cicely8.cicely.de) Received: from cicely8.cicely.de (cicely8.cicely.de [10.1.2.10]) by mail.cicely.de (8.11.0/8.11.0) with ESMTP id g04BF6W15714; Fri, 4 Jan 2002 12:15:06 +0100 (CET) Received: (from ticso@localhost) by cicely8.cicely.de (8.11.6/8.11.6) id g04BF1q08440; Fri, 4 Jan 2002 12:15:01 +0100 (CET) (envelope-from ticso) Date: Fri, 4 Jan 2002 12:15:01 +0100 From: Bernd Walter To: Terry Lambert Cc: Bernd Walter , Matthew Dillon , Alfred Perlstein , John Baldwin , arch@FreeBSD.ORG, Mike Smith , Bruce Evans , Michal Mertl , Peter Jeremy Subject: Re: When to use atomic_ functions? (was: 64 bit counters) Message-ID: <20020104121500.E5294@cicely8.cicely.de> References: <200201030002.g0302Eo60575@apollo.backplane.com> <20020102180734.A82406@elvis.mu.org> <200201030012.g030Cgp60752@apollo.backplane.com> <3C33D3E3.38E758CA@mindspring.com> <20020103094239.GH53199@cicely9.cicely.de> <3C34C27C.C7FE6713@mindspring.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <3C34C27C.C7FE6713@mindspring.com> User-Agent: Mutt/1.3.23i X-Operating-System: FreeBSD cicely8.cicely.de 5.0-CURRENT i386 Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Thu, Jan 03, 2002 at 12:43:40PM -0800, Terry Lambert wrote: > Bernd Walter wrote: > > > In any case, multiple reader, single write doesn't need locks > > > unless it has to be done atomically. > > > > Single write means always with the same CPU or thread. > > Yes, of course. > > > If you always write with the same CPU the other might keep an > > old value in his cache for an unknown long time. > > You simply trust that caches gets syncronised automaticaly by > > regular events like context switching. > > I'm not shure this is valid. > > Actually, I'd suggest putting it in a non-cacheable page, if > multiple CPUs needed to get at it, and it must be 100% accurate > at all times; if it were in a cacheable page, there would be an > Invalidate (MESI, remember?), and the cache line in the other > CPUs that had it cached would be flushed. Only if you have a coherent cache design. On alphas you explicitly have to request that either based on the cacheline, what happens with ldx_l/stx_c, or globaly. Using non-cacheable is worse than using syncronisation primitives. > Personally, I'd suggest making the algorithms insensitive to > the data, in the neighborhood of 2 times the update latency. > > This is what I suggested to Alfred, when I suggested per CPU > run queues (and three years ago, when we had the threads > design meeting at the Whistle facility FreeBSD user group). I'm not commenting your example, because I'm far away from knowing the current alternative. > --- > > The point of this example is that the issue isn't about how much > or how little locks cost, but about designs that don't require > locks in the common case. > > So it really doesn't *matter* what the costs are up front, as > long as the costs are incurrect so infrequently that their > amortized value is lower than the lowest cost method being hit > much more frequently. Right - as long as the design is still correct. > In other words: Matt was right, when he said "Maybe we are looking > at this all wrong ... maybe we don't need locks ..." (paraphrasing). We have address 0x0 with initial value 1 and address 0x58 with inital value 2 Remember this scenario is possible: CPU-A write value 3 to 0x0 write value 4 to 0x58 CPU-B read value 4 from 0x58 read value 1 from 0x0 > > > An easy way to do this atomically, even for fairly large items, > > > is to toggle between an active and inactive structure of data, > > > where the pointer to the strucutre is assigned atomically. > > > > Only you use rel/acq behavour - atomic alone isn't strong enough. > > It depends on your cycle time on the objects. So long as the > update/read cost is very small relative to the update/read > frequency, you're OK. On alpha you only have special cost for the first access in a given cache-line. You savely access as often as you want as long as it's still locked in the cache without any performance loss other then strict odering of the locked/conditional instructions. What you realy want to avoid is CPU changing of that cache-line. But unless you 100% exclude another CPU accessing your values in any way you still need to care about syncronisation. > Basically, you need N copies, where the update/read frequency > divided by the update/read cost is greater than N*2. For my > example of a zero system call gettimeofday() and time(), using > a page with PG_U set on it to contain a reflection of the > timecounter data, N == 2 (the minimum N permissable). For > something else, N might be a larger value, and then instead of > two pointers for "current" and "previous", you are talking two > pointers for a two-handed clock through a circularly linked > list of structures containing the reflected data (the hands > are both moved on update-forward, so (1) the data is never > incorrect, and (2) the data is always one-behind during the > update process itself). Sorry - I can't follow you here enough to even see if it would at least work correct. -- B.Walter COSMO-Project http://www.cosmo-project.de ticso@cicely.de Usergroup info@cosmo-project.de To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message