Date: Tue, 13 Jul 1999 16:33:11 +1000
From: Peter Jeremy <jeremyp@gsmx07.alcatel.com.au>
To: dillon@apollo.backplane.com
Cc: freebsd-current@FreeBSD.ORG
Subject: Re: "objtrm" problem probably found (was Re: Stuck in "objtrm")
Message-ID: <99Jul13.161525est.40326@border.alcanet.com.au>
In-Reply-To: <199907130528.WAA74299@apollo.backplane.com>
Matthew Dillon <dillon@apollo.backplane.com> wrote:
>:[1] A locked instruction implies a synchronous RMW cycle.  In order
>:    to meet write-ordering guarantees (without which, a locked RMW
>:    cycle would be useless as a semaphore primitive), it implies a
>:    complete write serialization, and probably some level of
>:    instruction serialisation.  Since write-back pipelines will get
>
>    A locked instruction only implies cache coherency across the
>    instruction.  It does not imply serialization.  Intel blows it
>    big time, but that's intel for you.

Ooops, looks like foot-in-mouth time for me :-(.  Maybe I should have
said that "without any other cache coherency protocol, you need
serialisation" :-).

Given this correction, the lock degradation is much less than I
suggested.  I suspect there _will_ be gradual degradation though.

>:    longer and parallel execution units more numerous, the cost of
>:    a serialisation operation will get relatively higher.  Also,
>
>    It is not a large number of execution units that implies a higher
>    cost of serialization but instead data interdependancies.

I was thinking more that a locked instruction is inconsistent with
parallel execution, but that's probably not true either.

>    Modern cache coherency protocols do not have a problem with
>    a large number of caches in a parallel processing subsystem.

I thought we were talking about Intel :-).

>    So, using the above rules as an example, a locked instruction can cost
>    as little as 0 extra cycles no matter how many cpu's you have running
>    in parallel.  There is no need to serialize or synchronize anything.

Assuming a non-contested access.  If you've got two CPU's fighting over
a lock, then you'll have a bus cycle - and CPU core speeds are increasing
faster than bus speeds.  (486's were normally 1 or 2 times the bus speed,
a PIII-450 is 4.5 times bus speed).  And as you pointed out elsewhere,
call/return sequences can't get too much faster - which suggests that
the relative costs should stay fairly similar.
At least for a well-designed architecture...

Peter

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-current" in the body of the message
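
For readers following the thread, here is a minimal sketch of a locked read-modify-write used as a semaphore primitive, which is the kind of operation being discussed above. It is not code from the FreeBSD kernel or from either poster: the names tas_lock, tas_trylock, tas_acquire and tas_unlock are hypothetical, and it assumes a C11 compiler with <stdatomic.h>. On x86 the test-and-set typically compiles to an implicitly locked exchange, but that is a compiler detail and not something the sketch relies on.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical test-and-set lock; illustration only, not kernel code. */
typedef struct {
    atomic_flag held;   /* clear = free, set = held */
} tas_lock;

#define TAS_LOCK_INIT { ATOMIC_FLAG_INIT }

/* One atomic read-modify-write: set the flag and report whether the
 * lock was free beforehand.  Returns true if we now own the lock. */
static bool tas_trylock(tas_lock *l)
{
    return !atomic_flag_test_and_set_explicit(&l->held,
                                              memory_order_acquire);
}

/* Spin until the lock is acquired.  In the uncontended case the loop
 * body never runs; under contention every retry is another atomic RMW
 * on the same shared location. */
static void tas_acquire(tas_lock *l)
{
    while (!tas_trylock(l))
        ;   /* busy-wait */
}

/* Release the lock with a plain atomic store (no RMW needed). */
static void tas_unlock(tas_lock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

In the uncontended case tas_acquire() performs exactly one atomic operation, which corresponds to the "as little as 0 extra cycles" scenario in the quoted text. When two CPUs fight over the same tas_lock, each failed tas_trylock() repeats the read-modify-write, which is where the bus traffic Peter describes would come from.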