Date:      Tue, 13 Jul 1999 16:33:11 +1000
From:      Peter Jeremy <jeremyp@gsmx07.alcatel.com.au>
To:        dillon@apollo.backplane.com
Cc:        freebsd-current@FreeBSD.ORG
Subject:   Re: "objtrm" problem probably found (was Re: Stuck in "objtrm")
Message-ID:  <99Jul13.161525est.40326@border.alcanet.com.au>
In-Reply-To: <199907130528.WAA74299@apollo.backplane.com>

Matthew Dillon <dillon@apollo.backplane.com> wrote:
>:[1] A locked instruction implies a synchronous RMW cycle.  In order
>:    to meet write-ordering guarantees (without which, a locked RMW
>:    cycle would be useless as a semaphore primitive), it implies a
>:    complete write serialization, and probably some level of
>:    instruction serialisation.  Since write-back pipelines will get
>
>    A locked instruction only implies cache coherency across the 
>    instruction.  It does not imply serialization.  Intel blows it
>    big time, but that's intel for you.

Oops, looks like foot-in-mouth time for me :-(.

Maybe I should have said that "without any other cache coherency
protocol, you need serialisation" :-).

Given this correction, the lock degradation is much less than I
suggested.  I suspect there _will_ be gradual degradation though.

>:    longer and parallel execution units more numerous, the cost of
>:    a serialisation operation will get relatively higher.  Also,
>
>    It is not a large number of execution units that implies a higher
>    cost of serialization but instead data interdependencies.

I was thinking more that a locked instruction is inconsistent with
parallel execution, but that's probably not true either.

>    Modern cache coherency protocols do not have a problem with 
>    a large number of caches in a parallel processing subsystem.

I thought we were talking about Intel :-).

>    So, using the above rules as an example, a locked instruction can cost
>    as little as 0 extra cycles no matter how many cpu's you have running
>    in parallel.  There is no need to serialize or synchronize anything.

Assuming a non-contested access.  If you've got two CPUs fighting
over a lock, then each acquisition costs at least one bus cycle - and
CPU core speeds are increasing faster than bus speeds.  (486s normally
ran at 1 or 2 times the bus speed; a PIII-450 runs at 4.5 times bus
speed.)
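
To put rough numbers on that, here is a small sketch of the arithmetic.
The clock figures are nominal ones I'm assuming for illustration (a
33MHz 486DX, a 66MHz 486DX2 on a 33MHz bus, a PIII-450 on a 100MHz bus),
and "one bus cycle per contested lock" is a deliberately optimistic
simplification, not a measurement.  The function names are made up.

```python
def core_cycles_per_bus_cycle(core_mhz: float, bus_mhz: float) -> float:
    """Core clock cycles that elapse during a single bus clock cycle."""
    return core_mhz / bus_mhz


def contended_lock_cost(core_mhz: float, bus_mhz: float, bus_cycles: int = 1) -> float:
    """Core cycles spent waiting if a contested lock needs `bus_cycles` bus cycles.

    Assumption: the core simply stalls for the whole bus transaction.
    """
    return bus_cycles * core_cycles_per_bus_cycle(core_mhz, bus_mhz)


# Nominal (core MHz, bus MHz) pairs, assumed for illustration only.
CPUS = [
    ("486DX-33", 33.0, 33.0),
    ("486DX2-66", 66.0, 33.0),
    ("PIII-450", 450.0, 100.0),
]

for name, core_mhz, bus_mhz in CPUS:
    cost = contended_lock_cost(core_mhz, bus_mhz)
    print(f"{name}: {cost:.1f} core cycles per contested bus cycle")
```

The same single bus cycle that cost a 486 one or two core cycles costs
the PIII 4.5, which is the gradual degradation I had in mind.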

And as you pointed out elsewhere, call/return sequences can't get
too much faster - which suggests that the relative costs should stay
fairly similar.  At least for a well-designed architecture...

Peter
