Date:      Sun, 2 Feb 1997 15:02:38 -0500
From:      "David S. Miller" <davem@jenolan.rutgers.edu>
To:        terry@lambert.org
Cc:        michaelh@cet.co.jp, netdev@roxanne.nuclecu.unam.mx, roque@di.fc.ul.pt, freebsd-smp@freebsd.org, smpdev@roxanne.nuclecu.unam.mx
Subject:   Re: SMP
Message-ID:  <199702022002.PAA18631@jenolan.caipgeneral>
In-Reply-To: <199702021944.MAA08273@phaeton.artisoft.com> (message from Terry Lambert on Sun, 2 Feb 1997 12:44:50 -0700 (MST))

   From: Terry Lambert <terry@lambert.org>
   Date: Sun, 2 Feb 1997 12:44:50 -0700 (MST)

   For instance, I get a page of memory, and I allocate two 50 byte
   objects out of it.  I modify both of the objects, then I give the
   second object to another processor.  The other processor modifies
   the second object and the first processor modifies the first object
   again.

   In theory, there will be a cache overlap, where the cache line on
   CPU 1 contains stale data for object two, and the cache line on CPU
   2 contains stale data for object one.  When either cache line is
   written through, the other object will be damaged, right?  Not
   immediately, but in the case of a cache line reload.  In other
   words, in the general case of a process context switch with a
   rather full ready-to-run-queue, with the resource held such that it
   goes out of scope in the cache.

   How do you resolve this?

Unless your hardware cache coherency support _really_ blows (i.e. the
architecture is essentially uninteresting), the most recent
modification will claim ownership in that processor's cache; it's
called hardware cache coherency, last time I checked.

For example, if what you are describing is:

cpu1:

	obj1 = foo_alloc();
	obj2 = foo_alloc();

	dirty_up(obj1);
	dirty_up(obj2);

	/* obj2 is no longer ours. */
	pass_reference(obj2, cpu2);

cpu2:

	obj2 = get_reference_from(cpu1);
	dirty_up(obj2);

cpu2 will see cpu1's modifications to obj2, plus whatever modifications
it has made on top of cpu1's.  Unless you have really bad cache
coherency hardware, this is guaranteed.
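As a minimal sketch of the handoff above in runnable form: the names
foo_alloc(), dirty_up(), and pass_reference() are the placeholders from
the example, stood in here (purely as assumptions) by malloc, plain
field writes, and a one-slot mutex-protected mailbox between two POSIX
threads playing the roles of cpu1 and cpu2.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

struct foo { int a, b; };

static struct foo *slot;               /* one-slot mailbox for the handoff */
static pthread_mutex_t slot_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  slot_cv   = PTHREAD_COND_INITIALIZER;

/* stand-in for pass_reference(obj2, cpu2) */
static void pass_reference(struct foo *obj)
{
    pthread_mutex_lock(&slot_lock);
    slot = obj;                        /* obj2 is no longer ours */
    pthread_cond_signal(&slot_cv);
    pthread_mutex_unlock(&slot_lock);
}

/* stand-in for get_reference_from(cpu1) */
static struct foo *get_reference(void)
{
    pthread_mutex_lock(&slot_lock);
    while (slot == NULL)
        pthread_cond_wait(&slot_cv, &slot_lock);
    struct foo *obj = slot;
    pthread_mutex_unlock(&slot_lock);
    return obj;
}

static void *cpu2_thread(void *arg)
{
    struct foo *obj2 = get_reference();
    assert(obj2->a == 1);              /* cpu2 sees cpu1's store: coherency
                                          plus the lock's implied barriers */
    obj2->b = 2;                       /* modification on top of cpu1's */
    return NULL;
}

int run_handoff(void)
{
    struct foo *obj1 = malloc(sizeof *obj1);   /* foo_alloc() */
    struct foo *obj2 = malloc(sizeof *obj2);   /* possibly same cache line */

    obj1->a = 1; obj1->b = 0;          /* dirty_up(obj1) */
    obj2->a = 1; obj2->b = 0;          /* dirty_up(obj2) */

    pthread_t t;
    pthread_create(&t, NULL, cpu2_thread, NULL);
    pass_reference(obj2);
    pthread_join(t, NULL);

    obj1->a = 99;                      /* cpu1 keeps modifying obj1 */

    int ok = (obj2->b == 2 && obj1->a == 99);  /* neither object damaged */
    free(obj1);
    free(obj2);
    return ok;
}
```

The point of the check at the end is exactly the mail's claim: even if
obj1 and obj2 share a cache line, coherency means both threads' stores
survive, with no "stale line write-through" clobbering the neighbor.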

The only exception is with store buffers: on some implementations the
memory model allows either:

	1) stores are executed out of order
	2) the store buffer is not snooped during cache transactions

In such a case, most of this ugliness is hidden behind memory barrier
instructions (either explicitly in the code, or inside of the locking
primitive implementations themselves).
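A sketch of that hiding, assuming C11 atomics as the barrier vocabulary
(the names producer/consumer/payload/ready are mine, not from the
thread): the release store plays the role of the barrier a lock-release
would issue, forcing the store buffer to make the payload store visible
before the flag store.

```c
#include <pthread.h>
#include <stdatomic.h>

static int payload;                    /* plain data, written before the flag */
static atomic_int ready;               /* the flag / "lock release" */
static int seen;                       /* what the consumer observed */

static void *producer(void *arg)
{
    payload = 42;                      /* ordinary store */
    /* Without this barrier, a weakly-ordered CPU could drain the flag
       store from its store buffer before the payload store. */
    atomic_store_explicit(&ready, 1, memory_order_release);
    return NULL;
}

static void *consumer(void *arg)
{
    /* The acquire load pairs with the release store above. */
    while (!atomic_load_explicit(&ready, memory_order_acquire))
        ;                              /* spin until the flag is visible */
    seen = payload;                    /* guaranteed to observe 42 */
    return NULL;
}

int run_barrier_demo(void)
{
    atomic_store(&ready, 0);
    pthread_t p, c;
    pthread_create(&c, NULL, consumer, NULL);
    pthread_create(&p, NULL, producer, NULL);
    pthread_join(p, NULL);
    pthread_join(c, NULL);
    return seen;
}
```

Drop the barrier (make both accesses relaxed) and the consumer may
legally read a stale payload on a machine with an unsnooped or
out-of-order store buffer, which is exactly case 1/2 above.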

Or are you describing something completely different?

---------------------------------------------////
Yow! 11.26 MB/s remote host TCP bandwidth & ////
199 usec remote TCP latency over 100Mb/s   ////
ethernet.  Beat that!                     ////
-----------------------------------------////__________  o
David S. Miller, davem@caip.rutgers.edu /_____________/ / // /_/ ><


