Date:      Fri, 03 Jun 2005 21:26:02 -0600
From:      Scott Long <scottl@samsco.org>
To:        Matthew Dillon <dillon@apollo.backplane.com>
Cc:        freebsd-hackers@freebsd.org, John-Mark Gurney <gurney_j@resnet.uoregon.edu>
Subject:   Re: Possible instruction pipelining problem between HT's on the same die ?
Message-ID:  <42A11F4A.40502@samsco.org>
In-Reply-To: <200506040257.j542veCm063487@apollo.backplane.com>
References:  <200506032057.j53KvOFw062012@apollo.backplane.com> <20050604021812.GG594@funkthat.com> <200506040257.j542veCm063487@apollo.backplane.com>

Matthew Dillon wrote:
[...]
>     It seems so unlikely that this could occur across physical cpus that
>     I was not surprised at all by this.  But 16 instructions seemed unlikely
>     to me.  The only scenario I can come up with is that the READ SIDE on
>     the HT cpu (logical cpu #1) did a speculative read of B before logical
>     cpu #0 wrote to it, then somehow held that speculative read for 16 
>     whole instructions on logical cpu #1.
> 
>     Is that even possible ?  holding speculative read data across
>     16 instructions ?

Yes

> 
>     The only other possibility is that there are major interactions in the
>     instruction pipeline and cpu #1 is reading e.g. the index B from the
>     pipeline or write buffer and data A from memory prior to data A being
>     retired to memory by cpu #0.  That seems ridiculous to me, but I 
>     wonder if it's possible without an SFENCE.
> 
>     This crash occurs fairly rarely.  It takes a lot of packets for it to
>     occur... perhaps a million or more.
> 
>     In any case, we are now testing a kernel with a locked bus cycle between
>     the READ B and the READ A to see if that fixes the problem.  If that
>     doesn't work I will put an SFENCE between the WRITE A and the WRITE B.
>     And if that doesn't work then I'm shooting up the wrong alley and it
>     isn't an instruction/memory ordering issue.

I would expect that putting the fence on the write side will solve the
problem.  As Stephen discussed, the writes land in a store buffer for a
period of time; a fence on the writing CPU flushes that buffer and makes
the writes visible to the other CPUs.  A fence on the reading CPU has no
effect on the writing CPU's store buffer and would be a waste of time.
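
For illustration only, here is a minimal sketch of the write-side ordering
described above (WRITE A, fence, WRITE B).  It uses portable C11 atomics as
a stand-in for a raw SFENCE.  The names (struct ring, ring_publish,
ring_consume, RING_SIZE) are hypothetical and are not taken from the kernel
code under discussion.  The acquire load on the read side is there because
the C11 memory model requires it for the pairing to be well defined; that
is an assumption of this sketch, not something claimed above.  The sketch
also assumes a single writer and a reader that keeps up, so it does no
overflow checking.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE 16u            /* hypothetical ring size */

struct ring {
    uint32_t    data[RING_SIZE]; /* "A": the payload slots */
    atomic_uint windex;          /* "B": index published to the reader */
    unsigned    rindex;          /* reader-private position */
};

/* Write side: WRITE A, fence, WRITE B (single writer assumed). */
static void ring_publish(struct ring *r, uint32_t value)
{
    unsigned w = atomic_load_explicit(&r->windex, memory_order_relaxed);

    r->data[w % RING_SIZE] = value;                       /* WRITE A */
    atomic_thread_fence(memory_order_release);            /* write-side fence */
    atomic_store_explicit(&r->windex, w + 1,
                          memory_order_relaxed);          /* WRITE B */
}

/* Read side: READ B, then READ A.  Returns 1 if a value was read. */
static int ring_consume(struct ring *r, uint32_t *out)
{
    unsigned w;

    assert(out != NULL);
    w = atomic_load_explicit(&r->windex, memory_order_acquire); /* READ B */
    if (r->rindex == w)
        return 0;                                         /* nothing new */
    *out = r->data[r->rindex % RING_SIZE];                /* READ A */
    r->rindex++;
    return 1;
}
```

The point of the sketch is only the placement of the fence: the payload
store is ordered before the index store, so a reader that observes the new
index also observes the payload it refers to.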

Another thing to keep in mind is that the protocol here is no different
between HT and non-HT SMP.  While HT cores share execution units, they
DO NOT share registers, store buffers, or cache (at least, not in a way
that is visible outside of the low-level implementation of the chip).

Scott


