From owner-freebsd-smp  Sun Nov  9 15:40:13 1997
Return-Path: <owner-freebsd-smp>
Received: (from root@localhost)
          by hub.freebsd.org (8.8.7/8.8.7) id PAA14949
          for smp-outgoing; Sun, 9 Nov 1997 15:40:13 -0800 (PST)
          (envelope-from owner-freebsd-smp)
Received: from ns.mt.sri.com (SRI-56K-FR.mt.net [206.127.65.42])
          by hub.freebsd.org (8.8.7/8.8.7) with ESMTP id PAA14930;
          Sun, 9 Nov 1997 15:40:04 -0800 (PST)
          (envelope-from nate@rocky.mt.sri.com)
Received: from rocky.mt.sri.com (rocky.mt.sri.com [206.127.76.100])
	by ns.mt.sri.com (8.8.7/8.8.7) with ESMTP id QAA12001;
	Sun, 9 Nov 1997 16:39:55 -0700 (MST)
Received: (from nate@localhost) by rocky.mt.sri.com (8.7.5/8.7.3) id QAA06530; Sun, 9 Nov 1997 16:39:53 -0700 (MST)
Date: Sun, 9 Nov 1997 16:39:53 -0700 (MST)
Message-Id: <199711092339.QAA06530@rocky.mt.sri.com>
From: Nate Williams <nate@mt.sri.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
To: "John S. Dyson" <toor@dyson.iquest.net>
Cc: nate@mt.sri.com (Nate Williams), perlsta@cs.sunyit.edu,
        gpalmer@freebsd.org, freebsd-smp@freebsd.org
Subject: Re: Best processor?
In-Reply-To: <199711092159.QAA27125@dyson.iquest.net>
References: <199711092131.OAA05995@rocky.mt.sri.com>
	<199711092159.QAA27125@dyson.iquest.net>
X-Mailer: VM 6.29 under 19.15 XEmacs Lucid
Sender: owner-freebsd-smp@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

> > > dual 300mhz PIIs will beat dual PPro 200mhz
> > 
> > From the noise I've been hearing lately on the mailing lists, this
> > surprises me.  Do you have #'s to back it up?
> > 

> I am not responding with numbers, but if you look at it, it is likely true:
>
> 1) The PIIs have 512K cache, while the PPro has (normally) 256K cache.

The bigger cache runs at half speed though, which is a *huge* performance
hit.  (The PIIs do have a bigger L1 cache, which is a win; more on that
below.)

> Therefore bus utilization is likely less with the PII.  Even in the
> case of a 512K cache, the bus utilization is going to be nearly the
> same.

Not quite.  The PII has to 'spin' a lot more waiting for data since it
can't get to its L2 at full core speed, while the PPro can.  Going from
256K -> 512K doesn't double cache performance (I'd suspect an improvement
of somewhere around 15-20% at best), so I would think the two #'s would
be close to break-even.  If you get a 512K PPro it would be a big win.
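
To make the trade-off concrete, here is a rough back-of-the-envelope
sketch using the usual average-memory-access-time formula.  Every number
in it is an assumption picked for illustration (the L1 miss rates, the L2
latency in clocks, the L2 miss rates, the memory latency); none of it is
measured, and it models a single CPU only, so it says nothing about bus
contention between two processors.  The function name `amat_ns` and its
parameters are made up for this example.

```python
def amat_ns(core_mhz, l1_miss_rate, l2_cycles, l2_clock_divisor,
            l2_local_miss_rate, mem_ns, l1_cycles=1):
    """Average memory access time (ns) for a simple two-level cache model."""
    core_cycle_ns = 1000.0 / core_mhz
    l1_ns = l1_cycles * core_cycle_ns
    # L2 latency is counted in L2 clocks; a divisor of 2 means the L2
    # runs at half the core clock, so each L2 clock costs two core cycles.
    l2_ns = l2_cycles * l2_clock_divisor * core_cycle_ns
    return l1_ns + l1_miss_rate * (l2_ns + l2_local_miss_rate * mem_ns)

# Hypothetical inputs, NOT measurements.  The 0.20 -> 0.17 L2 miss rate
# encodes the "about 15% better for double the size" guess from the text.
ppro = amat_ns(core_mhz=200, l1_miss_rate=0.05, l2_cycles=4,
               l2_clock_divisor=1, l2_local_miss_rate=0.20, mem_ns=150)
pii = amat_ns(core_mhz=300, l1_miss_rate=0.04, l2_cycles=4,
              l2_clock_divisor=2, l2_local_miss_rate=0.17, mem_ns=150)

print(f"PPro 200, 256K full-speed L2: {ppro:.2f} ns per access")
print(f"PII 300, 512K half-speed L2:  {pii:.2f} ns per access")
```

The answer swings with the assumed miss rates and latencies, which is
exactly why real #'s matter here; plug in measured values before drawing
any conclusion from it.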

> In a DP system, bus utilization is likely less important than
> in 4-way systems anyway.

DP?  Distributed Processing?  SMP?  Help me out here.

> 2) Expect about 3-5% miss rate with an 8K or 16K 1st level cache.  (I
> have really measured it on real applications.)

Heck, let's use the #'s from Hennessy and Patterson, considered to be
'THE' hardware/cache reference in many folks' minds.  (The processor in
this case is one of the later VAX models, but its architecture is similar
enough for cache performance to be pretty close.)

Cache size vs. miss rate:

8K:  8 - 16%
16K: 7 - 11%
32K: 2 - 6%
64K: 1 - 3%

> Miss rate can be much lower than that though.  The miss rate does not
> scale linearly downward with 1st level cache size, but it does go down
> (especially with n-way associative cache schemes.)

This is for the L1 cache numbers, and the numbers given assume a data +
instruction combined cache.
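
As a quick sanity check on how the miss rate scales, the table above can
be reduced to one number per size.  Taking the midpoint of each range is
my own simplification for illustration; the source only gives ranges.

```python
# (cache size in KB, low %, high %) copied from the table above
miss_rates = [(8, 8, 16), (16, 7, 11), (32, 2, 6), (64, 1, 3)]

# Midpoint of each range: an assumption made only to get a single figure.
mids = [(size_kb, (low + high) / 2) for size_kb, low, high in miss_rates]

# Miss rate at each size relative to the next smaller size.
ratios = [mids[i][1] / mids[i - 1][1] for i in range(1, len(mids))]

for size_kb, mid in mids:
    print(f"{size_kb:>2}K: midpoint {mid:.1f}%")
for (size_kb, _), ratio in zip(mids[1:], ratios):
    print(f"doubling to {size_kb}K leaves {ratio:.2f}x the misses")
```

Each doubling removes a different fraction of the misses, so the scaling
is clearly not a straight line in cache size.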

> 3) Single processor PIIs at 300MHz are almost always (if not always)
> faster than a PPro at 200MHz running real code. PII MB's can support
> SDRAM now, and that really does help mitigate the aggregate
> performance loss due to the 1/2 speed 2nd level cache.

I'd like to see real #'s to back that up. 

> If you are talking about 233MHz PII processors vs. 200MHz PPro processors, it
> is harder to decide on which processor is faster, but I do think that the PII
> will win out on average.

We're talking about SMP support, not UP support.  For UP stuff, there's
no doubt that the high-clock PII chips will outperform a (relatively
speaking) low-clock PPro chip, but for SMP everything I've read and seen
tells me that the PPro *kills* the PII for SMP work, mostly due to the
L2 cache (and motherboard design??).


Nate