From owner-freebsd-current  Mon Jul 12 22:01:48 1999
Delivered-To: freebsd-current@freebsd.org
Received: from apollo.backplane.com (apollo.backplane.com [209.157.86.2])
	by hub.freebsd.org (Postfix) with ESMTP id 2F03C1517D
	for <freebsd-current@FreeBSD.ORG>; Mon, 12 Jul 1999 22:01:45 -0700 (PDT)
	(envelope-from dillon@apollo.backplane.com)
Received: (from dillon@localhost)
	by apollo.backplane.com (8.9.3/8.9.1) id WAA74171;
	Mon, 12 Jul 1999 22:01:39 -0700 (PDT)
	(envelope-from dillon)
Date: Mon, 12 Jul 1999 22:01:39 -0700 (PDT)
From: Matthew Dillon <dillon@apollo.backplane.com>
Message-Id: <199907130501.WAA74171@apollo.backplane.com>
To: Peter Jeremy <jeremyp@gsmx07.alcatel.com.au>
Cc: mike@smith.net.au, freebsd-current@FreeBSD.ORG
Subject: Re: "objtrm" problem probably found (was Re: Stuck in "objtrm")
References:  <99Jul13.141832est.40326@border.alcanet.com.au>
Sender: owner-freebsd-current@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG


:
:I'm not sure there's any reason why you shouldn't.  If you changed the
:semantics of a stack segment so that memory addresses below the stack
:pointer were irrelevant, you could implement a small, 0-cycle, on-chip
:stack (that overflowed into memory).  I don't know whether this
:semantic change would be allowable (and whether the associated silicon
:could be justified) for the IA-32.
:
:Peter

    This would be relatively complex and would also result in cache coherency
    problems.  A solution already exists:  it's called branch-and-link,
    but Intel CPUs do not use it because they do not have enough
    registers (makes you just want to throw up -- all that MMX junk and they
    couldn't add a branch-and-link register!).  The key with branch-and-link
    is that the lowest subroutine level does not have to save/restore the
    link register, making entry and return two or three times faster than
    for subroutines that make other subroutine calls.
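
    To make the leaf-routine point concrete, here is a rough sketch (a toy
    model, not a description of any real CPU) that counts only the memory
    accesses spent on the return address.  The function names and the
    nested-list call tree are made up for illustration, and the model
    assumes a non-leaf routine spills the link register once and reloads
    it once.

```python
def return_address_memory_ops(is_leaf, convention):
    """Memory accesses spent on the return address for one call/return pair.

    convention "stack": call pushes the return address, ret pops it.
    convention "link":  call writes the return address into a link register.
    """
    if convention == "stack":
        # One store on call and one load on return, leaf or not.
        return 2
    if convention == "link":
        # A leaf routine returns straight through the link register.
        # A non-leaf routine must save it before its own calls and
        # restore it before returning (assumed: one store, one load).
        return 0 if is_leaf else 2
    raise ValueError("unknown convention: " + convention)


def total_memory_ops(call_tree, convention):
    """call_tree is a list of child call trees; an empty list is a leaf."""
    is_leaf = len(call_tree) == 0
    ops = return_address_memory_ops(is_leaf, convention)
    return ops + sum(total_memory_ops(child, convention) for child in call_tree)


# A routine that calls two leaves and one routine that itself calls a leaf.
example = [[], [], [[]]]
print(total_memory_ops(example, "stack"))  # 10
print(total_memory_ops(example, "link"))   # 4
```

    This counts memory operations, not cycles, so it says nothing about
    the actual speedup; it only shows that the savings come entirely from
    the leaf routines.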

    The big problem with implementing complex caches is that they take up
    a serious amount of die space and power.  Most modern CPUs revolve
    almost entirely around their L1 cache and their register file.  The
    remaining caches tend to be ad-hoc.  Intel's branch prediction cache
    is like this.

    In order for a memory-prediction cache to be useful, it really needs
    to be cache-coherent, which basically kills the idea of having a separate
    little special case for the stack.  Only the L1 cache is coherent.  If
    you wanted, you could implement multiple L1 data caches on-chip -- that
    might be of some benefit, but otherwise branch-and-link is the better
    way to do it.
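
    The coherency problem can be sketched with a toy model.  Everything
    here is hypothetical: the class, its methods, and the write-back
    "stack cache" are invented for illustration and do not correspond to
    real hardware.  The point is only that a store made through a separate
    stack path is invisible to an ordinary load of the same address unless
    every ordinary load also checks the stack store.

```python
class ToyMemorySystem:
    """Main memory plus a hypothetical write-back on-chip stack store."""

    def __init__(self, coherent):
        self.memory = {}        # backing memory, address -> value
        self.stack_cache = {}   # hypothetical on-chip stack store
        self.coherent = coherent

    def stack_store(self, addr, value):
        # A push-style store lands only in the on-chip stack store.
        self.stack_cache[addr] = value

    def ordinary_load(self, addr):
        # An ordinary load, e.g. through a pointer to a stack variable.
        if self.coherent and addr in self.stack_cache:
            # Coherent design: every ordinary load must also check the
            # stack store, which is the extra hardware being paid for.
            return self.stack_cache[addr]
        # Incoherent design: the load sees whatever memory last held.
        return self.memory.get(addr, 0)


incoherent = ToyMemorySystem(coherent=False)
incoherent.stack_store(0x1000, 42)
print(incoherent.ordinary_load(0x1000))  # 0, a stale value

coherent = ToyMemorySystem(coherent=True)
coherent.stack_store(0x1000, 42)
print(coherent.ordinary_load(0x1000))    # 42
```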

					-Matt
					Matthew Dillon 
					<dillon@backplane.com>

