From owner-freebsd-current Mon Jul 12 22: 1:48 1999
Delivered-To: freebsd-current@freebsd.org
Received: from apollo.backplane.com (apollo.backplane.com [209.157.86.2]) by hub.freebsd.org (Postfix) with ESMTP id 2F03C1517D for ; Mon, 12 Jul 1999 22:01:45 -0700 (PDT) (envelope-from dillon@apollo.backplane.com)
Received: (from dillon@localhost) by apollo.backplane.com (8.9.3/8.9.1) id WAA74171; Mon, 12 Jul 1999 22:01:39 -0700 (PDT) (envelope-from dillon)
Date: Mon, 12 Jul 1999 22:01:39 -0700 (PDT)
From: Matthew Dillon
Message-Id: <199907130501.WAA74171@apollo.backplane.com>
To: Peter Jeremy
Cc: mike@smith.net.au, freebsd-current@FreeBSD.ORG
Subject: Re: "objtrm" problem probably found (was Re: Stuck in "objtrm")
References: <99Jul13.141832est.40326@border.alcanet.com.au>
Sender: owner-freebsd-current@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

:
:I'm not sure there's any reason why you shouldn't. If you changed the
:semantics of a stack segment so that memory addresses below the stack
:pointer were irrelevant, you could implement a small, 0-cycle, on-chip
:stack (that overflowed into memory). I don't know whether this
:semantic change would be allowable (and whether the associated silicon
:could be justified) for the IA-32.
:
:Peter

This would be relatively complex and would also result in cache
coherency problems.

A solution already exists: It's called branch-and-link, but Intel CPUs
do not use it because Intel CPUs do not have enough registers (makes
you just want to throw up -- all that MMX junk and they couldn't add a
branch-and-link register!). The key with branch-and-link is that the
lowest subroutine level does not have to save/restore the register,
making entry and return two or three times faster than for subroutines
that make other subroutine calls.

The big problem with implementing complex caches is that they take up
a serious amount of die space and power. Most modern CPUs revolve
almost entirely around their L1 cache and their register file.
The remaining caches tend to be ad-hoc. Intel's branch prediction
cache is like this.

In order for a memory-prediction cache to be useful, it really needs
to be cache-coherent, which basically kills the idea of having a
separate little special case for the stack. Only the L1 cache is
coherent. If you wanted you could implement multiple L1 data caches
on-chip - that might be of some benefit, but otherwise branch-and-link
is the better way to do it.

-Matt
Matthew Dillon
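
To make the branch-and-link point concrete, here is a rough sketch of a toy cost model. It counts only the memory accesses spent on return addresses and is not a model of any real CPU. The function names and the nested-list call tree are made up for illustration. Assumptions: under a stack-based CALL/RET convention every call pushes and later pops its return address (2 accesses), while under branch-and-link the return address sits in a link register, so only a routine that itself makes calls has to save and restore it (2 accesses), and a leaf routine touches memory 0 times.

```python
# Toy cost model (hypothetical, for illustration only).
# A routine is represented as the list of routines it calls; a leaf is [].

def stack_convention_accesses(routine):
    """CALL pushes the return address and RET pops it: 2 accesses per call."""
    return 2 + sum(stack_convention_accesses(callee) for callee in routine)

def link_convention_accesses(routine):
    """Branch-and-link: a leaf keeps the return address in the link register.

    Only a routine that makes further calls must save the link register on
    entry and restore it before returning (assumed 2 accesses).
    """
    if not routine:          # leaf routine: no save/restore needed
        return 0
    return 2 + sum(link_convention_accesses(callee) for callee in routine)

# Example call tree: root calls two leaves and one routine that calls two leaves.
tree = [[], [], [[], []]]

print("stack convention:", stack_convention_accesses(tree))
print("link convention: ", link_convention_accesses(tree))
```

The saving in this model comes entirely from the leaf routines, which is the case described above. How that translates into cycles depends on the actual hardware and is not something this sketch tries to estimate.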