Date:      Tue, 13 Jul 1999 14:36:20 +1000
From:      Peter Jeremy <jeremyp@gsmx07.alcatel.com.au>
To:        dillon@apollo.backplane.com, mike@smith.net.au
Cc:        freebsd-current@FreeBSD.ORG
Subject:   Re: "objtrm" problem probably found (was Re: Stuck in "objtrm")
Message-ID:  <99Jul13.141832est.40326@border.alcanet.com.au>
In-Reply-To: <199907130238.TAA73524@apollo.backplane.com>

Matthew Dillon <dillon@apollo.backplane.com> wrote:
>    The change in code flow used to be the expensive piece, but not any
>    more.  You typically either see a branch prediction cache (Intel)
>    offering a best-case of 0-cycle latency, or a single-cycle latency 
>    that is slot-fillable (MIPS).

In the case of an indirect branch, you also need to fetch the
destination address from memory.  This is presumably 1 cycle (if it's
cached).  It may be possible to pre-fetch the address, but this
requires a substantial amount of silicon for the interlocks.
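
As a rough illustration of the distinction (a sketch only; the names
add_one, handler, call_direct and call_indirect are made up for this
example and are not from the kernel code under discussion), the direct
call below has its target fixed in the instruction stream, whereas the
indirect call has to load the target from memory before it can branch.
The volatile qualifier is there only to stop the compiler from folding
the pointer back into a direct call; a compiler may still inline the
direct call entirely.

```c
/* Hypothetical callee used by both call styles. */
static int add_one(int x)
{
    return x + 1;
}

/* Direct call: the destination is known from the instruction itself. */
static int call_direct(int x)
{
    return add_one(x);
}

/*
 * Indirect call: the destination lives in memory and must be fetched
 * before the branch can be taken.  'volatile' forces that load.
 */
static int (*volatile handler)(int) = add_one;

static int call_indirect(int x)
{
    return handler(x);
}
```

Both functions return the same value for the same argument; any
difference is in how the branch target is obtained.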

>    Since the jump portion of a subroutine call to a direct label is nothing
>    more than a deterministic branch, the branch prediction cache actually
>    operates in this case.  You do not quite get 0-cycle latency due to
>    the push/pop, and potential arguments, but it is very fast.

I'm not sure there's any reason why you couldn't get 0-cycle latency.
If you changed the semantics of a stack segment so that memory
addresses below the stack pointer were irrelevant, you could implement
a small, 0-cycle, on-chip stack (that overflowed into memory).  I
don't know whether this semantic change would be allowable (and
whether the associated silicon could be justified) for the IA-32.
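
To make the idea concrete, here is a toy software model of such a
stack.  It is only a sketch under my own assumptions: the structure,
the function names, the 4-entry on-chip size and the mem_accesses
counter are all invented for illustration, and it says nothing about
how real silicon would do it.  The newest entries stay "on chip";
pushing onto a full on-chip stack spills the oldest entry to backing
memory, and popping from an empty one refills from memory.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ONCHIP_SLOTS 4   /* assumed size of the fast on-chip stack */
#define SPILL_SLOTS  64  /* assumed size of the backing memory */

struct ret_stack {
    uint32_t onchip[ONCHIP_SLOTS]; /* oldest entry at index 0 */
    size_t   onchip_count;
    uint32_t spill[SPILL_SLOTS];   /* models the in-memory stack */
    size_t   spill_count;
    unsigned mem_accesses;         /* counts the slow memory operations */
};

/* Push an address; returns 0 on success, -1 if backing memory is full. */
static int rs_push(struct ret_stack *s, uint32_t addr)
{
    if (s->onchip_count == ONCHIP_SLOTS) {
        if (s->spill_count == SPILL_SLOTS)
            return -1;
        /* Overflow: move the oldest on-chip entry out to memory. */
        s->spill[s->spill_count++] = s->onchip[0];
        s->mem_accesses++;
        memmove(&s->onchip[0], &s->onchip[1],
                (ONCHIP_SLOTS - 1) * sizeof(s->onchip[0]));
        s->onchip_count--;
    }
    s->onchip[s->onchip_count++] = addr;
    return 0;
}

/* Pop an address; returns 0 on success, -1 if the stack is empty. */
static int rs_pop(struct ret_stack *s, uint32_t *addr)
{
    if (s->onchip_count == 0) {
        if (s->spill_count == 0)
            return -1;
        /* Underflow: refill one entry from memory. */
        s->onchip[0] = s->spill[--s->spill_count];
        s->onchip_count = 1;
        s->mem_accesses++;
    }
    *addr = s->onchip[--s->onchip_count];
    return 0;
}
```

In this model, call chains no deeper than ONCHIP_SLOTS never touch
memory at all; only deeper nesting pays for a spill on the way down
and a refill on the way back up.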

Peter

