Date: Tue, 13 Jul 1999 14:36:20 +1000
From: Peter Jeremy <jeremyp@gsmx07.alcatel.com.au>
To: dillon@apollo.backplane.com, mike@smith.net.au
Cc: freebsd-current@FreeBSD.ORG
Subject: Re: "objtrm" problem probably found (was Re: Stuck in "objtrm")
Message-ID: <99Jul13.141832est.40326@border.alcanet.com.au>
In-Reply-To: <199907130238.TAA73524@apollo.backplane.com>

Matthew Dillon <dillon@apollo.backplane.com> wrote:
> The change in code flow used to be the expensive piece, but not any
> more. You typically either see a branch prediction cache (Intel)
> offering a best-case of 0-cycle latency, or a single-cycle latency
> that is slot-fillable (MIPS).

In the case of an indirect branch, you also need to fetch the
destination address from memory. This is presumably 1 cycle (if it's
cached). It may be possible to pre-fetch the address, but this requires
a substantial amount of silicon for the interlocks.

> Since the jump portion of a subroutine call to a direct label is nothing
> more then a deterministic branch, the branch prediction cache actually
> operates in this case. You do not quite get 0-cycle latency due to
> the push/pop, and potential arguments, but it is very fast.

I'm not sure there's any reason why you shouldn't. If you changed the
semantics of a stack segment so that memory addresses below the stack
pointer were irrelevant, you could implement a small, 0-cycle, on-chip
stack (that overflowed into memory). I don't know whether this semantic
change would be allowable (and whether the associated silicon could be
justified) for the IA-32.

Peter

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-current" in the body of the message
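
The on-chip stack idea in the reply above (a small stack that overflows into memory) can be sketched in software. This is only a behavioural model under stated assumptions, not a description of any real IA-32 implementation: the names `stack_cache`, `sc_push`, `sc_pop`, the 4-slot capacity and the `mem_accesses` counter are all hypothetical, and nothing here models cycle timing. The sketch just shows that pushes and pops touch "memory" only when the on-chip buffer overflows or runs empty, which is why memory beyond the stack pointer would have to be treated as irrelevant.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ONCHIP_SLOTS  4   /* hypothetical on-chip capacity */
#define BACKING_SLOTS 64  /* stand-in for the in-memory stack */

struct stack_cache {
    uint32_t onchip[ONCHIP_SLOTS];   /* ring buffer holding the newest entries */
    size_t   head;                   /* index of the oldest on-chip entry */
    size_t   count;                  /* number of entries currently on chip */
    uint32_t backing[BACKING_SLOTS]; /* older entries spilled to "memory" */
    size_t   spilled;                /* number of entries in backing[] */
    unsigned mem_accesses;           /* spills plus refills (illustrative) */
};

static void sc_init(struct stack_cache *sc)
{
    sc->head = 0;
    sc->count = 0;
    sc->spilled = 0;
    sc->mem_accesses = 0;
}

static void sc_push(struct stack_cache *sc, uint32_t v)
{
    if (sc->count == ONCHIP_SLOTS) {
        /* On-chip stack is full: spill the oldest entry to memory. */
        assert(sc->spilled < BACKING_SLOTS);
        sc->backing[sc->spilled++] = sc->onchip[sc->head];
        sc->head = (sc->head + 1) % ONCHIP_SLOTS;
        sc->count--;
        sc->mem_accesses++;
    }
    /* The new entry stays on chip; memory is not written here. */
    sc->onchip[(sc->head + sc->count) % ONCHIP_SLOTS] = v;
    sc->count++;
}

static uint32_t sc_pop(struct stack_cache *sc)
{
    if (sc->count == 0) {
        /* Nothing on chip: fetch the newest spilled entry from memory. */
        assert(sc->spilled > 0);
        sc->mem_accesses++;
        return sc->backing[--sc->spilled];
    }
    sc->count--;
    return sc->onchip[(sc->head + sc->count) % ONCHIP_SLOTS];
}
```

In this model, a call chain no deeper than `ONCHIP_SLOTS` never touches the backing array at all. Whether real hardware could do this depends on the semantic change the message describes, since code that reads stack memory directly would see stale data for entries still held on chip.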