From owner-freebsd-hackers@FreeBSD.ORG Mon Feb 16 10:54:48 2004 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 37F1816A4CE for ; Mon, 16 Feb 2004 10:54:48 -0800 (PST) Received: from kientzle.com (h-66-166-149-50.SNVACAID.covad.net [66.166.149.50]) by mx1.FreeBSD.org (Postfix) with ESMTP id C558B43D1F for ; Mon, 16 Feb 2004 10:54:47 -0800 (PST) (envelope-from kientzle@acm.org) Received: from acm.org ([66.166.149.54]) by kientzle.com (8.12.9/8.12.9) with ESMTP id i1GIskkX043592; Mon, 16 Feb 2004 10:54:47 -0800 (PST) (envelope-from kientzle@acm.org) Message-ID: <403111F6.2040505@acm.org> Date: Mon, 16 Feb 2004 10:54:46 -0800 From: Tim Kientzle User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.4) Gecko/20031006 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Trent Nelson References: <20040215164231.GA54709@teleri.net> In-Reply-To: <20040215164231.GA54709@teleri.net> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit cc: freebsd-hackers@freebsd.org Subject: Re: Branch prediction X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list Reply-To: kientzle@acm.org List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 16 Feb 2004 18:54:48 -0000 Trent Nelson wrote: > For as long as I've been programming, I've always been under the > impression that CPUs will always predict a branch as being false > the first time they come across it. The state of the art has advanced considerably since then. > Many, many years ago, I came across a DEC programming guide that > said the same thing. It suggested using 'do { } while()' over > 'while() {}' for loops as this ensured that the loop block was > correctly pre-fetched (rather than whatever followed the loop) > the first time the branch was encountered. That must have been a long time ago. while() {} loops routinely get compiled as: bra condition_check looptop: <... body of loop ...> condition: <... test condition ...> bcc looptop: A 'do' loop differs only in omitting the initial forward branch. Since this is an unconditional forward branch, it's always correctly predicted. ;-) Thus, as a practical matter, there's no performance difference between do/while and while. > ... have either CPU architectures or compilers improved > over time such that this isn't an issue anymore? Are CPUs able > to intelligently predict a branch the first time they encounter > it? Do compilers re-order blocks to optimise pre-fetching and > minimise branch mis-prediction? Yes, compilers re-order blocks. I once spent a couple of weeks trying to hand-assemble some code to run faster than the compiler. I was only able to accomplish it if there were particular processor features (SIMD instructions or explicit prefetch) that the compiler was unaware of. For serial C code, the compilers are pretty smart these days. In particular, modern instruction sets allow the compiler to specify whether the branch should be default-taken or default-not-taken. Processors contain "branch history" that can detect and record certain common patterns (e.g., "branch is not taken every third time") to improve prediction. (This history is maintained as long as the branch remains in the instruction cache, and can be very effective for tight inner loops.) If there are no compiler hints and no branch history, I think Colin was right: default taken for backward branches (assume loops happen) default not taken for forward branches (assume if tests succeed). Practically speaking, I find it very rarely worth the trouble of trying to predict this sort of thing unless you're dealing with unusually performance-critical code. Tim Kientzle