From: Marcel Moolenaar
Date: Fri, 1 Jul 2011 10:16:53 -0700
To: Roman Divacky
Cc: svn-src-projects@freebsd.org, Marcel Moolenaar, src-committers@freebsd.org
Subject: Re: svn commit: r223705 - projects/llvm-ia64/lib/clang/libllvmjit
In-Reply-To: <20110701165151.GA6877@freebsd.org>
References: <201107010329.p613Tn8s071270@svn.freebsd.org>
 <20110701084224.GA43291@freebsd.org>
 <00211D6B-F882-43C1-9D93-5ED2D72C5132@xcllnt.net>
 <20110701165151.GA6877@freebsd.org>

On Jul 1, 2011, at 9:51 AM, Roman Divacky wrote:

>> The following open items are on my mind:
>>
>> 1.  On ia64, function prologues allocate a register frame that has
>>     enough (stacked) registers for local scratch registers and
>>     outgoing function arguments.
>>     This means that I need to know
>>     (after register allocation) how many (unique) scratch registers
>>     are in use and what the largest number of arguments that need
>>     to be passed in registers to children (the max being 8). Without
>>     this information the compiler is forced to allocate the maximum
>>     size (which is 96 stacked registers, of which 8 are outgoing).
>>     This obviously eats into the register stack and probably causes
>>     runtime failures on deep call chains.
>
> I recommend you to do this little experiment (on amd64 or so):

*snip*

> # Machine code for function foo:
> Frame Objects:
>   fi#0: size=4, align=4, at location [SP+8]
> Function Live Ins: %EDI in %vreg0
>
> I believe this is what you asked.

*snip*

I'm not sure we're in sync.

The general registers on ia64 are split in 2:
1.  r0-r31      static registers
2.  r32-r127    stacked registers

The purpose of the stacked registers is to optimize function calls
by having the CPU manage a rotating register file and an engine
that flushes "dirty" registers to memory. All a function has to do
is tell the CPU how many stacked registers it wants (max 96) and the
CPU will handle all the pushing and popping on function call entry
and exit, so to speak.

Before register allocation one can assume the max: 96 registers, of
which 8 are for argument passing. This gives 88 preserved (non-scratch)
registers. Emitting code where every function allocates the max is
really bad, so after register allocation you want to adjust the alloc
to the actual number of registers used by the function.

A second- or third-order concern is analyzing how the above behaves
in practice. If the register allocator is really wasteful, the above
yields register frames that are generally too big. In that case more
back pressure on the allocator is needed to get better code.

Anyway: I don't know yet how to get the actual number of (stacked)
registers used for locals and outgoing arguments, so that's an open item.
It probably means adding a function pass that runs after register
allocation to scan the function and then adjust the prologue code.

FYI,

-- 
Marcel Moolenaar
marcel@xcllnt.net
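
For illustration only, the post-allocation scan described above could look roughly like the following Python sketch. It is not LLVM code and not the approach taken in the llvm-ia64 branch: `Instr`, `local_regs`, `call_args` and `register_frame` are hypothetical names, and the sketch assumes that each instruction can report which general registers it uses for locals and, for calls, how many arguments it passes in registers. It counts unique stacked registers, as the message suggests, and assumes a later step would renumber them into a contiguous frame.

```python
from dataclasses import dataclass, field

FIRST_STACKED = 32   # r0-r31 are static, r32-r127 are stacked
MAX_STACKED = 96     # largest register frame a function can request
MAX_OUT_ARGS = 8     # at most 8 arguments are passed in registers


@dataclass
class Instr:
    """Hypothetical stand-in for a machine instruction after register allocation."""
    local_regs: set = field(default_factory=set)  # general registers used for locals
    call_args: int = 0                            # register arguments, if this is a call


def register_frame(instrs):
    """Return (locals, outputs): the stacked registers the function really needs."""
    used = set()
    outputs = 0
    for ins in instrs:
        # Only stacked registers count towards the frame; static ones are ignored.
        used |= {r for r in ins.local_regs if r >= FIRST_STACKED}
        # The widest call decides how many outgoing registers are needed.
        outputs = max(outputs, min(ins.call_args, MAX_OUT_ARGS))
    n_locals = len(used)
    if n_locals + outputs > MAX_STACKED:
        raise ValueError("register frame exceeds 96 stacked registers")
    return n_locals, outputs


if __name__ == "__main__":
    body = [
        Instr(local_regs={1, 32, 33}),
        Instr(local_regs={33, 35}, call_args=3),
    ]
    print(register_frame(body))
```

With a result like this, the prologue could request only the locals and outputs actually needed instead of the worst case of 88 plus 8.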