From owner-freebsd-hackers@FreeBSD.ORG Sun Feb 15 19:51:30 2004
From: Wes Peters
To: des@des.no (Dag-Erling Smørgrav), Alexandr Kovalenko
Cc: freebsd-hackers@freebsd.org, Juan Tumani
Date: Mon, 16 Feb 2004 03:52:16 -0800
Subject: Re: FreeBSD 5.2 v/s FreeBSD 4.9 MFLOPS performance (gcc3.3.3 v/s gcc2.9.5)
Message-Id: <200402160352.16477.wes@softweyr.com>
References: <20040214082420.GB77411@nevermind.kiev.ua>

On Sunday 15 February 2004 12:46, Dag-Erling Smørgrav wrote:
> Alexandr Kovalenko writes:
> > Could you please explain me this? Result is fully reproduceable. Please
> > note, that the only difference is the output file name. Even resulting
> > files match bit-to-bit. [...]
>
> Definitely some kind of alignment problem, but it only shows up at
> some optimization levels and not others.

I've tested the patch Dan mentioned before and the results were astonishing.

Running the flops.c 1.2 program in a loop, lengthening the environment string
by one byte each time, I get 8 successive runs of fast, then 8 successive
runs of slow, where fast and slow vary between 650 and 990 mflops.  With the
patch, the performance is always 990, within a few percent.

Should I commit this?

RCS file: /big/ncvs/src/sys/kern/kern_exec.c,v
retrieving revision 1.235
diff -u -w -r1.235 kern_exec.c
--- kern_exec.c	28 Dec 2003 04:37:59 -0000	1.235
+++ kern_exec.c	11 Feb 2004 16:47:28 -0000
@@ -1014,6 +1014,15 @@
 	 */
 	vectp = (char **)(destp - (imgp->argc + imgp->envc + 2) *
 	    sizeof(char *));
+
+	/*
+	 * Align stack to a multiple of 0x20.
+	 * XXX vectp has the wrong type; we usually want a vm_offset_t;
+	 * the suword() family takes a void *, but should take a vm_offset_t.
+	 * XXX should align stack for signals too.
+	 * XXX should do this more machine/compiler-independently.
+	 */
+	vectp = (char **)(((vm_offset_t)vectp & ~(vm_offset_t)0x1F) - 4);
 
 	/*
 	 * vectp also becomes our initial stack base

-- 
"Where am I, and what am I doing in this handbasket?"
Wes Peters                                              Softweyr LLC
wes@softweyr.com                                      http://softweyr.com/
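
For anyone who wants to poke at the alignment arithmetic from the patch
outside the kernel, here is a rough user-space sketch. It is not the kernel
code: align_stack() and check_range() are hypothetical names, uintptr_t
stands in for vm_offset_t, and the addresses used with it are made up.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical user-space stand-in for the patched expression:
 * round addr down to a multiple of 0x20, then subtract 4.
 * uintptr_t is used in place of the kernel's vm_offset_t.
 */
static uintptr_t
align_stack(uintptr_t addr)
{
	return ((addr & ~(uintptr_t)0x1F) - 4);
}

/*
 * Hypothetical check: walk count byte offsets down from top, the way
 * growing the environment by one byte at a time would, and assert that
 * every result sits exactly 4 bytes below a 0x20 boundary.
 */
static void
check_range(uintptr_t top, unsigned count)
{
	unsigned i;

	for (i = 0; i < count; i++)
		assert((align_stack(top - i) + 4) % 0x20 == 0);
}
```

Every input from 0x1000 through 0x101F maps to the same result, so under
these assumptions the computed address no longer moves with each byte added
to the environment; it only steps when a 0x20 boundary is crossed.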