From: Mikhail Teterin
Message-Id: <200301092255.h09MtfHY025902@corbulon.video-collage.com>
Subject: Re: cvs commit: src/sys/i386/i386 mp_machdep.c
In-Reply-To: <15901.64772.844070.407901@grasshopper.cs.duke.edu>
To: Andrew Gallatin
Date: Thu, 9 Jan 2003 17:55:41 -0500 (EST)
Cc: Mikhail Teterin, Alexander Leidinger, cvs-all@FreeBSD.ORG,
    cvs-committers@FreeBSD.ORG, marius@alchemy.franken.de

> Mikhail Teterin writes:
>  > You may wish to take a look at
>  >
>  >     http://www.FreeBSD.org/cgi/query-pr.cgi?pr=bin/43299
>  >
>  > Especially, the follow up to it, where using SSE2 appears to slow things
>  > down -- at least for double values.
>  >
>  >     -mi
>
> Strange.  The intel compiler is slower too.  But at least it gets the
> right answer, which is more than gcc can do (unless O0 is used)

As I note in my follow-up, gcc now gives the right answer too on my
system. I suspect that is thanks to the commit I quote there. How recent
is your system?

> icc -O3 -tpp7 -xW:  (P4)
>
> 2^2.1 is 4.28709
> 11^-2.1 is 0.00650243
>         5.77 real         5.68 user         0.02 sys
>
>
> icc -O3 -tpp6 -xK: (PIII)
>
> 2^2.1 is 4.28709
> 11^-2.1 is 0.00650243
>         5.38 real         5.13 user         0.00 sys
>
> gcc -O3 -march=pentium4
> 2^2.1 is 0.5
> 11^-2.1 is 0.0909091
>         0.63 real         0.62 user         0.00 sys

Yep, this lightning speed and incorrectness is what I was seeing when I
submitted the PR.

> gcc -O3 -march=pentium3
>
> 2^2.1 is 4.28709
> 11^-2.1 is 0.00650243
>         6.68 real         6.50 user         0.01 sys
>

I still build my system with CPUTYPE=p3, so I think my libs are OK. My
example in there explicitly avoids using -lm anyway :-)

    -mi
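
For reference, the expected values in the quoted runs can be reproduced
without any C compiler. The sketch below is not the test program from the
PR (which is not reproduced in this message); fmt_pow is a hypothetical
helper name, and Python's "%g" is assumed to match the six-significant-digit
formatting seen in the output above. It also prints x^-1 for both bases,
since those happen to coincide with the wrong numbers in the
-march=pentium4 run; that is only an arithmetic observation and says
nothing certain about the cause.

```python
def fmt_pow(base, exponent):
    """Format base**exponent with six significant digits (like printf "%g")."""
    return "%g^%g is %g" % (base, exponent, base ** exponent)


if __name__ == "__main__":
    # Values reported by the icc runs and by gcc -march=pentium3.
    print(fmt_pow(2.0, 2.1))
    print(fmt_pow(11.0, -2.1))

    # Reciprocals of the bases, for comparison with the
    # gcc -O3 -march=pentium4 output quoted above.
    print("%g" % (2.0 ** -1))
    print("%g" % (11.0 ** -1))
```

The first two printed lines should agree with the icc and pentium3 results;
the last two give 0.5 and 0.0909091.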