From owner-freebsd-performance@FreeBSD.ORG Thu Mar 10 21:33:40 2011 Return-Path: Delivered-To: freebsd-performance@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 91405106564A; Thu, 10 Mar 2011 21:33:40 +0000 (UTC) (envelope-from mm@FreeBSD.org) Received: from mail.vx.sk (mail.vx.sk [IPv6:2a01:4f8:100:1043::3]) by mx1.freebsd.org (Postfix) with ESMTP id 297FA8FC1B; Thu, 10 Mar 2011 21:33:40 +0000 (UTC) Received: from core.vx.sk (localhost [127.0.0.1]) by mail.vx.sk (Postfix) with ESMTP id 5564013AC60; Thu, 10 Mar 2011 22:33:39 +0100 (CET) X-Virus-Scanned: amavisd-new at mail.vx.sk Received: from mail.vx.sk ([127.0.0.1]) by core.vx.sk (mail.vx.sk [127.0.0.1]) (amavisd-new, port 10024) with LMTP id 72FWt1ymc4Ci; Thu, 10 Mar 2011 22:33:37 +0100 (CET) Received: from [10.9.8.1] (chello085216231078.chello.sk [85.216.231.78]) by mail.vx.sk (Postfix) with ESMTPSA id 0109113AC58; Thu, 10 Mar 2011 22:33:36 +0100 (CET) Message-ID: <4D7943B1.1030604@FreeBSD.org> Date: Thu, 10 Mar 2011 22:33:37 +0100 From: Martin Matuska User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; sk; rv:1.8.1.23) Gecko/20090812 Lightning/0.9 Thunderbird/2.0.0.23 Mnenhy/0.7.5.0 MIME-Version: 1.0 To: freebsd-current@FreeBSD.org, freebsd-performance@FreeBSD.org X-Enigmail-Version: 1.1.2 Content-Type: text/plain; charset=windows-1250 Content-Transfer-Encoding: 7bit Cc: Subject: FreeBSD Compiler Benchmark: gcc-base vs. gcc-ports vs. clang X-BeenThere: freebsd-performance@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Performance/tuning List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 10 Mar 2011 21:33:40 -0000 Hi everyone, we have performed a benchmark of the perl binary compiled with base gcc, ports gcc and ports clang using the perlbench benchmark suite. 
Our benchmark was performed solely on amd64 with 10 different processors and we have tried different -march= flags to compare binary performance of the same compiler with different flags.

Here are some statistics from the results:
- clang falls 10% behind the base gcc 4.2.1 (test average)
- gcc 4.5 from ports gives 5-10% better average performance than the base gcc 4.2.1
- 4% average penalty for Intel Atom and -march=nocona (using gcc from base)
- core i7 class processors run best with -march=nocona (using gcc from base)

This benchmark speaks only for perl, but it tests quite a lot of "generic" features, so we are seriously considering using ports gcc for heavily used ports (e.g. PHP, MySQL, PostgreSQL) and suggesting that a user should be provided with an easily settable choice of using gcc 4.5 for ports.

A first step in this direction is in this PR (allowing build-only dependency on GCC):
http://www.freebsd.org/cgi/query-pr.cgi?pr=ports/155408

More information, detailed test results and test configuration are at our blog:
http://blog.vx.sk/archives/25-FreeBSD-Compiler-Benchmark-gcc-base-vs-gcc-ports-vs-clang.html

From owner-freebsd-performance@FreeBSD.ORG Fri Mar 11 02:37:49 2011 Return-Path: Delivered-To: freebsd-performance@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 3134C106566B for ; Fri, 11 Mar 2011 02:37:49 +0000 (UTC) (envelope-from giffunip@tutopia.com) Received: from nm22-vm0.bullet.mail.sp2.yahoo.com (nm22-vm0.bullet.mail.sp2.yahoo.com [98.139.91.222]) by mx1.freebsd.org (Postfix) with SMTP id 0DB508FC14 for ; Fri, 11 Mar 2011 02:37:48 +0000 (UTC) Received: from [98.139.91.65] by nm22.bullet.mail.sp2.yahoo.com with NNFMP; 11 Mar 2011 02:25:08 -0000 Received: from [98.139.91.27] by tm5.bullet.mail.sp2.yahoo.com with NNFMP; 11 Mar 2011 02:25:08 -0000 Received: from [127.0.0.1] by omp1027.mail.sp2.yahoo.com with NNFMP; 11 Mar 2011 02:25:08 -0000 X-Yahoo-Newman-Property:
ymail-3 X-Yahoo-Newman-Id: 681259.34540.bm@omp1027.mail.sp2.yahoo.com Received: (qmail 23042 invoked by uid 60001); 11 Mar 2011 02:25:08 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yahoo.com; s=s1024; t=1299810308; bh=Ze0AYOTyZ3R8uO6kOkk3mqFyeLfy67I85pr4rc8nezc=; h=Message-ID:X-YMail-OSG:Received:X-RocketYMMF:X-Mailer:Date:From:Reply-To:Subject:To:MIME-Version:Content-Type; b=1md/vzAuApYWNl2ih6ltrJ6znnyw6DVBY6tJI3WmaDaHNdWPxUPbvkRrAtcQqXruZJ009LnwPTYstFihRoYggIrHDTvDFYJqLO8XoRB0YqjK7b0uWshXAhIxoSy3c4llhEnSiBwfhb8CJ9B0ih4PIKnishJhJvz3HK3gyAU8W2Y= Message-ID: <298963.19624.qm@web113507.mail.gq1.yahoo.com> X-YMail-OSG: BPstVgwVM1l14xdxIcUg2vi3FFfE.FlPxYUmgmCP4f_CMD9 mZh30J_T9_j1KJwXKJuk6htJNkirj8X5B.FA7Q3IeHApbQbgdWqY5OE3vWgz mOuE.bFcLO9ALwOb0sOSEJT6mCcfru64qG5w4yDLq.BqSJCrqz.5LVdpV6Re 84ih1c4EwPEcMP6DkDMt2sDPKoOIRueew4WBMx4boW9mvuEMAYnyh6Ku7Enm mkxeaY.VMHp4_Mb_0eTdd5AQLk4yNEfdDp2_qYthYe7qVDcvCRfWP3xhoeet rPA1Ui8wZN9GVriKTqayaXQW5H2xk4ZNv2Mex3KSBNgw0chgo8h6svOox6PC v82I.uJQuhoM_yRVwJWQyMeYJ9B2CLbyTS7LH1.iTYdq7SNRhnFP7kwmcFMt TkXUGqlkPDeyN4g-- Received: from [190.157.140.248] by web113507.mail.gq1.yahoo.com via HTTP; Thu, 10 Mar 2011 18:25:08 PST X-RocketYMMF: giffunip X-Mailer: YahooMailClassic/11.4.20 YahooMailWebService/0.8.109.295617 Date: Thu, 10 Mar 2011 18:25:08 -0800 (PST) From: "Pedro F. Giffuni" To: freebsd-performance@FreeBSD.org MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Mailman-Approved-At: Fri, 11 Mar 2011 02:41:59 +0000 Cc: Subject: Re: FreeBSD Compiler Benchmark: gcc-base vs. gcc-ports vs. clang X-BeenThere: freebsd-performance@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: giffunip@tutopia.com List-Id: Performance/tuning List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 11 Mar 2011 02:37:49 -0000 FWIW .. I think the phoronix benchmarks got similar results. 
http://www.phoronix.com/scan.php?page=article&item=gcc_llvm_clang&num=1 http://www.phoronix.com/scan.php?page=article&item=llvm_gcc_atom&num=1 IMHO, 10% is not a huge performance difference though. Pedro. From owner-freebsd-performance@FreeBSD.ORG Fri Mar 11 11:51:20 2011 Return-Path: Delivered-To: freebsd-performance@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 0CE97106566B for ; Fri, 11 Mar 2011 11:51:20 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from mail.zoral.com.ua (mx0.zoral.com.ua [91.193.166.200]) by mx1.freebsd.org (Postfix) with ESMTP id 92AB48FC18 for ; Fri, 11 Mar 2011 11:51:19 +0000 (UTC) Received: from deviant.kiev.zoral.com.ua (root@deviant.kiev.zoral.com.ua [10.1.1.148]) by mail.zoral.com.ua (8.14.2/8.14.2) with ESMTP id p2BB6lJr040061 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Fri, 11 Mar 2011 13:06:47 +0200 (EET) (envelope-from kostikbel@gmail.com) Received: from deviant.kiev.zoral.com.ua (kostik@localhost [127.0.0.1]) by deviant.kiev.zoral.com.ua (8.14.4/8.14.4) with ESMTP id p2BB6l6E023880; Fri, 11 Mar 2011 13:06:47 +0200 (EET) (envelope-from kostikbel@gmail.com) Received: (from kostik@localhost) by deviant.kiev.zoral.com.ua (8.14.4/8.14.4/Submit) id p2BB6lfA023879; Fri, 11 Mar 2011 13:06:47 +0200 (EET) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: deviant.kiev.zoral.com.ua: kostik set sender to kostikbel@gmail.com using -f Date: Fri, 11 Mar 2011 13:06:47 +0200 From: Kostik Belousov To: Martin Matuska Message-ID: <20110311110647.GN78089@deviant.kiev.zoral.com.ua> References: <4D7943B1.1030604@FreeBSD.org> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="7UNBg6+RNQqlggSk" Content-Disposition: inline In-Reply-To: <4D7943B1.1030604@FreeBSD.org> User-Agent: Mutt/1.4.2.3i X-Spam-Status: No, score=-3.4 required=5.0 tests=ALL_TRUSTED,AWL,BAYES_00, 
DNS_FROM_OPENWHOIS autolearn=no version=3.2.5 X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on skuns.kiev.zoral.com.ua Cc: freebsd-performance@freebsd.org, freebsd-current@freebsd.org Subject: Re: FreeBSD Compiler Benchmark: gcc-base vs. gcc-ports vs. clang X-BeenThere: freebsd-performance@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Performance/tuning List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 11 Mar 2011 11:51:20 -0000 --7UNBg6+RNQqlggSk Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable

On Thu, Mar 10, 2011 at 10:33:37PM +0100, Martin Matuska wrote:
> Hi everyone,
>
> we have performed a benchmark of the perl binary compiled with base gcc,
> ports gcc and ports clang using the perlbench benchmark suite.
> Our benchmark was performed solely on amd64 with 10 different processors
> and we have tried different -march= flags to compare binary performance
> of the same compiler with different flags.
>
> Here is some statistics from the results:
> - clang falls 10% behind the base gcc 4.2.1 (test average)
> - gcc 4.5 from ports gives 5-10% better average performance than the
> base gcc 4.2.1
> - 4% average penalty for Intel Atom and -march=nocona (using gcc from base)
> - core i7 class processors run best with -march=nocona (using gcc from base)
>
> This benchmark speaks only for perl, but it tests quite a lot of
> "generic" features so we a are seriously considering using ports gcc for
> heavily used ports (e.g. PHP, MySQL, PostgreSQL) and suggesting that an
> user should be provided with a easily settable choice of using gcc 4.5
> for ports.
>
> A first step in this direction is in this PR (allowing build-only
> dependency on GCC):
> http://www.freebsd.org/cgi/query-pr.cgi?pr=ports/155408
>
> More information, detailed test results and test configuration are at
> our blog:
> http://blog.vx.sk/archives/25-FreeBSD-Compiler-Benchmark-gcc-base-vs-gcc-ports-vs-clang.html

Putting the 'speed' question completely aside, I would like to comment on other issue(s) here.

Switching the ports to use the port-provided compiler (and binutils) would be a very useful and often talked-about feature. Your approach of USE_GCC_BUILD, as implemented, is probably not going to work. The problem is that gcc provides two libraries, libgcc and libstdc++, that are not forward-compatible with the same libraries from older compilers and our base. libstdc++ has definitely grown new symbols and new versions of old symbols, and I suspect that libgcc has done the same. Also, we are trusting the ABI stability premise.

For this scheme to work, we need at least a gcc-runtime port with the DSOs provided by the full port, and some mechanism to force binaries compiled with the ports gcc to use the gcc-runtime libs instead of the base ones. Possibly the -R linker kludge.
--7UNBg6+RNQqlggSk Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (FreeBSD) iEYEARECAAYFAk16AkcACgkQC3+MBN1Mb4iIRwCeIT06hU87Qh6XusOKxwZIcFn1 XaAAoKJYekOQhkw7GDStE8a5cqsLfsd3 =vWaL -----END PGP SIGNATURE----- --7UNBg6+RNQqlggSk-- From owner-freebsd-performance@FreeBSD.ORG Fri Mar 11 14:16:41 2011 Return-Path: Delivered-To: freebsd-performance@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 21129106566B; Fri, 11 Mar 2011 14:16:41 +0000 (UTC) (envelope-from phk@critter.freebsd.dk) Received: from phk.freebsd.dk (phk.freebsd.dk [130.225.244.222]) by mx1.freebsd.org (Postfix) with ESMTP id D73E88FC16; Fri, 11 Mar 2011 14:16:40 +0000 (UTC) Received: from critter.freebsd.dk (critter.freebsd.dk [192.168.61.3]) by phk.freebsd.dk (Postfix) with ESMTP id 811335E2D; Fri, 11 Mar 2011 14:01:36 +0000 (UTC) Received: from critter.freebsd.dk (localhost [127.0.0.1]) by critter.freebsd.dk (8.14.4/8.14.4) with ESMTP id p2BE1aXI090326; Fri, 11 Mar 2011 14:01:36 GMT (envelope-from phk@critter.freebsd.dk) To: Martin Matuska From: "Poul-Henning Kamp" In-Reply-To: Your message of "Thu, 10 Mar 2011 22:33:37 +0100." <4D7943B1.1030604@FreeBSD.org> Date: Fri, 11 Mar 2011 14:01:36 +0000 Message-ID: <90325.1299852096@critter.freebsd.dk> Sender: phk@critter.freebsd.dk Cc: freebsd-performance@FreeBSD.org, freebsd-current@FreeBSD.org Subject: Re: FreeBSD Compiler Benchmark: gcc-base vs. gcc-ports vs. 
clang X-BeenThere: freebsd-performance@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Performance/tuning List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 11 Mar 2011 14:16:41 -0000

In message <4D7943B1.1030604@FreeBSD.org>, Martin Matuska writes:
>More information, detailed test results and test configuration are at
>our blog:
>http://blog.vx.sk/archives/25-FreeBSD-Compiler-Benchmark-gcc-base-vs-gcc-ports-vs-clang.html

Please don't take this personally Martin, but you have triggered my periodic rant about proper running, evaluation and reporting of benchmarks.

These results are not published at a level of detail that allows anybody to draw any kind of conclusions from them.

In particular, your use of "overall best" result selection is totally bogus from a statistical point of view.

At the very least, we need to see standard deviations on your numbers, and preferably, when you claim that "X is N% better than Y", you should also provide the confidence interval on that judgment, "Student's T" being the canonical test.

The ministat(1) program does both of these things, and is now in FreeBSD/src, so there is absolutely no excuse for not using it.

In practice this means that you have to run each test at least three times, to get a standard deviation, and you have to make sure that your test conditions are as identical as possible.

Therefore, proper benchmarking procedure is something like:

	(boot machine single-user	// Improves reproducibility)
	(mount md(4)/malloc filesystem	// ditto)
	(newfs test-partition		// ditto)
	for at least 4 iterations:
		run test A
		run test B
		run test C
		...
	Throw first result away for all tests
	Run remaining results through ministat(1)

This was a public service announcement.
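The loop in the procedure above could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the helper names (time_command, collect_samples, summarize, write_ministat_input) are invented for this sketch, wall-clock time stands in for whatever score the real benchmark reports, and the single-user boot and fresh md(4) filesystem are assumed to have been set up beforehand.

```python
import statistics
import subprocess
import time


def time_command(argv):
    """Return the wall-clock seconds taken by one run of argv."""
    start = time.perf_counter()
    subprocess.run(argv, check=True, stdout=subprocess.DEVNULL)
    return time.perf_counter() - start


def collect_samples(tests, iterations=4, timer=time_command):
    """Run every test once per iteration (A, B, C, A, B, C, ...).

    tests maps a label to an argv list. The first iteration is treated
    as a warm-up and thrown away, so at least 4 iterations are needed
    to keep three usable samples per test.
    """
    if iterations < 4:
        raise ValueError("need at least 4 iterations")
    samples = {label: [] for label in tests}
    for _ in range(iterations):
        for label, argv in tests.items():
            samples[label].append(timer(argv))
    # Drop the first (warm-up) result of every test.
    return {label: runs[1:] for label, runs in samples.items()}


def summarize(runs):
    """Return (mean, sample standard deviation) of the kept runs."""
    return statistics.mean(runs), statistics.stdev(runs)


def write_ministat_input(path, runs):
    """Write one measurement per line, a layout ministat(1) can read."""
    with open(path, "w") as out:
        for value in runs:
            out.write(f"{value:.6f}\n")


# Hypothetical usage (the binary and script names are placeholders):
#   kept = collect_samples({
#       "gcc421": ["./perl-gcc421", "bench.pl"],
#       "gcc45": ["./perl-gcc45", "bench.pl"],
#   })
#   for label, runs in kept.items():
#       write_ministat_input(label + ".dat", runs)
#   then compare the files with: ministat gcc421.dat gcc45.dat
```

Interleaving the tests inside each iteration, rather than running all repetitions of A before B, spreads any slow drift in machine state across all candidates instead of charging it to one of them.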
Poul-Henning PS: Recommended reading: http://www.larrygonick.com/html/pub/books/sci7.html -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk@FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. From owner-freebsd-performance@FreeBSD.ORG Fri Mar 11 15:18:23 2011 Return-Path: Delivered-To: freebsd-performance@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id AC2751065672; Fri, 11 Mar 2011 15:18:23 +0000 (UTC) (envelope-from alexander@leidinger.net) Received: from mail.ebusiness-leidinger.de (mail.ebusiness-leidinger.de [217.11.53.44]) by mx1.freebsd.org (Postfix) with ESMTP id 6104A8FC15; Fri, 11 Mar 2011 15:18:23 +0000 (UTC) Received: from outgoing.leidinger.net (p5B15535C.dip.t-dialin.net [91.21.83.92]) by mail.ebusiness-leidinger.de (Postfix) with ESMTPSA id 0C58384400E; Fri, 11 Mar 2011 16:01:30 +0100 (CET) Received: from webmail.leidinger.net (unknown [IPv6:fd73:10c7:2053:1::2:102]) by outgoing.leidinger.net (Postfix) with ESMTP id A15642905; Fri, 11 Mar 2011 16:01:26 +0100 (CET) Received: (from www@localhost) by webmail.leidinger.net (8.14.4/8.13.8/Submit) id p2BF1LO4079981; Fri, 11 Mar 2011 16:01:21 +0100 (CET) (envelope-from Alexander@Leidinger.net) Received: from pslux.ec.europa.eu (pslux.ec.europa.eu [158.169.9.14]) by webmail.leidinger.net (Horde Framework) with HTTP; Fri, 11 Mar 2011 16:01:20 +0100 Message-ID: <20110311160120.16406m9ivk2id90c@webmail.leidinger.net> Date: Fri, 11 Mar 2011 16:01:20 +0100 From: Alexander Leidinger To: Martin Matuska References: <4D7943B1.1030604@FreeBSD.org> In-Reply-To: <4D7943B1.1030604@FreeBSD.org> MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8; DelSp="Yes"; format="flowed" Content-Disposition: inline Content-Transfer-Encoding: 7bit User-Agent: Dynamic Internet Messaging Program (DIMP) H3 (1.1.4) X-EBL-MailScanner-Information: Please 
contact the ISP for more information X-EBL-MailScanner-ID: 0C58384400E.A619A X-EBL-MailScanner: Found to be clean X-EBL-MailScanner-SpamCheck: not spam, spamhaus-ZEN, SpamAssassin (not cached, score=1.274, required 6, autolearn=disabled, RDNS_NONE 1.27) X-EBL-MailScanner-SpamScore: s X-EBL-MailScanner-From: alexander@leidinger.net X-EBL-MailScanner-Watermark: 1300460490.97498@fA3lm8covrc9iwJh8xbdzQ X-EBL-Spam-Status: No Cc: freebsd-performance@FreeBSD.org, freebsd-current@FreeBSD.org Subject: Re: FreeBSD Compiler Benchmark: gcc-base vs. gcc-ports vs. clang X-BeenThere: freebsd-performance@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Performance/tuning List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 11 Mar 2011 15:18:23 -0000

Quoting Martin Matuska (from Thu, 10 Mar 2011 22:33:37 +0100):
> Hi everyone,
>
> we have performed a benchmark of the perl binary compiled with base gcc,
> ports gcc and ports clang using the perlbench benchmark suite.
> Our benchmark was performed solely on amd64 with 10 different processors
> and we have tried different -march= flags to compare binary performance
> of the same compiler with different flags.
>
> Here is some statistics from the results:
> - clang falls 10% behind the base gcc 4.2.1 (test average)
> - gcc 4.5 from ports gives 5-10% better average performance than the
> base gcc 4.2.1

Can you rule out gcc-specific optimizations as a cause of this difference for clang?

As an example of what I mean: the configure script of LAME will use additional optimization flags if it detects gcc (even depending on the version of gcc). For clang (or other compilers which have flags similar to gcc's but are not identified as gcc) it will not add those flags. Another possibility is preprocessor checks for gcc-specific defines (in case clang does not provide the same predefined defines; I do not know).

Bye, Alexander.
-- This MUST be a good party -- My RIB CAGE is being painfully pressed up against someone's MARTINI!! http://www.Leidinger.net Alexander @ Leidinger.net: PGP ID = B0063FE7 http://www.FreeBSD.org netchild @ FreeBSD.org : PGP ID = 72077137 From owner-freebsd-performance@FreeBSD.ORG Fri Mar 11 15:42:09 2011 Return-Path: Delivered-To: freebsd-performance@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id E85BA106566B; Fri, 11 Mar 2011 15:42:08 +0000 (UTC) (envelope-from mm@FreeBSD.org) Received: from mail.vx.sk (mail.vx.sk [IPv6:2a01:4f8:100:1043::3]) by mx1.freebsd.org (Postfix) with ESMTP id 7D95A8FC18; Fri, 11 Mar 2011 15:42:08 +0000 (UTC) Received: from core.vx.sk (localhost [127.0.0.1]) by mail.vx.sk (Postfix) with ESMTP id BE58D1410E5; Fri, 11 Mar 2011 16:42:06 +0100 (CET) X-Virus-Scanned: amavisd-new at mail.vx.sk Received: from mail.vx.sk ([127.0.0.1]) by core.vx.sk (mail.vx.sk [127.0.0.1]) (amavisd-new, port 10024) with LMTP id Zm4m6Sdvw8Aa; Fri, 11 Mar 2011 16:42:04 +0100 (CET) Received: from [192.168.1.103] (chello089173152121.chello.sk [89.173.152.121]) by mail.vx.sk (Postfix) with ESMTPSA id 8B7CA1410D5; Fri, 11 Mar 2011 16:42:04 +0100 (CET) Message-ID: <4D7A42CC.8020807@FreeBSD.org> Date: Fri, 11 Mar 2011 16:42:04 +0100 From: Martin Matuska User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.13) Gecko/20101208 Thunderbird/3.1.7 MIME-Version: 1.0 To: Poul-Henning Kamp References: <90325.1299852096@critter.freebsd.dk> In-Reply-To: <90325.1299852096@critter.freebsd.dk> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Cc: freebsd-performance@FreeBSD.org, freebsd-current@FreeBSD.org Subject: Re: FreeBSD Compiler Benchmark: gcc-base vs. gcc-ports vs. 
clang X-BeenThere: freebsd-performance@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Performance/tuning List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 11 Mar 2011 15:42:09 -0000

I don't take this personally and fully understand your point.

But even if all the conditions you described are met, I am still not able to say "this is better", as I am not doing a microbenchmark. The +x% score is just an average of all test scores, each weighted by a factor of 1. This does not reflect any real application out there, as real applications don't use the tested functions in that exact weighting ratio. If one function had a score of 0%, the program would actually stall forever when executing this function, but the average score would still look promising :-)

But what I can say, e.g. for the Intel Atom processor, if there are performance gains in all but one test (that falls 2% behind), generic perl code (the routines benchmarked) on this processor is very likely to run faster with that setup. On the other hand, if clang-generated code falls short in all tests, I can say it is very likely that it will run slower.

But again, I am benchmarking just a subset of generic perl functions.

Cheers,
mm

On 11.03.2011 15:01, Poul-Henning Kamp wrote:
> In message <4D7943B1.1030604@FreeBSD.org>, Martin Matuska writes:
>
>> More information, detailed test results and test configuration are at
>> our blog:
>> http://blog.vx.sk/archives/25-FreeBSD-Compiler-Benchmark-gcc-base-vs-gcc-ports-vs-clang.html
> Please don't take this personally Martin, but you have triggered
> my periodic rant about proper running, evaluation and reporting of
> benchmarks.
>
> These results are not published at a level of detail that allows
> anybody to draw any kind of conclusions from them.
>
> In particular, your use of "overall best" result selection is totally
> bogus from a statistical point of view.
> > At the very least, we need to see standard-deviations on your numbers, > and preferably, when you claim that "X is N% better than Y", you should > also provide the confidence interval on that judgment, "Student's T" > being the canonical test. > > The ministat(1) program does both of these things, and is now in > FreeBSD/src, so there is absolutely no excuse for not using it. > > In practice this means that you have to run each test at least three > times, to get a standardeviation, and you have to make sure that > your testconditions are as identical as possible. > > Therefore, proper benchmarking procedure is something like: > > (boot machine single-user // Improves reproducibility) > (mount md(4)/malloc filesystem // ditto) > (newfs test-partition // ditto) > for at least 4 iterations: > run test A > run test B > run test C > ... > Throw first result away for all tests > Run remaining results through ministat(1) > > This was a public service announcement. > > Poul-Henning > > PS: Recommended reading: http://www.larrygonick.com/html/pub/books/sci7.html > From owner-freebsd-performance@FreeBSD.ORG Fri Mar 11 16:46:20 2011 Return-Path: Delivered-To: freebsd-performance@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 13390106566C; Fri, 11 Mar 2011 16:46:20 +0000 (UTC) (envelope-from phk@critter.freebsd.dk) Received: from phk.freebsd.dk (phk.freebsd.dk [130.225.244.222]) by mx1.freebsd.org (Postfix) with ESMTP id C8E7B8FC1A; Fri, 11 Mar 2011 16:46:19 +0000 (UTC) Received: from critter.freebsd.dk (critter.freebsd.dk [192.168.61.3]) by phk.freebsd.dk (Postfix) with ESMTP id 799755DB8; Fri, 11 Mar 2011 16:46:18 +0000 (UTC) Received: from critter.freebsd.dk (localhost [127.0.0.1]) by critter.freebsd.dk (8.14.4/8.14.4) with ESMTP id p2BGkIup098497; Fri, 11 Mar 2011 16:46:18 GMT (envelope-from phk@critter.freebsd.dk) To: Martin Matuska From: "Poul-Henning Kamp" In-Reply-To: Your message of 
"Fri, 11 Mar 2011 16:42:04 +0100." <4D7A42CC.8020807@FreeBSD.org> Date: Fri, 11 Mar 2011 16:46:18 +0000 Message-ID: <98496.1299861978@critter.freebsd.dk> Sender: phk@critter.freebsd.dk Cc: freebsd-performance@FreeBSD.org, freebsd-current@FreeBSD.org Subject: Re: FreeBSD Compiler Benchmark: gcc-base vs. gcc-ports vs. clang X-BeenThere: freebsd-performance@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Performance/tuning List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 11 Mar 2011 16:46:20 -0000 In message <4D7A42CC.8020807@FreeBSD.org>, Martin Matuska writes: >But what I can say, e.g. for the Intel Atom processor, if there are >performance gains in all but one test (that falls 2% behind), generic >perl code (the routines benchmarked) on this processor is very likely to >run faster with that setup. No, actually you cannot say that, unless you run all the tests at least three times for each compiler(+flag), calculate the average and standard deviation of all the tests, and see which, if any of the results are statistically significant. Until you do that, you numbers are meaningless, because we have no idea what the signal/noise ratio is. -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk@FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. 
From owner-freebsd-performance@FreeBSD.ORG Sat Mar 12 10:02:27 2011 Return-Path: Delivered-To: freebsd-performance@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 03ECE1065675; Sat, 12 Mar 2011 10:02:27 +0000 (UTC) (envelope-from mm@FreeBSD.org) Received: from mail.vx.sk (mail.vx.sk [IPv6:2a01:4f8:100:1043::3]) by mx1.freebsd.org (Postfix) with ESMTP id 8BBB68FC1F; Sat, 12 Mar 2011 10:02:26 +0000 (UTC) Received: from core.vx.sk (localhost [127.0.0.1]) by mail.vx.sk (Postfix) with ESMTP id 9E6A6141F80; Sat, 12 Mar 2011 11:02:25 +0100 (CET) X-Virus-Scanned: amavisd-new at mail.vx.sk Received: from mail.vx.sk ([127.0.0.1]) by core.vx.sk (mail.vx.sk [127.0.0.1]) (amavisd-new, port 10024) with LMTP id 8dDsTdjVAMte; Sat, 12 Mar 2011 11:02:23 +0100 (CET) Received: from [10.9.8.1] (chello085216231078.chello.sk [85.216.231.78]) by mail.vx.sk (Postfix) with ESMTPSA id 46B5A141F77; Sat, 12 Mar 2011 11:02:23 +0100 (CET) Message-ID: <4D7B44AF.7040406@FreeBSD.org> Date: Sat, 12 Mar 2011 11:02:23 +0100 From: Martin Matuska User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; sk; rv:1.8.1.23) Gecko/20090812 Lightning/0.9 Thunderbird/2.0.0.23 Mnenhy/0.7.5.0 MIME-Version: 1.0 To: Poul-Henning Kamp References: <98496.1299861978@critter.freebsd.dk> In-Reply-To: <98496.1299861978@critter.freebsd.dk> X-Enigmail-Version: 1.1.2 Content-Type: text/plain; charset=windows-1250 Content-Transfer-Encoding: 8bit Cc: freebsd-performance@FreeBSD.org, freebsd-current@FreeBSD.org Subject: Re: FreeBSD Compiler Benchmark: gcc-base vs. gcc-ports vs. 
clang X-BeenThere: freebsd-performance@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Performance/tuning List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 12 Mar 2011 10:02:27 -0000

Hi Poul-Henning,

I have redone the test for the majority of the processors, this time taking 5 samples of each whole test run and calculating the average, standard deviation, relative standard deviation, standard error and relative standard error.

The relative standard error is below 0.25% for ~91%, between 0.25% and 0.5% for ~7%, 0.5%-1.0% for ~1% and between 1.0%-2.0% for <1% of the tests. By a "test" I mean 5 runs for the same setting of the same compiler on the same processor.

So let's say I now have the string/base64 test for a core i7 showing the following (score +/- standard deviation):
gcc421: 82.7892 points +/- 0.8314 (1%)
gcc45-nocona: 96.0882 points +/- 1.1652 (1.21%)

For a relative comparison of two settings of the same test I could calculate the difference of averages = 13.299 (16.06%) points and sum of standard deviations = 2.4834 points (3.00%)

Therefore, assuming normally distributed results, I could say that:
With a 95% probability gcc45-nocona is faster than gcc421 by at least 10.18% (16.06 - 1.96x3.00) or with a 99.9% probability by at least 6.12% (16.06 - 3.2906x3.00).

So I should probably take a significance level (e.g. 95%, 99% or 99.9%) and normalize all the test scores for this level. Results that fall outside the interval (the difference is below zero) are then not significant.

What significance level should I take?

I hope this approach is better :)

On 11.03.2011 17:46, Poul-Henning Kamp wrote:
> In message <4D7A42CC.8020807@FreeBSD.org>, Martin Matuska writes:
>
>> But what I can say, e.g.
for the Intel Atom processor, if there are >> performance gains in all but one test (that falls 2% behind), generic >> perl code (the routines benchmarked) on this processor is very likely to >> run faster with that setup. > > No, actually you cannot say that, unless you run all the tests at > least three times for each compiler(+flag), calculate the average > and standard deviation of all the tests, and see which, if any of > the results are statistically significant. > > Until you do that, you numbers are meaningless, because we have no > idea what the signal/noise ratio is. > From owner-freebsd-performance@FreeBSD.ORG Sat Mar 12 11:33:18 2011 Return-Path: Delivered-To: freebsd-performance@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 5E4F31065670; Sat, 12 Mar 2011 11:33:18 +0000 (UTC) (envelope-from m.e.sanliturk@gmail.com) Received: from mail-vw0-f54.google.com (mail-vw0-f54.google.com [209.85.212.54]) by mx1.freebsd.org (Postfix) with ESMTP id E6A088FC08; Sat, 12 Mar 2011 11:33:17 +0000 (UTC) Received: by vws18 with SMTP id 18so1502771vws.13 for ; Sat, 12 Mar 2011 03:33:17 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:in-reply-to:references:date :message-id:subject:from:to:cc:content-type; bh=sC0AZo7A2J6ZpYhc16v8EiphF9GtDTBql8Di0rSF4PE=; b=xa1NMKosEWFBylhAi14r33iUrPJkNdfBN7BylrB6+JWo5D9LPgNPIpYOdClLqw2Qxn hC6wSTAUdK5xeirUVApg2+tNSFochhxuRYI67vVUlCVCenNY86Cz80EzLJSPiZIRX6xb B+WwBRHY1Ib7waozefb3RHmFIBqAXgNohvpnQ= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; b=RGFa1NDA46pqnpR5r9rnrA0AzCfPa+MCqmOjdWvnq92QZ9Jy1vdI1H8BoQkoHCp9yP 0g1kpvJLuyR62g8S8wUtfW4/i/mvczokWCtPhWwHZVFvYLe5khLF7DaEQHCJkP9Xw0V1 8BqFR567Xn3RDva0JYyzLvsze7pDhvOvBPxfU= MIME-Version: 1.0 Received: by 10.52.167.230 with SMTP id 
zr6mr15327704vdb.6.1299928286102; Sat, 12 Mar 2011 03:11:26 -0800 (PST) Received: by 10.52.169.165 with HTTP; Sat, 12 Mar 2011 03:11:26 -0800 (PST) In-Reply-To: <4D7B44AF.7040406@FreeBSD.org> References: <98496.1299861978@critter.freebsd.dk> <4D7B44AF.7040406@FreeBSD.org> Date: Sat, 12 Mar 2011 06:11:26 -0500 Message-ID: From: Mehmet Erol Sanliturk To: Martin Matuska Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: Poul-Henning Kamp , freebsd-current@freebsd.org, freebsd-performance@freebsd.org Subject: Re: FreeBSD Compiler Benchmark: gcc-base vs. gcc-ports vs. clang X-BeenThere: freebsd-performance@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Performance/tuning List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 12 Mar 2011 11:33:18 -0000 2011/3/12 Martin Matuska > Hi Poul-Henning, > > I have redone the test for majority of the processors, this time taking > 5 samples of each whole testrun, calculating the average, standard > deviation, relative standard deviation, standard error and relative > standard error. > > The relative standard error is below 0.25% for ~91%, between 0.25% and > 0.5% for ~7%, 0.5%-1.0% for ~1% and between 1.0%-2.0% for <1% of the > tests. Under a "test" I mean 5 runs for the same setting of the same > compiler on the same preocessor. 
>
> So let's say I now have the string/base64 test for a Core i7 showing the
> following (score +/- standard deviation):
> gcc421: 82.7892 points +/- 0.8314 (1%)
> gcc45-nocona: 96.0882 points +/- 1.1652 (1.21%)
>
> For a relative comparison of two settings of the same test I could
> calculate the difference of averages = 13.299 points (16.06%) and the sum of
> standard deviations = 2.4834 points (3.00%)
>
> Therefore, assuming normal distribution intervals, I could say that:
> With a 95% probability gcc45-nocona is faster than gcc421 by at least
> 10.18% (16.06 - 1.96x3.00) or with a 99.9% probability by at least 6.12%
> (16.06 - 3.2906x3.00).
>
> So I should probably take a significance level (e.g. 95%, 99% or 99.9%)
> and normalize all the test scores for this level. Results out of the
> interval (difference is below zero) are then not significant.
>
> What significance level should I take?
>
> I hope this approach is better :)
>
> On 11.03.2011 17:46, Poul-Henning Kamp wrote:
> > In message <4D7A42CC.8020807@FreeBSD.org>, Martin Matuska writes:
> >
> >> But what I can say, e.g. for the Intel Atom processor, if there are
> >> performance gains in all but one test (that falls 2% behind), generic
> >> perl code (the routines benchmarked) on this processor is very likely to
> >> run faster with that setup.
> >
> > No, actually you cannot say that, unless you run all the tests at
> > least three times for each compiler(+flag), calculate the average
> > and standard deviation of all the tests, and see which, if any, of
> > the results are statistically significant.
> >
> > Until you do that, your numbers are meaningless, because we have no
> > idea what the signal/noise ratio is.
> >
>
>

In addition to a possible answer from Poul-Henning Kamp, you may consider
the following pages, because the strength (sensitivity) of hypothesis tests
is determined by statistical power computations:

http://en.wikipedia.org/wiki/Statistical_power
http://en.wikipedia.org/wiki/Statistical_hypothesis_testing
http://en.wikipedia.org/wiki/Category:Hypothesis_testing
http://en.wikipedia.org/wiki/Category:Statistical_terminology

Thank you very much.

Mehmet Erol Sanliturk

From owner-freebsd-performance@FreeBSD.ORG Sat Mar 12 12:43:09 2011
In-Reply-To: <4D7B44AF.7040406@FreeBSD.org>
References: <98496.1299861978@critter.freebsd.dk> <4D7B44AF.7040406@FreeBSD.org>
Date: Sat, 12 Mar 2011 07:43:08 -0500
From: Mehmet Erol Sanliturk
To: Martin Matuska
Cc: Poul-Henning Kamp, freebsd-current@freebsd.org, freebsd-performance@freebsd.org
Subject: Re: FreeBSD Compiler Benchmark: gcc-base vs. gcc-ports vs. clang

2011/3/12 Martin Matuska

> Hi Poul-Henning,
>
> I have redone the test for the majority of the processors, this time taking
> 5 samples of each whole test run, calculating the average, standard
> deviation, relative standard deviation, standard error and relative
> standard error.
>
> The relative standard error is below 0.25% for ~91%, between 0.25% and
> 0.5% for ~7%, 0.5%-1.0% for ~1% and between 1.0%-2.0% for <1% of the
> tests.
...
> By a "test" I mean 5 runs for the same setting of the same
> compiler on the same processor.
>
>
...

To have VALID test results, it is NECESSARY to obtain the results by using
DIFFERENT computers. (This point is NOT mentioned in your message. I am
assuming that the SAME computer was used to get the results.)

If you repeat the same computations on the SAME computer, the values are
CORRELATED, and the t-test is NOT valid, because you are computing the mean
and standard deviation of CORRELATED values, where the correlation is
introduced by the SAME processor.

To obtain a proper set of test values, you may use the following setup
(Clang and GCC versions and compilation parameters will be the same on all
of the computers):

               Clang      GCC
               ---------  -------
Computer 1     v(1,1)     v(1,2)
Computer 2     v(2,1)     v(2,2)
.
.
.
Computer n     v(n,1)     v(n,2)

If you do NOT have that many computers, you may obtain test results from
other reliable sources by using the same compilation parameters.

Now it is possible to use a t-test on PAIRED values.

To determine the sample size, it is necessary to make power computations
BEFORE running the experiment, by specifying the required values a priori.

If you want to compare

( Clang Version x ) ... ( Clang Version y ) ( GCC Version x ) ... ( GCC version y ) ... etc.

that is, MORE than TWO compilers at the same time, it is necessary to use
MULTIPLE COMPARISONS. Using pairwise t-tests in isolation from the rest of
the results (with compilers as the variables) will give distorted results
unless the differences are significant at the 0.001 level (where the actual
significance level will be greater than 0.001, but very likely less than
0.05).

Such computations (paired t-test, power, multiple comparisons and others)
are available in the R statistical package, which is in the Ports.

It is my opinion that using different processor models with roughly
comparable speeds will not distort the results very much. Personally I
prefer such a setup with different processors. With it, it will be possible
to test the performance of the compilers on a mixture of processors (likely
independent of the processor model).

Thank you very much.
Mehmet Erol Sanliturk

From owner-freebsd-performance@FreeBSD.ORG Sat Mar 12 13:35:40 2011
To: Martin Matuska
From: "Poul-Henning Kamp"
In-Reply-To: Your message of "Sat, 12 Mar 2011 11:02:23 +0100." <4D7B44AF.7040406@FreeBSD.org>
Date: Sat, 12 Mar 2011 13:35:37 +0000
Message-ID: <60071.1299936937@critter.freebsd.dk>
Cc: freebsd-performance@FreeBSD.org, freebsd-current@FreeBSD.org
Subject: Re: FreeBSD Compiler Benchmark: gcc-base vs. gcc-ports vs. clang

In message <4D7B44AF.7040406@FreeBSD.org>, Martin Matuska writes:

Thanks a lot for doing this properly.

>What significance level should I take?

I think I set ministat(1) to use 95 % confidence level by default
and that is in general a pretty safe bet (1 in 20 chance)

>I hope this approach is better :)

Much, much better.

As I said, this was not to go after you personally, but to point
out that we need to be more rigorous with benchmarks in general.
--
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.
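
For readers who want to try the 95% comparison discussed in this thread, here is a minimal Python sketch. It is not ministat(1) and not the method from Martin's mail (which adds the two standard deviations and uses normal quantiles). Instead it swaps in a pooled-variance Student's t interval for the difference of two means. The function name pooled_diff_ci and the table T_CRIT_95 are made up for illustration. The inputs are the string/base64 figures quoted above, and the sketch assumes they are sample standard deviations over 5 independent runs each, which the thread does not state explicitly.

```python
import math

# Two-sided 95% critical values of Student's t by degrees of freedom.
# Only a few entries are listed; this table is an illustrative assumption.
T_CRIT_95 = {4: 2.776, 8: 2.306, 18: 2.101}


def pooled_diff_ci(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Return (difference, half_width) of a 95% interval for mean_b - mean_a.

    Uses the pooled-variance Student's t interval, which assumes independent,
    roughly normal samples with similar variance.
    """
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / df
    std_err = math.sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    return mean_b - mean_a, T_CRIT_95[df] * std_err


# Figures from the mail: gcc421 vs. gcc45-nocona, assumed 5 runs each.
base_mean = 82.7892
diff, half = pooled_diff_ci(base_mean, 0.8314, 5, 96.0882, 1.1652, 5)
low_pct = 100.0 * (diff - half) / base_mean
high_pct = 100.0 * (diff + half) / base_mean

print("difference: %.3f +/- %.3f points" % (diff, half))
print("relative to gcc421: %.2f%% to %.2f%%" % (low_pct, high_pct))
# The difference is significant at this level only if the interval excludes 0.
print("significant at 95%:", diff - half > 0 or diff + half < 0)
```

Under these assumptions the interval comes out at roughly 13.3 points plus or minus 1.5, well clear of zero. If the runs are correlated, as raised earlier in the thread, a paired design would be the more appropriate choice and this interval would not apply as is.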