From owner-freebsd-current@FreeBSD.ORG  Fri Mar 11 14:16:41 2011
Return-Path: <owner-freebsd-current@FreeBSD.ORG>
Delivered-To: freebsd-current@FreeBSD.org
To: Martin Matuska <mm@FreeBSD.org>
From: "Poul-Henning Kamp" <phk@phk.freebsd.dk>
In-Reply-To: Your message of "Thu, 10 Mar 2011 22:33:37 +0100."
	<4D7943B1.1030604@FreeBSD.org> 
Date: Fri, 11 Mar 2011 14:01:36 +0000
Message-ID: <90325.1299852096@critter.freebsd.dk>
Sender: phk@critter.freebsd.dk
Cc: freebsd-performance@FreeBSD.org, freebsd-current@FreeBSD.org
Subject: Re: FreeBSD Compiler Benchmark: gcc-base vs. gcc-ports vs. clang 

In message <4D7943B1.1030604@FreeBSD.org>, Martin Matuska writes:

>More information, detailed test results and test configuration are at
>our blog:
>http://blog.vx.sk/archives/25-FreeBSD-Compiler-Benchmark-gcc-base-vs-gcc-ports-vs-clang.html

Please don't take this personally, Martin, but you have triggered
my periodic rant about proper running, evaluation and reporting of
benchmarks.

These results are not published at a level of detail that allows
anybody to draw any kind of conclusions from them.

In particular, your use of "overall best" result selection is totally
bogus from a statistical point of view.

At the very least, we need to see standard deviations on your numbers,
and preferably, when you claim that "X is N% better than Y", you should
also provide the confidence interval on that judgment, "Student's t"
being the canonical test.
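
To make the request above concrete, here is a minimal Python sketch of a
pooled-variance Student's t comparison at the 95% level. It is an
illustration only, not ministat's implementation: the function names
(compare, significant) and the sample values in the usage note are
made up, and the small table of critical values covers only up to
10 degrees of freedom.

```python
import math
import statistics

# Two-sided 95% critical values of Student's t, keyed by degrees of freedom.
# Only small sample sizes are covered in this sketch.
T95 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
       6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262, 10: 2.228}


def compare(x, y):
    """Return (difference of means, 95% half-width) for y relative to x.

    Uses the classic pooled-variance Student's t interval, which assumes
    both samples come from distributions with the same variance.
    """
    nx, ny = len(x), len(y)
    if nx < 2 or ny < 2:
        raise ValueError("need at least two samples per data set")
    df = nx + ny - 2
    if df not in T95:
        raise ValueError("this sketch only tabulates up to 10 degrees of freedom")
    pooled_var = ((nx - 1) * statistics.variance(x) +
                  (ny - 1) * statistics.variance(y)) / df
    half_width = T95[df] * math.sqrt(pooled_var * (1.0 / nx + 1.0 / ny))
    diff = statistics.mean(y) - statistics.mean(x)
    return diff, half_width


def significant(x, y):
    """True if the 95% confidence interval on the difference excludes zero."""
    diff, half_width = compare(x, y)
    return abs(diff) > half_width
```

With hypothetical timings x = [10.0, 10.2, 9.8] and y = [12.0, 12.1, 11.9],
compare(x, y) gives a difference of about 2.0 with a half-width of about
0.36, so the difference can be reported as significant at 95% confidence.
If the interval straddles zero, no "better than" claim is justified.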

The ministat(1) program does both of these things, and is now in
FreeBSD/src, so there is absolutely no excuse for not using it.

In practice this means that you have to run each test at least three
times, to get a standard deviation, and you have to make sure that
your test conditions are as identical as possible.

Therefore, proper benchmarking procedure is something like:

	(boot machine single-user  	// Improves reproducibility)
	(mount md(4)/malloc filesystem	// ditto)
	(newfs test-partition		// ditto)
	for at least 4 iterations:
		run test A
		run test B
		run test C
		...
	Throw first result away for all tests
	Run remaining results through ministat(1)
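
The loop above could be sketched in Python roughly as follows. This is a
hedged illustration under stated assumptions: the function names
(run_benchmarks, write_for_ministat) and the one-file-per-test layout are
invented for the example, wall-clock time is assumed to be the metric of
interest, and the single-user boot, md(4) mount and newfs steps are left
to be done by hand beforehand.

```python
import os
import subprocess
import time


def run_benchmarks(tests, iterations=4, discard=1):
    """Time each test once per iteration, interleaved (A, B, C, A, B, C, ...).

    tests maps a label to an argv list. Returns a dict mapping each label
    to its wall-clock timings in seconds, with the first `discard`
    (warm-up) results thrown away.
    """
    if iterations - discard < 3:
        raise ValueError("need at least three kept results per test")
    timings = {label: [] for label in tests}
    for _ in range(iterations):
        for label, argv in tests.items():
            start = time.perf_counter()
            subprocess.run(argv, check=True)
            timings[label].append(time.perf_counter() - start)
    return {label: values[discard:] for label, values in timings.items()}


def write_for_ministat(results, directory="."):
    """Write one file per test, one number per line, and return the paths."""
    paths = []
    for label, samples in results.items():
        path = os.path.join(directory, label + ".txt")
        with open(path, "w") as out:
            for sample in samples:
                out.write("%.6f\n" % sample)
        paths.append(path)
    return paths
```

The resulting files can then be handed to ministat(1), for example
"ministat A.txt B.txt", which reports the standard deviations and whether
the difference is significant at the chosen confidence level.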

This was a public service announcement.

Poul-Henning

PS: Recommended reading: http://www.larrygonick.com/html/pub/books/sci7.html

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe    
Never attribute to malice what can adequately be explained by incompetence.