From owner-freebsd-current@FreeBSD.ORG  Wed Mar 16 06:00:47 2011
Date: Wed, 16 Mar 2011 02:00:46 -0400
Message-ID: <AANLkTi=zPQGk-7rwUMCSFHykGUGHiDMD-WiHut+tmGJe@mail.gmail.com>
From: Mehmet Erol Sanliturk <m.e.sanliturk@gmail.com>
To: freebsd-current <freebsd-current@freebsd.org>
Content-Type: text/plain; charset=UTF-8
X-Content-Filtered-By: Mailman/MimeDel 2.1.5
Subject: Comparison of quality of generated code by the compilers

One important attribute of a compiler is the quality of the code it generates.

To assess the difference in generated-code quality between compilers,
an experimental design may be used.


Assume the following design is used.

Select n distinct programs, each as large as possible, such that no source
file in one program appears in another program (compiler libraries
excepted). This prevents correlation between programs, which should be
independent of each other.


If the sample size is not computed from a power-of-the-test formula,
select a sample size greater than 15.

A sample size greater than 60 is extremely valuable.
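
As a rough illustration of computing the sample size from a power formula,
the sketch below uses the normal approximation for a two-sided paired
comparison. The function name, the default alpha and power, and the
standardized effect size are all assumptions made for this example, not
values given in this message.

```python
import math
from statistics import NormalDist


def paired_sample_size(effect_size, alpha=0.05, power=0.80):
    """Approximate number of programs needed for a two-sided paired test.

    effect_size is an assumed planning value: the expected mean paired
    difference divided by the standard deviation of the differences.
    The normal approximation is used, so the result is slightly too
    small when n turns out to be small.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided significance quantile
    z_beta = z.inv_cdf(power)           # quantile for the desired power
    return math.ceil(((z_alpha + z_beta) / effect_size) ** 2)


# Hypothetical planning value: a medium standardized effect of 0.5.
print(paired_sample_size(0.5))
```

Smaller assumed effects need more programs, which is consistent with the
remark that larger samples are more valuable.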

Only two compilers are compared.

All of the programs must be compilable by both compilers.
Execute the programs and record their success or failure in the following
structure:


Program    CLang     GCC
-------    ------    ------
1          0 or 1    0 or 1
2          0 or 1    0 or 1
.
.
.
n          0 or 1    0 or 1

where
      0 is success (correct results only, without a crash)
      1 is failure (crash or incorrect results).


When there are failures,
generate a cross tabulation of the above table, where each pair is
(CLang outcome, GCC outcome):


                  |  GCC Success (0)    |  GCC Failure (1)
------------------|---------------------|--------------------
CLang Success (0) |  count of (0, 0)    |  count of (0, 1)
                  |  pairs              |  pairs
------------------|---------------------|--------------------
CLang Failure (1) |  count of (1, 0)    |  count of (1, 1)
                  |  pairs              |  pairs
------------------|---------------------|--------------------
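
The cross tabulation can be produced mechanically from the per-program
records. The following is a minimal sketch; the function name and the six
sample outcomes are hypothetical and serve only to show the counting.

```python
from collections import Counter


def cross_tabulate(clang, gcc):
    """Count (CLang, GCC) outcome pairs, where 0 = success and 1 = failure.

    Returns a 2x2 list: rows are CLang success/failure,
    columns are GCC success/failure.
    """
    if len(clang) != len(gcc):
        raise ValueError("both compilers need one outcome per program")
    counts = Counter(zip(clang, gcc))
    return [[counts[(0, 0)], counts[(0, 1)]],
            [counts[(1, 0)], counts[(1, 1)]]]


# Hypothetical outcomes for six programs, for illustration only.
clang_outcomes = [0, 0, 1, 0, 1, 0]
gcc_outcomes = [0, 1, 1, 0, 0, 0]
table = cross_tabulate(clang_outcomes, gcc_outcomes)
for row in table:
    print(row)
```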


Depending on the table structure (especially the number of programs),
one of the following tests may be applied.

http://en.wikipedia.org/wiki/Barnard%27s_exact_test
( Barnard's test )

http://en.wikipedia.org/wiki/Fisher%27s_exact_test
( Fisher's exact test )

http://en.wikipedia.org/wiki/Chi-square_test
( Chi-square test )

http://en.wikipedia.org/wiki/Pearson%27s_chi-square_test
( Pearson's chi-square test )

If the difference (as measured by the contingency coefficient) is significant,
   the compiler with the smaller number of failures is the better one,
   and the compiler with the larger number of failures is the worse one.
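
As an illustration of one of the listed tests, here is a small
standard-library sketch of the two-sided Fisher's exact test on a 2x2
table. The function name and the sample counts are hypothetical. It sums
the hypergeometric probabilities of all tables that are no more likely
than the observed one, which is one common definition of the two-sided
p-value.

```python
from math import comb


def fisher_exact_two_sided(table):
    """Two-sided Fisher's exact test p-value for a 2x2 table of counts."""
    (a, b), (c, d) = table
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability that the top-left cell equals x,
        # with all row and column totals held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Small relative tolerance so ties are not lost to rounding.
    return sum(p for p in map(prob, range(lowest, highest + 1))
               if p <= p_observed * (1 + 1e-9))


# Hypothetical counts: rows are CLang success/failure,
# columns are GCC success/failure.
example = [[8, 2], [1, 5]]
print(fisher_exact_two_sided(example))
```

Note that each program is run under both compilers, so the observations
are paired. McNemar's test, which is not in the list above, is commonly
used for paired binary outcomes and may be worth considering as well.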


----------------------------------------------------------

Assume there are no failures, and execution times are available.


Program    CLang     GCC
-------    ------    ------
1          t         t
2          t         t
.
.
.
n          t         t


where t is the execution time of the program built with that compiler.

Apply a paired t test.

If the paired differences are significant,
   the compiler with the shorter mean execution time is the better one,
   and the compiler with the longer mean execution time is the worse one.
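
The paired t statistic itself is simple to compute. The sketch below uses
only the Python standard library; the function name and the four pairs of
timings are hypothetical. It returns the statistic and its degrees of
freedom, which can then be compared with a critical value from a t table,
since the standard library has no t distribution.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(times_a, times_b):
    """Return (t, degrees_of_freedom) for paired measurements.

    t = mean(d) / (stdev(d) / sqrt(n)), where d holds the
    per-program differences times_a[i] - times_b[i].
    """
    if len(times_a) != len(times_b):
        raise ValueError("each program needs a time for both compilers")
    if len(times_a) < 2:
        raise ValueError("at least two programs are required")
    diffs = [a - b for a, b in zip(times_a, times_b)]
    n = len(diffs)
    sd = stdev(diffs)  # sample standard deviation (n - 1 denominator)
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    return mean(diffs) / (sd / math.sqrt(n)), n - 1


# Hypothetical execution times in seconds, for illustration only.
clang_times = [10.0, 12.0, 9.0, 11.0]
gcc_times = [9.0, 10.0, 9.0, 9.0]
t, df = paired_t_statistic(clang_times, gcc_times)
print(t, df)
```

A positive t here means the first list has the larger mean, that is, the
longer execution times. With only four made-up pairs the result carries
no weight; a real comparison would use the sample sizes discussed earlier.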

---------------------------------------------------------

The above paired t test may also be applied to the generated program sizes.


If the paired differences are significant,
   the compiler with the smaller mean program size is the better one,
   and the compiler with the larger mean program size is the worse one.


Thank you very much.

Mehmet Erol Sanliturk