From owner-cvs-src@FreeBSD.ORG  Thu Nov  1 14:21:21 2007
Return-Path: <owner-cvs-src@FreeBSD.ORG>
Delivered-To: cvs-src@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 6234016A417;
	Thu,  1 Nov 2007 14:21:21 +0000 (UTC)
	(envelope-from brde@optusnet.com.au)
Received: from fallbackmx01.syd.optusnet.com.au
	(fallbackmx01.syd.optusnet.com.au [211.29.132.93])
	by mx1.freebsd.org (Postfix) with ESMTP id 1C59A13C4B6;
	Thu,  1 Nov 2007 14:21:19 +0000 (UTC)
	(envelope-from brde@optusnet.com.au)
Received: from mail02.syd.optusnet.com.au (mail02.syd.optusnet.com.au
	[211.29.132.183])
	by fallbackmx01.syd.optusnet.com.au (8.12.11.20060308/8.12.11) with
	ESMTP id lA1AZN5V026377; Thu, 1 Nov 2007 21:35:23 +1100
Received: from besplex.bde.org (c211-30-219-213.carlnfd3.nsw.optusnet.com.au
	[211.30.219.213])
	by mail02.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id
	lA1AYHar021619
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
	Thu, 1 Nov 2007 21:34:19 +1100
Date: Thu, 1 Nov 2007 21:34:17 +1100 (EST)
From: Bruce Evans <brde@optusnet.com.au>
X-X-Sender: bde@besplex.bde.org
To: Christoph Mallon <christoph.mallon@gmx.de>
In-Reply-To: <47292F79.9030102@gmx.de>
Message-ID: <20071101204931.C94960@besplex.bde.org>
References: <200710272232.l9RMWSbK072082@repoman.freebsd.org>
	<20071030200331.GA29309@toxic.magnesium.net>
	<20071031215526.GC89932@nagual.pp.ru> <47292F79.9030102@gmx.de>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: Andrey Chernov <ache@nagual.pp.ru>, src-committers@FreeBSD.org,
	Juli Mallett <juli@clockworksquid.com>, cvs-all@FreeBSD.org,
	cvs-src@FreeBSD.org
Subject: Re: cvs commit: src/include _ctype.h
X-BeenThere: cvs-src@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: CVS commit messages for the src tree <cvs-src.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/cvs-src>,
	<mailto:cvs-src-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/cvs-src>
List-Post: <mailto:cvs-src@freebsd.org>
List-Help: <mailto:cvs-src-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/cvs-src>,
	<mailto:cvs-src-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 01 Nov 2007 14:21:21 -0000

On Thu, 1 Nov 2007, Christoph Mallon wrote:

> Andrey Chernov wrote:
>> On Tue, Oct 30, 2007 at 10:03:31AM -1000, Juli Mallett wrote:
>>> * "Andrey A. Chernov" <ache@FreeBSD.org> [ 2007-10-27 ]
>>> 	[ cvs commit: src/include _ctype.h ]
>>>> ache        2007-10-27 22:32:28 UTC
>>>> 
>>>>   FreeBSD src repository
>>>> 
>>>>   Modified files:
>>>>     include              _ctype.h   Log:
>>>>   Micro-optimization of prev. commit, change
>>>>   (_c < 0 || _c >= 128) to (_c & ~0x7F)
>>> Isn't that a non-optimization in code and a minor pessimization of 
>>> readability?
>> ...
>> For ones who doubts there two tests compiled with -O2. As you may see the 
>> result is almost identical (andl vs cmpl):

We never doubted that it was a small negative or non-optimization :-).

Look closer and you will see that the andl version takes 2 extra
instructions, since both versions are smart enough to avoid a branch,
and for this they need the result of the condition code generated by
the andl or cmpl, and the cmpl generates the desired condition code
directly while 2 more instructions are needed after the andl.

>> -------------------- a.c --------------------
>> main () {
>> 
>> 	int c;
>> 
>> 	return (c & ~0x7f) ? 0 : c * 2;
>> }

This example has many flaws as pointed out by Cristoph:
- c is uninitialized
- the result depends on c in a way that is quite different than the table
   lookup for ctype.  The above expression happens to be more optimizable.

>> -------------------- a.s --------------------
>> 	.file	"a.c"
>> 	.text
>> 	.p2align 4,,15
>> .globl main
>> 	.type	main, @function
>> main:
>> 	leal	4(%esp), %ecx
>> 	andl	$-16, %esp
>> 	pushl	-4(%ecx)
>> 	movl	%eax, %edx             <--- extra instruction since andl
                                             clobbers a register.  Normally,
 					    testl should be used to avoid
 					    this clobber.
>> 	andl	$-128, %edx            <--- this sets %edx to something
                                             and also sets the condition
 					    codes, but not like we want
>> 	addl	%eax, %eax             <--- c * 2
>> 	cmpl	$1, %edx               <--- this sets the condition codes
                                             like we want
>> 	sbbl	%edx, %edx             <--- turn condition codes into a
                                             mask in %edx: mask = 0xffffffff
 					    if the result should be c *2
 					    and mask = 0 if the result should
 					    be 0
>> 	pushl	%ebp
>> 	andl	%edx, %eax             <--- result = (c * 2) & mask
>> 	movl	%esp, %ebp             <--- why is it bothering to set up
                                             a frame this late?
>> 	pushl	%ecx
>> 	popl	%ecx
>> 	popl	%ebp
>> 	leal	-4(%ecx), %esp
>> 	ret
>> 	.size	main, .-main
>> 	.ident	"GCC: (GNU) 4.2.1 20070719  [FreeBSD]"
>> -------------------- a1.c --------------------
>> main () {
>> 
>> 	int c;
>> 
>> 	return (c < 0 || c >= 128) ? 0 : c * 2;
>> 
>> 
>> }
>> -------------------- a1.s --------------------
>> 	.file	"a1.c"
>> 	.text
>> 	.p2align 4,,15
>> .globl main
>> 	.type	main, @function
>> main:
>> 	leal	4(%esp), %ecx
>> 	andl	$-16, %esp
>> 	pushl	-4(%ecx)
>> 	addl	%eax, %eax
>> 	cmpl	$128, %eax             <--- cmpl puts result in condition
                                             codes directly where we want it
>> 	sbbl	%edx, %edx             <--- same masking stuff ...
>> 	andl	%edx, %eax
>> 	pushl	%ebp
>> 	movl	%esp, %ebp
>> 	pushl	%ecx
>> 	popl	%ecx
>> 	popl	%ebp
>> 	leal	-4(%ecx), %esp
>> 	ret
>> 	.size	main, .-main
>> 	.ident	"GCC: (GNU) 4.2.1 20070719  [FreeBSD]"
>
> Your example is invalid. The value of c is undefined in this function and you 
> see random garbage as result (for example in the code snippet you see the c * 
> 2 (addl %eax, %eax) and after that is the cmpl, which uses %eax, too). In 
> fact it would be perfectly legal for the compiler to always return 0, call 
> abort(), or let demons fly out of your nose.

However, the uninitialized c = %eax seems to be transformed correctly in
both cases.  The first case even preserves %eax from the andl.

>
> Also the example is still unrealistic: You usually don't multiply chars by 
> two. Lets try something more realistic: an ASCII filter

Indeed.

Bruce