From: Christoph Mallon
Date: Thu, 01 Nov 2007 02:44:25 +0100
To: Andrey Chernov, Juli Mallett, src-committers@FreeBSD.ORG, cvs-src@FreeBSD.ORG, cvs-all@FreeBSD.ORG
Subject: Re: cvs commit: src/include _ctype.h
Message-ID: <47292F79.9030102@gmx.de>
In-Reply-To: <20071031215526.GC89932@nagual.pp.ru>
References: <200710272232.l9RMWSbK072082@repoman.freebsd.org> <20071030200331.GA29309@toxic.magnesium.net> <20071031215526.GC89932@nagual.pp.ru>

Andrey Chernov wrote:
> On Tue, Oct 30, 2007 at 10:03:31AM -1000, Juli Mallett wrote:
>> * "Andrey A. Chernov" [ 2007-10-27 ]
>>     [ cvs commit: src/include _ctype.h ]
>>> ache        2007-10-27 22:32:28 UTC
>>>
>>>   FreeBSD src repository
>>>
>>>   Modified files:
>>>     include              _ctype.h
>>>   Log:
>>>   Micro-optimization of prev. commit, change
>>>   (_c < 0 || _c >= 128) to (_c & ~0x7F)
>> Isn't that a non-optimization in code and a minor pessimization of readability?
>> Maybe I'm getting rusty, but those seem to result in nearly identical code on
>> i386 with a relatively modern GCC.  Did you look at the compiler output for this
>> optimization?  Is there a specific expensive instruction you're trying to avoid?
>> For such thoroughly bit-aligned range checks, you shouldn't even get a branch
>> for the former case.  Is there a platform other than i386 I should look at where
>> the previous expression is more clearly pessimized?  Or a different compiler
>> than GCC?
>
> For those who doubt, here are two tests compiled with -O2. As you can see, the
> result is almost identical (andl vs cmpl):
> -------------------- a.c --------------------
> main () {
>
>  int c;
>
>  return (c & ~0x7f) ? 0 : c * 2;
> }
> -------------------- a.s --------------------
>         .file   "a.c"
>         .text
>         .p2align 4,,15
> .globl main
>         .type   main, @function
> main:
>         leal    4(%esp), %ecx
>         andl    $-16, %esp
>         pushl   -4(%ecx)
>         movl    %eax, %edx
>         andl    $-128, %edx
>         addl    %eax, %eax
>         cmpl    $1, %edx
>         sbbl    %edx, %edx
>         pushl   %ebp
>         andl    %edx, %eax
>         movl    %esp, %ebp
>         pushl   %ecx
>         popl    %ecx
>         popl    %ebp
>         leal    -4(%ecx), %esp
>         ret
>         .size   main, .-main
>         .ident  "GCC: (GNU) 4.2.1 20070719  [FreeBSD]"
> -------------------- a1.c --------------------
> main () {
>
>  int c;
>
>  return (c < 0 || c >= 128) ? 0 : c * 2;
>
>
> }
> -------------------- a1.s --------------------
>         .file   "a1.c"
>         .text
>         .p2align 4,,15
> .globl main
>         .type   main, @function
> main:
>         leal    4(%esp), %ecx
>         andl    $-16, %esp
>         pushl   -4(%ecx)
>         addl    %eax, %eax
>         cmpl    $128, %eax
>         sbbl    %edx, %edx
>         andl    %edx, %eax
>         pushl   %ebp
>         movl    %esp, %ebp
>         pushl   %ecx
>         popl    %ecx
>         popl    %ebp
>         leal    -4(%ecx), %esp
>         ret
>         .size   main, .-main
>         .ident  "GCC: (GNU) 4.2.1 20070719  [FreeBSD]"

Your example is invalid. The value of c is undefined in this function, so
what you see as the result is random garbage. For example, in the second
listing the c * 2 (addl %eax, %eax) comes first and the cmpl that follows
uses %eax too, so it compares the already doubled value. In fact it would be
perfectly legal for the compiler to always return 0, call abort(), or let
demons fly out of your nose.

The example is also unrealistic: you usually don't multiply chars by two.
Let's try something more realistic, an ASCII filter:

int filter_ascii0(int c)
{
        return c < 0 || c >= 128 ? '?' : c;
}

int filter_ascii1(int c)
{
        return c & ~0x7F ? '?' : c;
}

Note in particular that c is not dead after the condition here. Even if your
example had not used an undefined value, the value of c would be dead after
the test there, which is unlikely in typical string handling code.

And now the compiled code (GCC 3.4.6 with -O2 -march=athlon-xp
-fomit-frame-pointer; I used these switches to get more compact code, they
have no influence on the condition test):

00000000 <filter_ascii0>:
   0:   8b 54 24 04             mov    0x4(%esp),%edx
   4:   b8 3f 00 00 00          mov    $0x3f,%eax
   9:   83 fa 7f                cmp    $0x7f,%edx
   c:   0f 46 c2                cmovbe %edx,%eax
   f:   c3                      ret

00000010 <filter_ascii1>:
  10:   8b 54 24 04             mov    0x4(%esp),%edx
  14:   b8 3f 00 00 00          mov    $0x3f,%eax
  19:   f7 c2 80 ff ff ff       test   $0xffffff80,%edx
  1f:   0f 44 c2                cmove  %edx,%eax
  22:   c3                      ret

As you can see, a test instruction is used in filter_ascii1, because the
value in %edx does not die at the test but is used again in the cmove.

        Christoph
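
The discussion takes for granted that both conditions accept and reject exactly the same values, so that only the generated code differs. A minimal sketch for checking that assumption follows. It assumes a two's-complement int, and the names is_non_ascii_cmp, is_non_ascii_mask and predicates_agree are hypothetical helpers, not taken from the thread or from _ctype.h.

```c
#include <assert.h>
#include <limits.h>

/* Range-check form of the condition, as it was before the commit. */
int is_non_ascii_cmp(int c)
{
        return c < 0 || c >= 128;
}

/* Mask form of the condition, as it is after the commit.
 * Assumption: int is two's complement, so any negative c keeps its
 * sign bit set after masking with ~0x7F and therefore tests nonzero. */
int is_non_ascii_mask(int c)
{
        return (c & ~0x7F) != 0;
}

/* Hypothetical helper: returns 1 if both predicates agree for every
 * value in [lo, hi], 0 otherwise. The counter is a long long so the
 * loop also terminates when hi == INT_MAX. */
int predicates_agree(int lo, int hi)
{
        long long c;

        for (c = lo; c <= hi; c++) {
                if (is_non_ascii_cmp((int)c) != is_non_ascii_mask((int)c))
                        return 0;
        }
        return 1;
}
```

Calling predicates_agree(-100000, 100000), or a range ending at INT_MAX, from a small test program is one way to use it. A check like this only confirms that the two expressions are semantically interchangeable; which of them compiles to better code still has to be read from the assembly.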