Date: Thu, 01 Nov 2007 02:44:25 +0100
From: Christoph Mallon <christoph.mallon@gmx.de>
To: Andrey Chernov <ache@nagual.pp.ru>, Juli Mallett <juli@clockworksquid.com>, src-committers@FreeBSD.ORG, cvs-src@FreeBSD.ORG, cvs-all@FreeBSD.ORG
Subject: Re: cvs commit: src/include _ctype.h
Message-ID: <47292F79.9030102@gmx.de>
In-Reply-To: <20071031215526.GC89932@nagual.pp.ru>
References: <200710272232.l9RMWSbK072082@repoman.freebsd.org> <20071030200331.GA29309@toxic.magnesium.net> <20071031215526.GC89932@nagual.pp.ru>

Andrey Chernov wrote:
> On Tue, Oct 30, 2007 at 10:03:31AM -1000, Juli Mallett wrote:
>> * "Andrey A. Chernov" <ache@FreeBSD.org> [ 2007-10-27 ]
>>	[ cvs commit: src/include _ctype.h ]
>>> ache        2007-10-27 22:32:28 UTC
>>>
>>>   FreeBSD src repository
>>>
>>>   Modified files:
>>>     include              _ctype.h
>>>   Log:
>>>   Micro-optimization of prev. commit, change
>>>   (_c < 0 || _c >= 128) to (_c & ~0x7F)
>> Isn't that a non-optimization in code and a minor pessimization of readability?
>> Maybe I'm getting rusty, but those seem to result in nearly identical code on
>> i386 with a relatively modern GCC.  Did you look at the compiler output for this
>> optimization?  Is there a specific expensive instruction you're trying to avoid?
>> For such thoroughly bit-aligned range checks, you shouldn't even get a branch
>> for the former case.  Is there a platform other than i386 I should look at where
>> the previous expression is more clearly pessimized?  Or a different compiler
>> than GCC?
>
> For ones who doubts there two tests compiled with -O2. As you may see the
> result is almost identical (andl vs cmpl):
> -------------------- a.c --------------------
> main () {
>
>	int c;
>
>	return (c & ~0x7f) ? 0 : c * 2;
> }
> -------------------- a.s --------------------
>	.file	"a.c"
>	.text
>	.p2align 4,,15
> .globl main
>	.type	main, @function
> main:
>	leal	4(%esp), %ecx
>	andl	$-16, %esp
>	pushl	-4(%ecx)
>	movl	%eax, %edx
>	andl	$-128, %edx
>	addl	%eax, %eax
>	cmpl	$1, %edx
>	sbbl	%edx, %edx
>	pushl	%ebp
>	andl	%edx, %eax
>	movl	%esp, %ebp
>	pushl	%ecx
>	popl	%ecx
>	popl	%ebp
>	leal	-4(%ecx), %esp
>	ret
>	.size	main, .-main
>	.ident	"GCC: (GNU) 4.2.1 20070719  [FreeBSD]"
> -------------------- a1.c --------------------
> main () {
>
>	int c;
>
>	return (c < 0 || c >= 128) ? 0 : c * 2;
>
>
> }
> -------------------- a1.s --------------------
>	.file	"a1.c"
>	.text
>	.p2align 4,,15
> .globl main
>	.type	main, @function
> main:
>	leal	4(%esp), %ecx
>	andl	$-16, %esp
>	pushl	-4(%ecx)
>	addl	%eax, %eax
>	cmpl	$128, %eax
>	sbbl	%edx, %edx
>	andl	%edx, %eax
>	pushl	%ebp
>	movl	%esp, %ebp
>	pushl	%ecx
>	popl	%ecx
>	popl	%ebp
>	leal	-4(%ecx), %esp
>	ret
>	.size	main, .-main
>	.ident	"GCC: (GNU) 4.2.1 20070719  [FreeBSD]"

Your example is invalid. c is never initialized in this function, so its
value is undefined and you see random garbage as a result. For example, in
a1.s the c * 2 (addl %eax, %eax) comes first and the cmpl after it uses
%eax, too, so the comparison is done on the already doubled value. In fact
it would be perfectly legal for the compiler to always return 0, call
abort(), or let demons fly out of your nose.

Also, the example is still unrealistic: you usually don't multiply chars by
two. Let's try something more realistic: an ASCII filter

int filter_ascii0(int c)
{
	return c < 0 || c >= 128 ? '?' : c;
}

int filter_ascii1(int c)
{
	return c & ~0x7F ? '?' : c;
}

Note especially that c is not dead after the condition here. Even if your
example had not used an undefined value, the value of c would be dead after
the test, which is unlikely for typical string handling code.

And now the compiled code (GCC 3.4.6 with -O2 -march=athlon-xp
-fomit-frame-pointer - I used these switches to get more compact code. They
have no influence on the condition test.):

00000000 <filter_ascii0>:
   0:	8b 54 24 04          	mov    0x4(%esp),%edx
   4:	b8 3f 00 00 00       	mov    $0x3f,%eax
   9:	83 fa 7f             	cmp    $0x7f,%edx
   c:	0f 46 c2             	cmovbe %edx,%eax
   f:	c3                   	ret

00000010 <filter_ascii1>:
  10:	8b 54 24 04          	mov    0x4(%esp),%edx
  14:	b8 3f 00 00 00       	mov    $0x3f,%eax
  19:	f7 c2 80 ff ff ff    	test   $0xffffff80,%edx
  1f:	0f 44 c2             	cmove  %edx,%eax
  22:	c3                   	ret

You see that a test instruction is used in filter_ascii1, because the value
in %edx does not die at the test, but is used again in the cmove.

	Christoph