Date: Thu, 1 Nov 2007 01:19:35 +0300
From: Andrey Chernov <ache@nagual.pp.ru>
To: Christoph Mallon <christoph.mallon@gmx.de>
Cc: cvs-src@FreeBSD.ORG, src-committers@FreeBSD.ORG, cvs-all@FreeBSD.ORG
Subject: Re: cvs commit: src/include _ctype.h
Message-ID: <20071031221934.GA90781@nagual.pp.ru>
In-Reply-To: <47264710.2000500@gmx.de>
References: <200710272232.l9RMWSbK072082@repoman.freebsd.org> <47264710.2000500@gmx.de>

On Mon, Oct 29, 2007 at 09:48:16PM +0100, Christoph Mallon wrote:
> Andrey A. Chernov wrote:
>> ache        2007-10-27 22:32:28 UTC
>>
>>   FreeBSD src repository
>>
>>   Modified files:
>>     include              _ctype.h
>>   Log:
>>   Micro-optimization of prev. commit, change
>>   (_c < 0 || _c >= 128) to (_c & ~0x7F)
>>
>>   Revision  Changes    Path
>>   1.33      +1 -1      src/include/_ctype.h
>
> Actually this is rather a micro-pessimisation. Every compiler worth its
> money transforms the range check into single unsigned comparison. The
> latter test on the other hand on x86 gets probably transformed into a test
> instruction. This instruction has no form with sign extended 8bit
> immediate, but only with 32bit immediate. This results in a significantly
> longer opcode (three bytes more) than a single (unsigned)_c > 127, which a
> sane compiler produces. I suspect some RISC machines need one more
> instruction for the "micro-optimised" code, too.
> In theory GCC could transform the _c & ~0x7F back into a
> (unsigned)_c > 127, but it does not do this (the only compiler I found,
> which does this transformation, is LLVM).
> Further IMO it is hard to decipher what _c & ~0x7F is supposed to do.

1. My variant is independent of the compiler optimization level. E.g.
with optimization turned off completely, the range-check transformation
you talk about does not happen at all, and very long asm code is
generated. I also mean the cases where optimization was removed to avoid
a gcc optimization bug (as when compiling a large part of the Xorg server
recently), the cases where non-gcc compilers are used, etc.

2. _c & ~0x7F comes right from is{w}ascii(), so it is not such an
enormous problem to decipher. I just want to keep all of ctype in the
same style.

3. I see no "longer opcode (three bytes more)" you talk about in my tests
(andl vs. cmpl was there, no testl).

-- 
http://ache.pp.ru/
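
For reference, the three forms of the check discussed in this thread can be compared side by side. The following is only a minimal sketch, not the code from src/include/_ctype.h: the function names and the sample range are hypothetical, and it assumes a two's-complement int, so that a negative value always has bits set outside the low seven.

```c
#include <limits.h>

/* Range check as written before revision 1.33. */
static int non_ascii_range(int c)
{
	return (c < 0 || c >= 128);
}

/* Mask check from revision 1.33: nonzero if any bit above bit 6 is set. */
static int non_ascii_mask(int c)
{
	return ((c & ~0x7F) != 0);
}

/* Single unsigned comparison, the form mentioned in the quoted message. */
static int non_ascii_unsigned(int c)
{
	return ((unsigned)c > 127);
}

/*
 * Hypothetical helper: counts the inputs on which the three forms
 * disagree, over a range around zero plus the extreme int values.
 */
int count_mismatches(void)
{
	static const int extremes[] = { INT_MIN, INT_MIN + 1, INT_MAX - 1, INT_MAX };
	int bad = 0;
	int c;
	unsigned i;

	for (c = -70000; c <= 70000; c++) {
		int r = non_ascii_range(c);
		if (r != non_ascii_mask(c) || r != non_ascii_unsigned(c))
			bad++;
	}
	for (i = 0; i < sizeof(extremes) / sizeof(extremes[0]); i++) {
		int r = non_ascii_range(extremes[i]);
		if (r != non_ascii_mask(extremes[i]) ||
		    r != non_ascii_unsigned(extremes[i]))
			bad++;
	}
	return (bad);
}
```

The sketch only checks that the three forms return the same result. Which instructions a given compiler emits for each form (andl, cmpl, or testl) depends on the compiler and optimization level and has to be checked in the generated assembly, e.g. with cc -S.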