Date: Thu, 01 Nov 2007 02:52:13 +0100
From: Christoph Mallon <christoph.mallon@gmx.de>
To: Andrey Chernov <ache@nagual.pp.ru>, Christoph Mallon <christoph.mallon@gmx.de>, src-committers@FreeBSD.ORG, cvs-src@FreeBSD.ORG, cvs-all@FreeBSD.ORG
Subject: Re: cvs commit: src/include _ctype.h
Message-ID: <4729314D.1090709@gmx.de>
In-Reply-To: <20071031221934.GA90781@nagual.pp.ru>
References: <200710272232.l9RMWSbK072082@repoman.freebsd.org> <47264710.2000500@gmx.de> <20071031221934.GA90781@nagual.pp.ru>
Andrey Chernov wrote:
> On Mon, Oct 29, 2007 at 09:48:16PM +0100, Christoph Mallon wrote:
>> Andrey A. Chernov wrote:
>>> ache 2007-10-27 22:32:28 UTC
>>> FreeBSD src repository
>>> Modified files:
>>> include _ctype.h Log:
>>> Micro-optimization of prev. commit, change
>>> (_c < 0 || _c >= 128) to (_c & ~0x7F)
>>> Revision Changes Path
>>> 1.33 +1 -1 src/include/_ctype.h
>> Actually this is rather a micro-pessimisation. Every compiler worth its
>> money transforms the range check into single unsigned comparison. The
>> latter test on the other hand on x86 gets probably transformed into a test
>> instruction. This instruction has no form with sign extended 8bit
>> immediate, but only with 32bit immediate. This results in a significantly
>> longer opcode (three bytes more) than a single (unsigned)_c > 127, which a
>> sane compiler produces. I suspect some RISC machines need one more
>> instruction for the "micro-optimised" code, too.
>> In theory GCC could transform the _c & ~0x7F back into a (unsigned)_c >
>> 127, but it does not do this (the only compiler I found, which does this
>> transformation, is LLVM).
>> Further IMO it is hard to decipher what _c & ~0x7F is supposed to do.
>
> 1. My variant is compiler optimization level independent. F.e. without
> optimization completely there is no range check transform you talk about
> at all and very long asm code is generated. I also mean the case where gcc
> optimization bug was avoided, removing optimization (like compiling large
> part of Xorg server recently), using non-gcc compilers etc. cases.

Compiling without any optimisations makes the code slow for a zillion
other reasons (no load/store optimisations, constant folding, common
subexpression elimination, if-conversion, partial redundant expression
elimination, strength reduction, reassociation, code placement, and many
more), so an untransformed range check is really not of any concern.

> 2. _c & ~0x7F comes right from is{w}ascii() so there is no such enormously
> big problems to decifer. I just want to keep all ctype in style.

Repeating cryptic code does not make it better, IMO.

> 3. I see no "longer opcode (three bytes more)" you talk about in my tests
> (andl vs cmpl was there, no testl).

See the reply to the mail with your code example.

Christoph
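
For illustration only (this is not part of the original mail): the three
forms discussed in this thread can be written side by side to show that
they accept and reject the same values. This is a minimal sketch that
assumes a two's complement int; the function names out_of_ascii_range,
out_of_ascii_mask, out_of_ascii_unsigned and forms_agree are hypothetical
and do not appear in _ctype.h. It says nothing about which form produces
shorter machine code, which depends on the compiler and target.

```c
#include <assert.h>

/* Form before the commit: an explicit signed range check. */
static int out_of_ascii_range(int c)
{
    return c < 0 || c >= 128;
}

/* Form after the commit: any bit above the low seven means non-ASCII.
 * The "!= 0" is added here only to normalise the result to 0 or 1. */
static int out_of_ascii_mask(int c)
{
    return (c & ~0x7F) != 0;
}

/* Single unsigned comparison: negative values wrap to large unsigned
 * values, so one compare covers both ends of the range. */
static int out_of_ascii_unsigned(int c)
{
    return (unsigned)c > 127;
}

/* Returns 1 if all three forms agree for every value in [lo, hi]. */
static int forms_agree(int lo, int hi)
{
    assert(lo <= hi);
    /* long long counter avoids overflow if hi is the largest int. */
    for (long long c = lo; c <= hi; c++) {
        int r = out_of_ascii_range((int)c);
        if (out_of_ascii_mask((int)c) != r)
            return 0;
        if (out_of_ascii_unsigned((int)c) != r)
            return 0;
    }
    return 1;
}
```

Calling forms_agree(-1000, 1000) exercises both boundaries (below 0 and
at 128) and is enough to see that the disagreement in the thread is about
readability and code generation, not about behaviour.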