Date:      Thu, 01 Nov 2007 02:52:13 +0100
From:      Christoph Mallon <christoph.mallon@gmx.de>
To:        Andrey Chernov <ache@nagual.pp.ru>,  Christoph Mallon <christoph.mallon@gmx.de>, src-committers@FreeBSD.ORG, cvs-src@FreeBSD.ORG, cvs-all@FreeBSD.ORG
Subject:   Re: cvs commit: src/include _ctype.h
Message-ID:  <4729314D.1090709@gmx.de>
In-Reply-To: <20071031221934.GA90781@nagual.pp.ru>
References:  <200710272232.l9RMWSbK072082@repoman.freebsd.org> <47264710.2000500@gmx.de> <20071031221934.GA90781@nagual.pp.ru>

Andrey Chernov wrote:
> On Mon, Oct 29, 2007 at 09:48:16PM +0100, Christoph Mallon wrote:
>> Andrey A. Chernov wrote:
>>> ache        2007-10-27 22:32:28 UTC
>>>   FreeBSD src repository
>>>   Modified files:
>>>     include              _ctype.h
>>>   Log:
>>>   Micro-optimization of prev. commit, change
>>>   (_c < 0 || _c >= 128) to (_c & ~0x7F)
>>>     Revision  Changes    Path
>>>   1.33      +1 -1      src/include/_ctype.h
>> Actually this is rather a micro-pessimisation. Every compiler worth its 
>> money transforms the range check into single unsigned comparison. The 
>> latter test on the other hand on x86 gets probably transformed into a test 
>> instruction. This instruction has no form with sign extended 8bit 
>> immediate, but only with 32bit immediate. This results in a significantly 
>> longer opcode (three bytes more) than a single (unsigned)_c > 127, which a 
>> sane compiler produces. I suspect some RISC machines need one more 
>> instruction for the "micro-optimised" code, too.
>> In theory GCC could transform the _c & ~0x7F back into a (unsigned)_c > 
>> 127, but it does not do this (the only compiler I found, which does this 
>> transformation, is LLVM).
>> Further IMO it is hard to decipher what _c & ~0x7F is supposed to do.
> 
> 1. My variant is independent of the compiler optimization level. E.g. with 
> optimization disabled completely, the range check transform you talk about 
> does not happen at all and very long asm code is generated. I also mean cases 
> where optimization was removed to avoid a gcc optimization bug (like compiling 
> a large part of the Xorg server recently), cases using non-gcc compilers, etc.

Compiling without any optimisations makes the code slow for a zillion 
other reasons (no load/store optimisations, constant folding, common 
subexpression elimination, if-conversion, partial redundancy 
elimination, strength reduction, reassociation, code placement, and many 
more), so an untransformed range check is really of no concern.
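
For reference, the three forms under discussion can be put side by side. 
This is only a minimal sketch: the function names are made up for 
illustration and do not appear in _ctype.h, and it assumes a two's 
complement int, which is what makes the mask test catch negative values. 
It makes no claim about which instructions a given compiler emits.

```c
#include <assert.h>

/* Hypothetical helper names, chosen for this sketch only. */

/* The range check as it was written before the commit. */
static int out_of_ascii_range(int c)
{
	return c < 0 || c >= 128;
}

/* The mask test from the commit, normalised to 0 or 1.
 * Assumes two's complement: any negative c has high bits set. */
static int out_of_ascii_mask(int c)
{
	return (c & ~0x7F) != 0;
}

/* The single unsigned comparison an optimiser can derive from the
 * range check: negative values wrap to large unsigned values. */
static int out_of_ascii_unsigned(int c)
{
	return (unsigned)c > 127;
}

/* Check that all three forms agree on a sample interval around 0. */
static void check_forms_agree(void)
{
	int c;

	for (c = -1024; c <= 1024; c++) {
		assert(out_of_ascii_range(c) == out_of_ascii_unsigned(c));
		assert(out_of_ascii_range(c) == out_of_ascii_mask(c));
	}
}
```

Calling check_forms_agree() from a test program should pass silently if 
the assumption above holds on the target.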

> 2. _c & ~0x7F comes right from is{w}ascii(), so it is not such an enormously
> big problem to decipher. I just want to keep all of ctype in one style.

Repeating cryptic code does not make it better, IMO.

> 3. I see no "longer opcode (three bytes more)" you talk about in my tests 
> (andl vs cmpl was there, no testl).

See the reply to the mail with your code example.

	Christoph


