Date:      Tue, 8 May 2007 16:37:03 -0500 (CDT)
From:      "Sean C. Farley" <sean-freebsd@farley.org>
To:        Bruce Evans <bde@zeta.org.au>
Cc:        Daniel Eischen <deischen@freebsd.org>, arch@freebsd.org
Subject:   Re: HEADS DOWN
Message-ID:  <20070508162458.G6015@baba.farley.org>
In-Reply-To: <20070506091835.A43775@besplex.bde.org>
References:  <Pine.GSO.4.64.0705021332020.8590@sea.ntplx.net> <20070502183100.P1317@baba.farley.org> <Pine.GSO.4.64.0705022034180.8590@sea.ntplx.net> <20070502230413.Y30614@thor.farley.org> <20070503160351.GA15008@nagual.pp.ru> <20070504085905.J39482@thor.farley.org> <20070504213312.GA33163@nagual.pp.ru> <20070504174657.D1343@thor.farley.org> <20070505213202.GA49925@nagual.pp.ru> <20070505163707.J6670@thor.farley.org> <20070505221125.GA50439@nagual.pp.ru> <20070506091835.A43775@besplex.bde.org>

On Sun, 6 May 2007, Bruce Evans wrote:

> On Sun, 6 May 2007, Andrey Chernov wrote:
>
>> On Sat, May 05, 2007 at 04:48:44PM -0500, Sean C. Farley wrote:
>>>  I have the same assembly output.  Inlined __strleneq() ends up
>>>  being faster on my system than GCC's strlen() when I changed all
>>>  calls where checkEquals equaled false.  I believe you that it
>>>  should be faster with GCC's version, but it is not ending up that
>>>  way on my Athlon XP and Pentium 4 systems running FreeBSD 6.2.
>>>
>>>  There is now a sysenv-strlen.c that I tested with the timings.c
>>>  program in the regressions/environment directory.  It keeps
>>>  showing __strleneq() to be faster.
>> 
>> I wonder how it is possible. After the "if", your variant becomes
>> .L13:
>>        incl    %eax
>>        cmpb    $0, (%eax)
>>        jne .L13
>> which should be slower in general than gcc's.
>
> No, it should be faster on most machines.  I just happened to look at
> an optimization manual which reminded me that most string instructions
> should never be used since they have large setup overheads and most of
> them are slower even after setup.  I thought that scasb wasn't so bad,
> but the manual went as far as saying that scasb is one of the string
> instructions that should never be used.

<nice comparison of assembly instructions snipped>

> Of course, optimizing strlen() is unimportant, since even the slowest
> method runs at nearly 1GB/s on modern machines and you rarely have
> more than a few MB of strings to process.

Here is a comparison of running setenv(name, value, 1) 1000 times, using
strlen() (in the calls that do not look for an '=' character) versus the
inlined strlen; these are the x and + data sets respectively:

x setenv-strlen-1000.txt
+ setenv-inline-1000.txt
+--------------------------------------------------------------------------+
|    +                                                    x x              |
|  + +++                                                  x x              |
|  + +++                +                                 x x   x         x|
||____MA_____|                                           |__MA____|        |
+--------------------------------------------------------------------------+
     N           Min           Max        Median           Avg        Stddev
x  10      0.000256      0.000279     0.0002585     0.0002604 6.9474216e-06
+  10      0.000175      0.000206     0.0001785     0.0001808 9.0283504e-06
Difference at 95.0% confidence
 	-7.96e-05 +/- 7.56879e-06
 	-30.5684% +/- 2.9066%
 	(Student's t, pooled s = 8.05536e-06)

There is a nice decrease in setenv() time, about 30%, when using the
inlined version instead of strlen().

Would it be preferable to go ahead and use strlen() anyway, in
preparation for a faster strlen() in the future?  I would still use the
inlined version when counting characters while watching for an '='
character.  Or should that case also be changed to perform a strlen()
and then a strchr()?
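
To make the two options in that question concrete, here is a rough
sketch of both.  The function names are hypothetical and this is not
the actual __strleneq() code; it assumes that the wanted result is the
number of characters before the first '=' (or the whole length if there
is none), which may differ from what the real function returns.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical inlined variant: one pass over the string. */
static inline size_t
len_until_eq_inline(const char *s)
{
	const char *p = s;

	/* Stop at the first '=' or at the terminating NUL. */
	while (*p != '\0' && *p != '=')
		p++;
	return ((size_t)(p - s));
}

/* Hypothetical libc variant: strlen() and then strchr(), two passes. */
static size_t
len_until_eq_libc(const char *s)
{
	size_t len = strlen(s);
	const char *eq = strchr(s, '=');

	/* With no '=' present, fall back to the full length. */
	return (eq != NULL ? (size_t)(eq - s) : len);
}
```

Both return 4 for "PATH=/bin" and for "HOME".  The libc variant walks
the string twice when there is no '=', so any gain from it would have
to come from strlen() and strchr() themselves being faster than the
byte loop.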

Sean
-- 
sean-freebsd@farley.org


