Date:      Sat, 6 May 2000 14:48:57 -0700 (PDT)
From:      Jean-Marc Zucconi <jmz@FreeBSD.org>
To:        dnelson@emsphone.com
Cc:        current@FreeBSD.ORG
Subject:   Re: Can someone explain this?
Message-ID:  <200005062148.OAA18135@freefall.freebsd.org>
In-Reply-To: <20000506002203.A6363@dan.emsphone.com> (message from Dan Nelson on Sat, 6 May 2000 00:22:03 -0500)
References:  <200005060030.RAA11795@freefall.freebsd.org> <20000506002203.A6363@dan.emsphone.com>

>>>>> Dan Nelson writes:

 > In the last episode (May 05), Jean-Marc Zucconi said:
 >> Here is something I don't understand:
 >> 
 >> $ sh -c  '/usr/bin/time  ./a.out'
 >> 2.40 real         2.38 user         0.01 sys
 >> $ /usr/bin/time  ./a.out
 >> 7.19 real         7.19 user         0.00 sys
 >> 
 >> The same program is 3 times slower in the second case. The effect is
 >> systematic but depends on the program being run. I have seen inverse
 >> behavior with another program. Using time -l, I note that this seems
 >> to be related to a higher value of 'involuntary context switches'
 >> (3 times more switches in the slower case).

 > It has to do with your stack.  Calling the program via /bin/sh sets up
 > your environment differently, so your program's stack starts at a
 > different place.  Try running this:

 > #include <stdio.h>
 >
 > int main (int argc, char **argv)
 > {
 >     int i;
 >     double x=2, y=2, z=2;
 >     printf ("%p\n", (void *)&i);
 >     for (i = 0; i < 10000000; i++) z = y*x;
 >     return 0;
 > }

 > Run this commandline:

 > STR= ; export STR ; while : ; do STR=z$STR ; /usr/bin/time ./a.out ; done

 > And watch your execution time flip flop every 4 runs.

OK. The effect is indeed very clear.

 > Here are some bits from the gcc infopage explaining your options if you
 > want consistent speed from programs using doubles:

 > `-mpreferred-stack-boundary=NUM'
 >      Attempt to keep the stack boundary aligned to a 2 raised to NUM
 >      byte boundary.  If `-mpreferred-stack-boundary' is not specified,
 >      the default is 4 (16 bytes or 128 bits).
 >      The stack is required to be aligned on a 4 byte boundary.  On
 >      Pentium and PentiumPro, `double' and `long double' values should be
 >      aligned to an 8 byte boundary (see `-malign-double') or suffer
 >      significant run time performance penalties.  On Pentium III, the
 >      Streaming SIMD Extension (SSE) data type `__m128' suffers similar
 >      penalties if it is not 16 byte aligned.

 > `-mno-align-double'
 >      Control whether GCC aligns `double', `long double', and `long
 >      long' variables on a two word boundary or a one word boundary.
 >      Aligning `double' variables on a two word boundary will produce
 >      code that runs somewhat faster on a `Pentium' at the expense of
 >      more memory.

 >      *Warning:* if you use the `-malign-double' switch, structures
 >      containing the above types will be aligned differently than the
 >      published application binary interface specifications for the 386.

Unfortunately, the -mpreferred-stack-boundary=NUM option does not
solve the problem :-( I still get the penalty in 50% of the cases.

Jean-Marc

-- 
 Jean-Marc Zucconi                    PGP Key: finger jmz@FreeBSD.org

