Date: Tue, 21 Mar 1995 23:49:34 +1000
From: Bruce Evans <bde@zeta.org.au>
To: phk@ref.tfs.com, pst@shockwave.com
Cc: CVS-commiters@time.cdrom.com, bde@zeta.org.au, cvs-etc@time.cdrom.com, jkh@freebsd.org, rgrimes@gndrsh.aac.dev.com
Subject: Re: cvs commit: src/etc make.conf
Message-ID: <199503211349.XAA16990@godzilla.zeta.org.au>

> > We also need dynamic support for the i387 functions. -DHAVE_FPU is no
> > good because it can't be used for the distribution libraries. Something
> > like
> >
> > if (_have_i387)
> > result = _i387_pow(x, y);
> > else
> > result = __ieee754_pow(x, y);
> >
> > would add less time overhead than shared linkage.
>The extra test on every operation is bad.
Let's replace `pow' by `sin'. pow() isn't an i387 function and is too
complicated to synthesize from a few i387 functions.
To be precise, it costs 6 cycles on a 486 for the _i387_sin case and 5
cycles for the __ieee754_sin case (plus cache misses...)
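For concreteness, here is a minimal sketch of the tested dispatch with `sin' in place of `pow'. The names have_i387, hw_sin and sw_sin are hypothetical stand-ins for _have_i387, _i387_sin and __ieee754_sin, and both bodies are crude polynomial placeholders, not the real msun code. How the flag gets set at startup is also assumed.

```c
/* Hypothetical flag: nonzero if an FPU was detected at startup. */
static int have_i387 = 0;

/* Call counters, only here to show which path was taken. */
static int hw_calls = 0;
static int sw_calls = 0;

/* Crude placeholder: x - x^3/6 + x^5/120 (only sensible for small x). */
static double poly_sin(double x)
{
    double x2 = x * x;
    return x * (1.0 - x2 / 6.0 * (1.0 - x2 / 20.0));
}

/* Stand-in for _i387_sin (would use the hardware instruction). */
static double hw_sin(double x) { hw_calls++; return poly_sin(x); }

/* Stand-in for __ieee754_sin (the portable software version). */
static double sw_sin(double x) { sw_calls++; return poly_sin(x); }

/* The extra test is paid on every call. */
double dispatch_sin(double x)
{
    if (have_i387)
        return hw_sin(x);
    return sw_sin(x);
}
```

The branch is the whole per-call overhead being counted above.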
>Consider the following fragment or high-speed linkages with shared libraries
>instead (I don't know how fast or slow shared linkages are):
Shared linkage costs 4 cycles (1 wasted for a stupidly placed nop and much
more for the first call; plus cache misses...).
> static vec_pow = pow_init;
> pow (base, exp)
> {
> return (*vec_pow)(base, exp);
> }
This would only cost 2 cycles (plus cache misses...).
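A compilable sketch of the vectored version, again using `sin' rather than `pow'. The initializer replaces itself in the vector on the first call, so later calls pay only for the indirect call. The names vec_sin, sin_init, have_i387, hw_sin and sw_sin are hypothetical, and the function bodies are polynomial placeholders rather than real library code.

```c
typedef double (*sin_fn)(double);

/* Hypothetical flag: nonzero if an FPU was detected. */
static int have_i387 = 0;

/* Counts how often the initializer runs (expected: once). */
static int init_calls = 0;

/* Crude placeholder: x - x^3/6 + x^5/120 (only sensible for small x). */
static double poly_sin(double x)
{
    double x2 = x * x;
    return x * (1.0 - x2 / 6.0 * (1.0 - x2 / 20.0));
}

static double hw_sin(double x) { return poly_sin(x); } /* i387 stand-in */
static double sw_sin(double x) { return poly_sin(x); } /* software stand-in */

static double sin_init(double x);

/* The vector starts out pointing at the initializer. */
static sin_fn vec_sin = sin_init;

/* First call only: pick an implementation, patch the vector, forward. */
static double sin_init(double x)
{
    init_calls++;
    vec_sin = have_i387 ? hw_sin : sw_sin;
    return vec_sin(x);
}

/* Public entry point: one indirect call, no test. */
double my_sin(double x)
{
    return (*vec_sin)(x);
}
```

This trades the per-call test for an indirect call through a writable pointer, which works for static linkage too.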
Anyone for self-modifying code? :-) The shared library already uses it
to avoid these 2 cycles and it might not be too hard to get the shared
library to patch in the addresses of the i387-specific functions
instead of the generic ones. Unfortunately, this won't work for
statically linked programs.
The hardware sin() takes 193-279 cycles on a 486 and the msun wrappers
take many more (especially for shared libraries; position-independent
code costs about 10 cycles just for loading the global register), so
another 5 cycles would be hardly noticeable.
Bruce
