Date: Sun, 24 Dec 1995 03:24:42 +0100
From: Torbjorn Granlund <tege@matematik.su.se>
To: michael butler <imb@scgt.oz.au>
Cc: tege@matematik.su.se (Torbjorn Granlund), freebsd-hackers@freebsd.org
Subject: Re: Pentium bcopy
Message-ID: <199512240224.DAA26871@insanus.matematik.su.se>
In-Reply-To: Your message of "Sun, 24 Dec 1995 12:57:48 +1100." <199512240157.MAA09624@asstdc.scgt.oz.au>
> The reason that this is so much faster is that it uses the dual-ported
> cache in a near-optimal way.
> Does this approach demonstrate any significant penalties with less
> sophisticated cache architectures, for example 386DX or non-pipelined?
The approach has a significant penalty on a 386 (3x slower).
I suspect it might be a tad slower on a 486 with a write-through L1
cache, but it should help on 486 systems with a write-back cache.
I don't have any 486 systems, so I cannot tell for sure. Here is a simple
test program that you can use for timing tests:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/time.h>
#include <sys/resource.h>

/* The copy() routine being timed (the Pentium-optimized copy
   discussed in this thread) is linked in separately.  */
void copy (int *, int *, int);

/* Return user CPU time consumed so far, in milliseconds.  */
unsigned long
cputime (void)
{
  struct rusage rus;

  getrusage (RUSAGE_SELF, &rus);
  return rus.ru_utime.tv_sec * 1000 + rus.ru_utime.tv_usec / 1000;
}

#ifndef SIZE
#define SIZE 1000
#endif

int
main (void)
{
  int s[SIZE], d[SIZE];
  int i;
  long t0;

  t0 = cputime ();
  for (i = 0; i < 100000; i++)
    copy (d, s, SIZE);
  printf ("copy %ld\n", cputime () - t0);

  t0 = cputime ();
  for (i = 0; i < 100000; i++)
    memcpy (d, s, SIZE * sizeof (int));
  printf ("memcpy %ld\n", cputime () - t0);

  exit (0);
}
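The copy() routine itself is not included above; it stands for the
Pentium-optimized copy from earlier in the thread. If you just want the
harness to link and give a baseline, a plain C stand-in (hypothetical,
not the optimized routine) could look like this:

```c
/* Hypothetical word-at-a-time stand-in for the copy() routine the
   harness calls.  The count is in ints, matching copy (d, s, SIZE).  */
void
copy (int *dst, int *src, int n)
{
  int i;

  for (i = 0; i < n; i++)
    dst[i] = src[i];
}
```

Compile both files together (e.g. cc -O2 harness.c copy.c) and compare
the two printed times; substitute the optimized routine to measure it.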
