Date: Sun, 24 Dec 1995 03:24:42 +0100
From: Torbjorn Granlund <tege@matematik.su.se>
To: michael butler <imb@scgt.oz.au>
Cc: tege@matematik.su.se (Torbjorn Granlund), freebsd-hackers@freebsd.org
Subject: Re: Pentium bcopy
Message-ID: <199512240224.DAA26871@insanus.matematik.su.se>
In-Reply-To: Your message of "Sun, 24 Dec 1995 12:57:48 +1100." <199512240157.MAA09624@asstdc.scgt.oz.au>
  > > The reason that this is so much faster is that it uses the dual-ported
  > > cache in a near-optimal way.
  >
  > Does this approach demonstrate any significant penalties with less
  > sophisticated cache architectures, for example 386DX or non-pipelined ?

The approach has a significant penalty on a 386 (3x slower).  I suspect it
might be a tad slower on a 486 with a write-through L1 cache, but it should
help on 486 systems with a write-back cache.  I don't have any 486 systems,
so I cannot tell for sure.

Here is a simple test program that you can use for timing tests:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/time.h>
#include <sys/resource.h>

/* The copy routine under test; link in your own implementation.  */
void copy (int *, int *, int);

/* Return the user CPU time consumed so far, in milliseconds.  */
unsigned long
cputime (void)
{
  struct rusage rus;

  getrusage (RUSAGE_SELF, &rus);
  return rus.ru_utime.tv_sec * 1000 + rus.ru_utime.tv_usec / 1000;
}

#ifndef SIZE
#define SIZE 1000
#endif

int
main (void)
{
  int s[SIZE], d[SIZE];
  int i;
  long t0;

  t0 = cputime ();
  for (i = 0; i < 100000; i++)
    copy (d, s, SIZE);
  printf ("copy %ld\n", cputime () - t0);

  t0 = cputime ();
  for (i = 0; i < 100000; i++)
    memcpy (d, s, SIZE * sizeof (int));
  printf ("memcpy %ld\n", cputime () - t0);

  exit (0);
}
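The program above expects an external copy routine -- the Pentium bcopy
being benchmarked in this thread, which is not reproduced here.  If you
just want the program to link and run on another machine, a plain
word-at-a-time loop (my stand-in, not the routine under discussion) will
do as a baseline:

```c
/* Hypothetical baseline for the external copy() above: copies n ints
   from s to d, one word per iteration.  This is NOT the dual-ported
   cache bcopy from the thread, only a simple reference implementation
   to compare memcpy against.  */
void
copy (int *d, int *s, int n)
{
  int i;

  for (i = 0; i < n; i++)
    d[i] = s[i];
}
```

Compile the two files together, e.g. cc -O -DSIZE=25000 test.c copy.c,
to time working sets larger than the L1 cache as well as ones that fit.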