Date:      Sun, 24 Dec 1995 03:24:42 +0100
From:      Torbjorn Granlund <tege@matematik.su.se>
To:        michael butler <imb@scgt.oz.au>
Cc:        tege@matematik.su.se (Torbjorn Granlund), freebsd-hackers@freebsd.org
Subject:   Re: Pentium bcopy 
Message-ID:  <199512240224.DAA26871@insanus.matematik.su.se>
In-Reply-To: Your message of "Sun, 24 Dec 1995 12:57:48 +1100." <199512240157.MAA09624@asstdc.scgt.oz.au>

  > The reason that this is so much faster is that it uses the dual-ported
  > cache in a near-optimal way.

  Does this approach demonstrate any significant penalties with less
  sophisticated cache architectures, for example 386DX or non-pipelined ?

The approach has a significant penalty on a 386 (3x slower).

I suspect it might be a tad slower on a 486 with a write-through L1
cache.  But the approach should help on 486 systems with a write-back cache.

I don't have any 486 systems, so I cannot tell for sure.  Here is a simple
test program that you can use for timing tests:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/time.h>
#include <sys/resource.h>

/* Return user CPU time consumed so far, in milliseconds.  */
unsigned long
cputime ()
{
  struct rusage rus;

  getrusage (RUSAGE_SELF, &rus);
  return rus.ru_utime.tv_sec * 1000 + rus.ru_utime.tv_usec / 1000;
}

#ifndef SIZE
#define SIZE 1000
#endif

/* The copy routine under test; link the program against your
   implementation.  The third argument is the element count.  */
extern void copy ();

int
main ()
{
  int s[SIZE], d[SIZE];
  int i;
  long t0;

  t0 = cputime ();
  for (i = 0; i < 100000; i++)
    copy (d, s, SIZE);
  printf ("copy %ld\n", cputime () - t0);

  t0 = cputime ();
  for (i = 0; i < 100000; i++)
    memcpy (d, s, SIZE * sizeof (int));
  printf ("memcpy %ld\n", cputime () - t0);

  exit (0);
}
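
If you just want the harness to link and run, here is a trivial
word-by-word `copy' you can drop in as a baseline.  To be clear, this is
only a placeholder sketch, not the Pentium-optimized routine being
discussed in this thread; replace it with the real candidate when timing:

```c
#include <stddef.h>

/* Baseline word-by-word copy so the test program above links.
   NOT the dual-ported-cache bcopy under discussion -- just a
   simple reference point.  `n' is the number of ints to copy,
   matching the copy (d, s, SIZE) call in main.  */
void
copy (int *d, const int *s, size_t n)
{
  size_t i;

  for (i = 0; i < n; i++)
    d[i] = s[i];
}
```

Timing this against memcpy gives you a sanity check that the harness
itself behaves before you plug in the optimized loop.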
