Date:      Wed, 27 Dec 2006 20:00:07 +1300
From:      Mark Kirkwood <markir@paradise.net.nz>
To:        David Xu <davidxu@freebsd.org>
Cc:        freebsd-performance@freebsd.org, bde@zeta.org.au
Subject:   Re: Cached file read performance
Message-ID:  <459219F7.7080104@paradise.net.nz>
In-Reply-To: <458B3E0C.6090104@freebsd.org>
References:  <458B3651.8090601@paradise.net.nz> <458B3E0C.6090104@freebsd.org>

David Xu wrote:
> Mark Kirkwood wrote:
>>
>> (snippage) I used the attached program to read a cached 
>> 781MB file sequentially and randomly with a specified block size (see 
>> below). The conclusion I came to was that our (i.e FreeBSD) cached 
>> read performance (particularly for smaller block sizes) could perhaps 
>> be improved...

> 
> I suspect in such a test, memory copying speed will be a key factor,
> I don't have number to back up my idea, but I think Linux has lots
> of tweaks, such as using MMX instruction to copy data.
> 
> 

Thought it would be good to test this too:

I used the small program (see below) to test memcpy'ing 50000 chunks of 
8192 bytes each on both systems:

Gentoo (2.6.18):

$ ./memcpytest 8192 50000
50000 malloc in 0.5367s
50000 memcpy of 8192 byte blocks in 1.0887s


FreeBSD (6.2-PRE Nov 27):

$ ./memcpytest 8192 50000
50000 malloc in 0.1469s
50000 memcpy of 8192 byte blocks in 1.3599s


So we are a little slower on the memcpy (a factor of about 1.25). I guess 
this contributes a *little* to us being slower with the cached reads, but it 
is consistent with Bruce's findings that the cache design itself is the 
major factor!

One nice point to notice is that our malloc is substantially faster 
(~3.6 times for this test). These results were readily repeatable with 
minimal variation.
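
The ratios quoted above can be reproduced from the raw timings. The following is only a minimal sketch and not part of the test program; the helper names slowdown and throughput_mib_s are made up for illustration.

```c
#include <stddef.h>

/* Hypothetical helper: how many times slower slow_secs is than fast_secs. */
static double
slowdown(double slow_secs, double fast_secs)
{
    return slow_secs / fast_secs;
}

/*
 * Hypothetical helper: copy throughput in MiB/s for numblocks copies of
 * blocksz bytes each, completed in secs seconds.
 */
static double
throughput_mib_s(size_t blocksz, size_t numblocks, double secs)
{
    double total_mib = ((double)blocksz * (double)numblocks) /
        (1024.0 * 1024.0);

    return total_mib / secs;
}
```

With the numbers above, slowdown(1.3599, 1.0887) comes out at roughly 1.25 for memcpy and slowdown(0.5367, 0.1469) at roughly 3.65 for malloc. throughput_mib_s(8192, 50000, 1.0887) is about 359 MiB/s on the Gentoo box, and the same call with 1.3599 gives about 287 MiB/s on FreeBSD.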

Cheers

Mark

P.S. I've included the program in-line below, as the attachment stripper 
for this list seems to be real aggressive :-)


----------------------------------------------------------------------

/*
  * memcpytest.c: Attempt to measure performance of memcpy.
  */

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/time.h>
#include <string.h>
#include <unistd.h>


int
main(int argc, char **argv) {

     int             blocksz;                /* The block size to copy. */
     int             numblocks;              /* How many copies to do. */

     char            *buf;                   /* Input buffer. */
     typedef         struct  _BlockArray {
         char        *blockentry;
     } BlockArray;
     BlockArray      *blockarray;            /* Array of block entry ptrs. */
     int             i;
     struct          timeval starttp, endtp, elapsedtp;
     double          elapsed;


     if (argc != 3) {
         printf("usage %s blocksize num_blocks\n",
                 argv[0]);
         exit(1);
     } else {
         blocksz = atoi(argv[1]);
         numblocks = atoi(argv[2]);
     }

     /* Start timing setup. */
     gettimeofday(&starttp, NULL);


     /* Allocate source buffer and initialize to something trivial. */
     buf = (char *) malloc(blocksz);
     if (buf == NULL) {
         printf("out of memory initializing source buffer!\n");
         exit(2);
     }
     memset(buf, 1, blocksz);

     /* Allocate destination array of buffer pointers. */
     blockarray = malloc(numblocks * sizeof(*blockarray));
     if (blockarray == NULL) {
         printf("out of memory initializing destination array!\n");
         exit(2);
     }

     /* Allocate a block entry for each element of blockarray. */
     for (i = 0; i < numblocks; i++) {
         blockarray[i].blockentry = malloc(blocksz);
         if (blockarray[i].blockentry == NULL) {
             printf("out of memory initializing destination array contents!\n");
             exit(2);
         }
     }

     gettimeofday(&endtp, NULL);
     timersub(&endtp, &starttp, &elapsedtp);
     elapsed = (double)elapsedtp.tv_sec +
         (double)elapsedtp.tv_usec / 1000000.0;

     printf("%d malloc in %.4fs\n", numblocks, elapsed);


     /* Start timing copy now. */
     gettimeofday(&starttp, NULL);


     /* Perform the copy. */
     for (i = 0; i < numblocks; i++) {
         memcpy((void *)blockarray[i].blockentry, (void *)buf, blocksz);
     }

     gettimeofday(&endtp, NULL);
     timersub(&endtp, &starttp, &elapsedtp);
     elapsed = (double)elapsedtp.tv_sec +
         (double)elapsedtp.tv_usec / 1000000.0;


     printf("%d memcpy of %d byte blocks in %.4fs\n",
             numblocks, blocksz, elapsed);

     free(buf);
     for (i = 0; i < numblocks; i++) {
         free(blockarray[i].blockentry);
     }
     free(blockarray);

     exit(0);

}
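
The program above times each phase with gettimeofday(), which reports wall-clock time and so can be affected if the system clock is adjusted during a run. One possible alternative, which is not what the program above does, is to take the interval from POSIX clock_gettime() with CLOCK_MONOTONIC. A rough sketch follows; the helper names elapsed_seconds and now_monotonic are hypothetical.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* Ask for clock_gettime() under strict -std modes. */
#endif

#include <time.h>

/* Hypothetical helper: seconds elapsed between two timespec readings. */
static double
elapsed_seconds(const struct timespec *start, const struct timespec *end)
{
    return (double)(end->tv_sec - start->tv_sec) +
        (double)(end->tv_nsec - start->tv_nsec) / 1e9;
}

/* Hypothetical helper: read the monotonic clock; returns 0 on success. */
static int
now_monotonic(struct timespec *ts)
{
    return clock_gettime(CLOCK_MONOTONIC, ts);
}
```

In use, now_monotonic() would be called before and after the copy loop in place of the two gettimeofday() calls, and elapsed_seconds() would replace the timersub() arithmetic. On some older systems clock_gettime() may need linking with -lrt.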



