Skip site navigation (1)Skip section navigation (2)
Date:      Fri, 30 Aug 2002 20:00:10 -0700 (PDT)
From:      Mike Harding <mvh@netcom.com>
To:        freebsd-bugs@FreeBSD.org
Subject:   Re: misc/42167: du uses linear search for duplicate inodes - very slow!
Message-ID:  <200208310300.g7V30AUT094327@freefall.freebsd.org>

next in thread | raw e-mail | index | archive | help
The following reply was made to PR misc/42167; it has been noted by GNATS.

From: Mike Harding <mvh@netcom.com>
To: freebsd-gnats-submit@FreeBSD.org, mvh@ix.netcom.com
Cc:  
Subject: Re: misc/42167: du uses linear search for duplicate inodes - very
 slow!
Date: Fri, 30 Aug 2002 19:52:57 -0700

 Here's a unified diff, for perusal by the merely curious... the general 
 approach is to replace the single unordered list with HASHSIZE unordered 
 lists.  The hashing function is a simple divide, which should be 
 sufficient as inodes should be very evenly distributed (sequential, 
 no?).  The use of 'abs' below may be unnecessary if ino_t is guaranteed 
 to be unsigned, I am unaware of convention here...
 
 - Mike H.
 
 bash-2.05b$ cvs diff -u -r 1.1 -r 1.3 du.c
 Index: du.c
 ===================================================================
 RCS file: /usr/local/cvsroot/du/du.c,v
 retrieving revision 1.1
 retrieving revision 1.3
 diff -u -r1.1 -r1.3
 --- du.c    30 Aug 2002 14:42:41 -0000    1.1
 +++ du.c    31 Aug 2002 00:20:39 -0000    1.3
 @@ -314,30 +314,39 @@
      ino_t    inode;
  } ID;
  
 +#define HASHSIZE 256
  
  int
  linkchk(p)
      FTSENT *p;
  {
 -    static ID *files;
 -    static int maxfiles, nfiles;
 +    static ID **files;
 +    static int *maxfiles, *nfiles;
 +    static ID *filesph[HASHSIZE];
 +    static int maxfilesh[HASHSIZE], nfilesh[HASHSIZE];
      ID *fp, *start;
      ino_t ino;
      dev_t dev;
 +    int index;
  
      ino = p->fts_statp->st_ino;
 +    index = abs( ino % HASHSIZE);
 +    files = &filesph[index];
 +    maxfiles = &maxfilesh[index];
 +    nfiles = &nfilesh[index];
 +
      dev = p->fts_statp->st_dev;
 -    if ((start = files) != NULL)
 -        for (fp = start + nfiles - 1; fp >= start; --fp)
 +    if ((start = *files) != NULL)
 +        for (fp = start + *nfiles - 1; fp >= start; --fp)
              if (ino == fp->inode && dev == fp->dev)
                  return (1);
  
 -    if (nfiles == maxfiles && (files = realloc((char *)files,
 -        (u_int)(sizeof(ID) * (maxfiles += 128)))) == NULL)
 +    if (*nfiles == *maxfiles && (*files = realloc((char *)*files,
 +        (u_int)(sizeof(ID) * (*maxfiles += 128)))) == NULL)
          errx(1, "can't allocate memory");
 -    files[nfiles].inode = ino;
 -    files[nfiles].dev = dev;
 -    ++nfiles;
 +    (*files)[*nfiles].inode = ino;
 +    (*files)[*nfiles].dev = dev;
 +    ++(*nfiles);
      return (0);
  }
  
 

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-bugs" in the body of the message




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?200208310300.g7V30AUT094327>