From owner-freebsd-bugs Fri Aug 30 20: 0:21 2002 Delivered-To: freebsd-bugs@hub.freebsd.org Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 3BB6F37B400 for ; Fri, 30 Aug 2002 20:00:14 -0700 (PDT) Received: from freefall.freebsd.org (freefall.FreeBSD.org [216.136.204.21]) by mx1.FreeBSD.org (Postfix) with ESMTP id A2FF443E6A for ; Fri, 30 Aug 2002 20:00:12 -0700 (PDT) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (gnats@localhost [127.0.0.1]) by freefall.freebsd.org (8.12.4/8.12.4) with ESMTP id g7V30AJU094328 for ; Fri, 30 Aug 2002 20:00:10 -0700 (PDT) (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.12.4/8.12.4/Submit) id g7V30AUT094327; Fri, 30 Aug 2002 20:00:10 -0700 (PDT) Date: Fri, 30 Aug 2002 20:00:10 -0700 (PDT) Message-Id: <200208310300.g7V30AUT094327@freefall.freebsd.org> To: freebsd-bugs@FreeBSD.org Cc: From: Mike Harding Subject: Re: misc/42167: du uses linear search for duplicate inodes - very slow! Reply-To: Mike Harding Sender: owner-freebsd-bugs@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.org The following reply was made to PR misc/42167; it has been noted by GNATS. From: Mike Harding To: freebsd-gnats-submit@FreeBSD.org, mvh@ix.netcom.com Cc: Subject: Re: misc/42167: du uses linear search for duplicate inodes - very slow! Date: Fri, 30 Aug 2002 19:52:57 -0700 Here's a unified diff, for perusal by the merely curious... the general approach is to replace the single unordered list with HASHSIZE unordered lists. The hashing function is a simple divide, which should be sufficient as inodes should be very evenly distributed (sequential, no?). The use of 'abs' below may be unnecessary if ino_t is guaranteed to be unsigned, I am unaware of convention here... - Mike H. bash-2.05b$ cvs diff -u -r 1.1 -r 1.3 du.c Index: du.c =================================================================== RCS file: /usr/local/cvsroot/du/du.c,v retrieving revision 1.1 retrieving revision 1.3 diff -u -r1.1 -r1.3 --- du.c 30 Aug 2002 14:42:41 -0000 1.1 +++ du.c 31 Aug 2002 00:20:39 -0000 1.3 @@ -314,30 +314,39 @@ ino_t inode; } ID; +#define HASHSIZE 256 int linkchk(p) FTSENT *p; { - static ID *files; - static int maxfiles, nfiles; + static ID **files; + static int *maxfiles, *nfiles; + static ID *filesph[HASHSIZE]; + static int maxfilesh[HASHSIZE], nfilesh[HASHSIZE]; ID *fp, *start; ino_t ino; dev_t dev; + int index; ino = p->fts_statp->st_ino; + index = abs( ino % HASHSIZE); + files = &filesph[index]; + maxfiles = &maxfilesh[index]; + nfiles = &nfilesh[index]; + dev = p->fts_statp->st_dev; - if ((start = files) != NULL) - for (fp = start + nfiles - 1; fp >= start; --fp) + if ((start = *files) != NULL) + for (fp = start + *nfiles - 1; fp >= start; --fp) if (ino == fp->inode && dev == fp->dev) return (1); - if (nfiles == maxfiles && (files = realloc((char *)files, - (u_int)(sizeof(ID) * (maxfiles += 128)))) == NULL) + if (*nfiles == *maxfiles && (*files = realloc((char *)*files, + (u_int)(sizeof(ID) * (*maxfiles += 128)))) == NULL) errx(1, "can't allocate memory"); - files[nfiles].inode = ino; - files[nfiles].dev = dev; - ++nfiles; + (*files)[*nfiles].inode = ino; + (*files)[*nfiles].dev = dev; + ++(*nfiles); return (0); } To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-bugs" in the body of the message