From owner-freebsd-current@FreeBSD.ORG Sat Sep 4 08:30:13 2004 Return-Path: Delivered-To: freebsd-current@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 15CF916A4CE for ; Sat, 4 Sep 2004 08:30:13 +0000 (GMT) Received: from gw.catspoiler.org (217-ip-163.nccn.net [209.79.217.163]) by mx1.FreeBSD.org (Postfix) with ESMTP id A619043D31 for ; Sat, 4 Sep 2004 08:30:12 +0000 (GMT) (envelope-from truckman@FreeBSD.org) Received: from FreeBSD.org (mousie.catspoiler.org [192.168.101.2]) by gw.catspoiler.org (8.12.11/8.12.11) with ESMTP id i848U4jN031148; Sat, 4 Sep 2004 01:30:09 -0700 (PDT) (envelope-from truckman@FreeBSD.org) Message-Id: <200409040830.i848U4jN031148@gw.catspoiler.org> Date: Sat, 4 Sep 2004 01:30:04 -0700 (PDT) From: Don Lewis To: scrappy@hub.org In-Reply-To: <20040904003222.J812@ganymede.hub.org> MIME-Version: 1.0 Content-Type: TEXT/plain; charset=us-ascii cc: freebsd-current@FreeBSD.org Subject: Re: what is fsck's "slowdown"? X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 04 Sep 2004 08:30:13 -0000 On 4 Sep, Marc G. Fournier wrote: > On Fri, 3 Sep 2004, Don Lewis wrote: >> Would the file system in question happen to be full of UNREF files that >> fsck is deleting? > > mostly 'ZERO LENGTH DIRECTORY' ... I'm pretty sure that I understand the problem now. During pass 4, fsck looks at each inode. It checks each inode in the FSTATE and DFOUND states to see if their link counts need to be adjusted. If the link count does not need to be adjusted, fsck checks to see if the inode is on the list of inodes whose initial link counts were zero, and if it finds the inode on this list, it clears the inode. The problem is that the zero length directories get added to this list if their initial link count is zero, and they also don't get removed from the list because they are in the DCLEAR state, so the list doesn't shrink. This bloats the list, which greatly slows down processing of normal files and directories. Deleting unreferenced files is not the biggest bottleneck, so reversing the order of the list isn't going to help much. Probably the biggest speedup could be gained by keeping the zero length directories off the list. In -CURRENT the pass 1 code in checkinode() looks like: if (DIP(dp, di_nlink) <= 0) { zlnp = (struct zlncnt *)malloc(sizeof *zlnp); if (zlnp == NULL) { pfatal("LINK COUNT TABLE OVERFLOW"); if (reply("CONTINUE") == 0) { ckfini(0); exit(EEXIT); } } else { zlnp->zlncnt = inumber; zlnp->next = zlnhead; zlnhead = zlnp; } } if (mode == IFDIR) { if (DIP(dp, di_size) == 0) inoinfo(inumber)->ino_state = DCLEAR; else inoinfo(inumber)->ino_state = DSTATE; cacheino(dp, inumber); countdirs++; } else inoinfo(inumber)->ino_state = FSTATE; Change the first 'if' statement to: if (DIP(dp, di_nlink) <= 0 && (mode != IFDIR || DIP(dp, di_size) != 0)) { The equivalent change in 4.x would be: if (dp->di_nlink <= 0 && (mode != IFDIR || dp->di_size != 0)) { The order of the two blocks of code could also be reversed so that the inode would only be added to the list if it was not in the DCLEAR state. I think this is a cleaner change. The next step would probably be to convert the zln list to a hash table. Using the lower bits of the inode number as the key would probably be reasonable. A final step would be to build the hash chain lists in the reverse order to speed deletion, but I suspect there is lower hanging fruit elsewhere.