Skip site navigation (1)Skip section navigation (2)
Date:      Sat, 4 Sep 2004 01:30:04 -0700 (PDT)
From:      Don Lewis <truckman@FreeBSD.org>
To:        scrappy@hub.org
Cc:        freebsd-current@FreeBSD.org
Subject:   Re: what is fsck's "slowdown"?
Message-ID:  <200409040830.i848U4jN031148@gw.catspoiler.org>
In-Reply-To: <20040904003222.J812@ganymede.hub.org>

next in thread | previous in thread | raw e-mail | index | archive | help
On  4 Sep, Marc G. Fournier wrote:
> On Fri, 3 Sep 2004, Don Lewis wrote:

>> Would the file system in question happen to be full of UNREF files that
>> fsck is deleting?
> 
> mostly 'ZERO LENGTH DIRECTORY' ...

I'm pretty sure that I understand the problem now.  During pass 4, fsck
looks at each inode.  It checks each inode in the FSTATE and DFOUND
states to see if their link counts need to be adjusted.  If the link
count does not need to be adjusted, fsck checks to see if the inode is
on the list of inodes whose initial link counts were zero, and if it
finds the inode on this list, it clears the inode.

The problem is that the zero length directories get added to this list
if their initial link count is zero, and they also don't get removed
from the list because they are in the DCLEAR state, so the list doesn't
shrink.  This bloats the list, which greatly slows down processing of
normal files and directories.

Deleting unreferenced files is not the biggest bottleneck, so reversing
the order of the list isn't going to help much.  Probably the biggest
speedup could be gained by keeping the zero length directories off the
list.

In -CURRENT the pass 1 code in checkinode() looks like:

        if (DIP(dp, di_nlink) <= 0) {
                zlnp = (struct zlncnt *)malloc(sizeof *zlnp);
                if (zlnp == NULL) {
                        pfatal("LINK COUNT TABLE OVERFLOW");
                        if (reply("CONTINUE") == 0) {
                                ckfini(0);
                                exit(EEXIT);
                        }
                } else {
                        zlnp->zlncnt = inumber;
                        zlnp->next = zlnhead;
                        zlnhead = zlnp;
                }
        }
        if (mode == IFDIR) {
                if (DIP(dp, di_size) == 0)
                        inoinfo(inumber)->ino_state = DCLEAR;
                else
                        inoinfo(inumber)->ino_state = DSTATE;
                cacheino(dp, inumber);
                countdirs++;
        } else
                inoinfo(inumber)->ino_state = FSTATE;

Change the first 'if' statement to:

	if (DIP(dp, di_nlink) <= 0 && (mode != IFDIR ||
	    DIP(dp, di_size) != 0)) {

The equivalent change in 4.x would be:

	if (dp->di_nlink <= 0 && (mode != IFDIR || dp->di_size != 0)) {

The order of the two blocks of code could also be reversed so that the
inode would only be added to the list if it was not in the DCLEAR state.
I think this is a cleaner change.

The next step would probably be to convert the zln list to a hash table.
Using the lower bits of the inode number as the key would probably be
reasonable.

A final step would be to build the hash chain lists in the reverse order
to speed deletion, but I suspect there is lower hanging fruit elsewhere.



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?200409040830.i848U4jN031148>