Date:      Sun, 17 Aug 1997 15:13:25 -0400 (EDT)
From:      Thomas David Rivers <ponds!rivers@dg-rtp.dg.com>
To:        ponds!root.com!dg, ponds!lambert.org!terry
Cc:        ponds!FreeBSD.ORG!hackers, ponds!lakes.dignus.com!rivers
Subject:   Re: More info on slow "rm" times with 2.2.1+.
Message-ID:  <199708171913.PAA02003@lakes.dignus.com>


Terry Lambert writes:
> 
> > > It would be trivial for me to verify any slow or fast times - all
> > >I've got to do is make a big directory... seems that ~300 files is
> > >enough to run into whatever the problem may be...
> > 
> >    How many files are in the directory isn't important. What is important is
> > the size of the directory. You can have a 20MB directory and yet have only a
> > 100 files in it. There is code to free up unused space in directories, but
> > it only works if the free space is at the end. If the directory is large,
> > then it will take a large amount of time to search through it.
> 
> Specifically, if I have a directory with 20,000 entries, and I average
> 32 entries per block (512-byte directory blocks at 16 bytes per entry:
> 8 bytes of fixed entry data plus a 4-7 byte name, padded), then I am
> using 20,000/32 = 625 blocks.
> 
> The opendir/readdir library routines operate by calling getdents()
> once per block to refill the user-space "snapshot" of a directory
> block, on a block-by-block basis.  That is, I call opendir and it
> calls getdents() and grabs the first 32 entries.  I call readdir
> and move forward until I've returned all 32, and then call getdents()
> again.
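
A minimal user-level sketch of that loop, for concreteness (standard
dirent API; error handling abbreviated -- an illustration, not the
actual rm source):

#include <dirent.h>
#include <stdio.h>

/*
 * readdir() hands back one entry at a time, refilling its buffer
 * via getdents() roughly once per directory block behind the scenes.
 */
int
list_dir(const char *path)
{
	DIR *dp = opendir(path);
	struct dirent *de;

	if (dp == NULL)
		return (-1);
	while ((de = readdir(dp)) != NULL)
		printf("%s\n", de->d_name);
	return (closedir(dp));
}
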
> 
> Now you are linearly traversing the directory forward (using getdents())
> to do this.
> 
> Say you want to remove all entries.  You delete the first entry
> (ignore the vagaries of "." and ".." for a second).  This modifies
> the first directory block.
> 
> Now you get halfway through.
> 
> You must linearly traverse 312 directory blocks in the lookup for the
> entry you are going to delete.
> 
> Now you are 3/4 of the way through.
> 
> You must linearly traverse roughly 469 directory blocks in the lookup
> for the entry you are going to delete.
> 
> In other words, the total traversal cost is going to increase
> quadratically: each delete scans a linearly growing number of blocks,
> so the sum over all deletes grows as the square of the entry count
> divided by 32 (or whatever your average number of entries per block
> is, given your average file name length).
> 
> Deleting the last entries is going to take the longest of all.
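
To put rough numbers on that: if unlinking the k-th entry scans about
k/32 blocks, then deleting all 20,000 entries in order scans on the
order of

	sum(k/32, k = 1..20000)  ~=  20000^2 / (2 * 32)  ~=  6,250,000

block scans in total -- a back-of-the-envelope figure under the
32-entries-per-block assumption above, not a measured number.
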
> 
> There were some additional things added into the unlink path to
> accommodate the lease code, and there were some additional things
> added into the getdents path in support of NFSv3.  And there is
> a *lot* of additional function call and contention overhead from
> the 4.4BSD-Lite2 locking code, which is upside down (IMO, anyway).
> 
> So yes, it's slower than it was, but no, it wasn't fast before; it
> had the same problems.
> 

 ...

 Ok - I can buy that - the last block will take the longest to delete
(since you have to skip over all the previous, un-coalesced entries...)

 But I'm not sure that explains why deleting the last 300 files took
(300 * 3 =) 900 seconds with 2.2.1 in this situation, when 2.1.7 took
only 3 seconds.  Just from listening to the disk and watching I/O, I
can see that 2.2.1 is doing an *awful* lot of I/O that 2.1.7 didn't
do... could there be something in this locking?
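
For anyone who wants to reproduce the measurement, here's the sort of
test program I have in mind -- a hypothetical, untested sketch that
creates N files and then times the unlink pass, nothing more:

#include <sys/time.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

#define NFILES 20000

/*
 * Create NFILES empty files in the current directory, then time
 * unlinking them in creation order and report the elapsed seconds.
 */
int
main(void)
{
	struct timeval t0, t1;
	char name[32];
	int fd, i;

	for (i = 0; i < NFILES; i++) {
		snprintf(name, sizeof(name), "f%05d", i);
		if ((fd = open(name, O_CREAT | O_WRONLY, 0644)) >= 0)
			close(fd);
	}
	gettimeofday(&t0, NULL);
	for (i = 0; i < NFILES; i++) {
		snprintf(name, sizeof(name), "f%05d", i);
		unlink(name);
	}
	gettimeofday(&t1, NULL);
	printf("unlink pass: %ld seconds\n",
	    (long)(t1.tv_sec - t0.tv_sec));
	return (0);
}

Running it on the same filesystem under 2.1.7 and 2.2.1 should show
whether the difference really is in the unlink path.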

	- Dave R. -


