Date:       Sun, 17 Aug 1997 15:13:25 -0400 (EDT)
From:       Thomas David Rivers <ponds!rivers@dg-rtp.dg.com>
To:         ponds!root.com!dg, ponds!lambert.org!terry
Cc:         ponds!FreeBSD.ORG!hackers, ponds!lakes.dignus.com!rivers
Subject:    Re: More info on slow "rm" times with 2.2.1+.
Message-ID: <199708171913.PAA02003@lakes.dignus.com>
Terry Lambert writes:
> > > It would be trivial for me to verify any slow or fast times - all
> > > I've got to do is make a big directory... seems that ~300 files is
> > > enough to run into whatever the problem may be...
> >
> > How many files are in the directory isn't important.  What is
> > important is the size of the directory.  You can have a 20MB
> > directory and yet have only 100 files in it.  There is code to free
> > up unused space in directories, but it only works if the free space
> > is at the end.  If the directory is large, then it will take a large
> > amount of time to search through it.
>
> Specifically, if I have a directory with 20,000 entries, and I average
> 32 entries per block (512-byte directory blocks; each entry is 8 bytes
> of fixed data plus a 4-7 byte name, padded to 16 bytes total), then I
> am using 625 blocks.
>
> The opendir/readdir library routines operate by calling getdents()
> once per block to refill the user space "snapshot" of a directory,
> block by block.  That is, I call opendir() and it calls getdents()
> and grabs the first 32 entries.  I call readdir() and move forward
> until I've returned all 32, and then call getdents() again.
>
> Now you are linearly traversing the directory forward (using
> getdents()) to do this.
>
> Say you want to remove all entries.  You delete the first entry
> (ignore the vagaries of "." and ".." for a second).  This modifies
> the first directory block.
>
> Now you get halfway through.
>
> You must linearly traverse 312 directory blocks in the lookup for
> the entry you are going to delete.
>
> Now you are 3/4 of the way through.
>
> You must linearly traverse 466 directory blocks in the lookup for
> the entry you are going to delete.
>
> In other words, the total traversal cost grows quadratically: the
> number of blocks scanned per lookup climbs linearly with how far into
> the directory you are, divided by 32 (or whatever your average number
> of entries per block is, given your average file name length).
>
> Deleting the last entries is going to take the longest of all.
>
> There were some additional things added into the unlink path in
> respect of the lease code, and there were some additional things
> added into the getdents path in support of NFSv3.  And there is
> a *lot* of additional function call and contention overhead from
> the BSD4.4-Lite2 locking code, which is upside down (IMO, anyway).
>
> So yes, it's slower than it was, but no, it wasn't fast before; it
> had the same problems.
> ...

Ok - I can buy that - the last block will take the longest to delete
(since you have to skip over the previous, un-coalesced, entries...)

But I'm not sure that explains why deleting the last 300 files would
take (3*300) 900 seconds with 2.2.1 in this situation, when 2.1.7 took
only 3 seconds.

Just from listening to the disk and watching I/O, I can see that 2.2.1
is doing an *awful* lot of I/O that 2.1.7 didn't do... could there be
something in this locking?

	- Dave R. -
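
To make the 32-entries-per-block arithmetic above concrete, here is a
sketch of the 4.4BSD FFS on-disk directory entry.  The field layout
follows <ufs/ufs/dir.h> as I remember it; treat the exact declaration
(and the name direct_sketch) as an illustration, not the verbatim
header.

    #include <stdint.h>

    #define DIRBLKSIZ 512       /* FFS directory block size */

    /*
     * Sketch of the 4.4BSD on-disk directory entry (struct direct).
     * On disk, d_name holds only the NUL-terminated name, padded
     * to a 4-byte boundary.
     */
    struct direct_sketch {
        uint32_t d_ino;         /* inode number of entry */
        uint16_t d_reclen;      /* record length, including padding */
        uint8_t  d_type;        /* file type */
        uint8_t  d_namlen;      /* length of the name string */
        char     d_name[8];     /* a 4-7 char name + NUL pads to 8 */
    };

    /*
     * 8 bytes of fixed data + 8 bytes of padded name = 16 bytes per
     * entry, so DIRBLKSIZ / 16 = 32 entries per block, and 20,000
     * entries occupy 20,000 / 32 = 625 blocks, as above.
     */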
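
For reference, the user-side pattern rm(1) boils down to is roughly the
loop below.  This is my own minimal sketch (remove_all is a hypothetical
helper, not the actual rm source), but it shows where the per-entry cost
lands on the linear-search model described above: every unlink() makes
the kernel walk the directory blocks to find the name.

    #include <dirent.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    /* Remove every entry in "path"; returns 0 on success, -1 on error. */
    int
    remove_all(const char *path)
    {
        char name[1024];
        DIR *dirp;
        struct dirent *dp;

        if ((dirp = opendir(path)) == NULL) /* primes the first getdents() */
            return (-1);
        while ((dp = readdir(dirp)) != NULL) {
            if (strcmp(dp->d_name, ".") == 0 ||
                strcmp(dp->d_name, "..") == 0)
                continue;
            snprintf(name, sizeof(name), "%s/%s", path, dp->d_name);
            /* Each unlink() is a linear search of the directory blocks. */
            if (unlink(name) == -1)
                perror(name);
        }
        closedir(dirp);
        return (0);
    }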
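
And a back-of-the-envelope tally of the quadratic cost Terry describes:
deleting n entries front to back, where deleting entry i first scans
roughly i/32 blocks (assuming, as above, 32 entries per block and no
coalescing ahead of the search point).

    #include <stdio.h>

    int
    main(void)
    {
        long n = 20000;          /* entries in the directory */
        long per_block = 32;     /* entries per 512-byte block */
        long i, total = 0;

        for (i = 0; i < n; i++)
            total += i / per_block;  /* blocks skipped before entry i */

        /* Sum of a linear progression: roughly n*n / (2 * per_block). */
        printf("%ld entries: %ld block scans (~%ld)\n",
            n, total, n * n / (2 * per_block));
        return (0);
    }

Halfway through that run each unlink is scanning ~312 blocks, and at
3/4 of the way roughly 468, which matches the figures quoted above.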