Date: Fri, 04 Apr 2003 14:52:23 -0800 From: Terry Lambert <tlambert2@mindspring.com> To: "Andrew R. Reiter" <arr@watson.org> Cc: freebsd-smp@freebsd.org Subject: Re: namecache Message-ID: <3E8E0CA7.E21BA871@mindspring.com> References: <20030404153953.S62312@fledge.watson.org> <3E8E001C.B39BF3ED@mindspring.com> <20030404170317.K63340@fledge.watson.org>
next in thread | previous in thread | raw e-mail | index | archive | help
"Andrew R. Reiter" wrote: > On Fri, 4 Apr 2003, Terry Lambert wrote: > :"Andrew R. Reiter" wrote: > :> Awhile back I shot a patch to this list for review that locked down some > :> stuff in the namecache arena. I feel the patch was OK, but not great. > :> > :> What I want to do is something w/o any locks, but I dont feel I have a > :> good enough grasp of that code base to do something really well since I > :> know it would most likely require a rewrite of that code. I think the rewrite will be necessary. It should be very easy, just knowing the Solaris data structures. For both the Solaris and SVR4 DNLC, the one thing I would caution you against is a *really* blind reimplementation. The SVR4 code, with which I am very familiar, did not allow entries with a NULL vp for the lookup target. In point of face, you should allow this. The reason is that a negative cache is twice as valuable as a postive cache, for any list of data that must be linearly iterated. The reason is that, on a hit, you iterate, on average 50% of the entries before you find the one you are looking for; on a miss, you must iterate all 100% of the entries, to make sure they are not the one. As a common example, as you look for a file before you go to create it; when this happens, you iterate all directory entries in the directory in which the create is taking place, and you do not find the item. So you go to create it; the insertion algorithm *again* iterates it, because it does not know if it is a create or replace operation. If you had a cache entry, it would still need to iterate the entries, but it could avoid the compare operation that would otherwise be necessary, and only look for free space to put the entry. A possible optimization would cache the offset of a first-fit entry in the NULL vp valued DNLC entry. This gives you a search start point for the insert which is "likely" to still be valid, and you can search from that point onward, because the earlier entries are likely to remain "too small". Maybe it should be bounded against "cache poisoning" (e.g. you should maybe limit the total number of negative entries to some maximum less than the size of the whole cache). Not sure this is valid, since you could "poison" with positive entries, too. Another possible optimization, which may be no good on FreeBSD is to add entries on traversal. This optimization is best suited to Samba, Appletalk, and similar servers, which expect the directory entry to be identical to the inode, so you have to have the inode information available immediately following knowing the directory entry is there. You'd only do this in the VOP_READDIR iteration case, not everywhere you iterate for just one file lookup, but it will gain you at least 15% performance on diretory operations on a file server for PC's (such servers iterate, then stat each file to get stat data to return both stat data and names as a single block of "file entry information"). > I read the stuff in the Solaris book (I've got that one), but I've not > read the section in Magic Garden (despite owning it as well). > > I'll read further. Really, either one is enough to write the implementation. The SVR4 book has API definitions, though, so it might be useful to you to use both books, if you are having trouble, and don't want to use the BSD namecache API for some reason. > Thanks! You're welcome. -- Terry
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?3E8E0CA7.E21BA871>