From owner-freebsd-fs  Thu Apr 24 13:40:10 1997
Return-Path:
Received: (from root@localhost)
	by hub.freebsd.org (8.8.5/8.8.5) id NAA29788
	for fs-outgoing; Thu, 24 Apr 1997 13:40:10 -0700 (PDT)
Received: from critter.dk.tfs.com (phk.cybercity.dk [195.8.129.17])
	by hub.freebsd.org (8.8.5/8.8.5) with ESMTP id NAA29717;
	Thu, 24 Apr 1997 13:39:53 -0700 (PDT)
Received: from critter (localhost [127.0.0.1])
	by critter.dk.tfs.com (8.8.5/8.8.5) with ESMTP id WAA01422;
	Thu, 24 Apr 1997 22:38:55 +0200 (CEST)
To: dg@root.com
cc: Poul-Henning Kamp, fs@freebsd.org
From: Poul-Henning Kamp
Subject: Re: the namei cache...
In-reply-to: Your message of "Thu, 24 Apr 1997 13:18:33 PDT."
	<199704242018.NAA10732@root.com>
Date: Thu, 24 Apr 1997 22:38:54 +0200
Message-ID: <1420.861914334@critter>
Sender: owner-fs@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

In message <199704242018.NAA10732@root.com>, David Greenman writes:

>>In message <199704241934.MAA10488@root.com>, David Greenman writes:
>>>>about 1/4th of the additions.  Getting rid of the div also wouldn't be
>>>>bad, but might not be worth it.
>>>
>>>   Not worth it?  At greater than 50 cycles, the divide is more expensive
>>>than the entire add loop.
>>
>>But losing the hash entirely is >even< faster, right? :-)
>
>   Maybe.  Here's the problem with your approach as I see it: On wcarchive,
>we have about 44,000 files in the namei cache.  The hash table size is
>approximately 44,000 entries large (rounded up or down to the nearest

Unlikely.  You have a 44,000 entry cache, but as far as I've been able to
make out you only have about 40,000 >valid< entries in it (if it's anything
like the data I have here).  I'm trying to get the last 10% back.

>prime...I forget).  This means that there is an average of about 1 file in
>each hash bucket...or very little to search through.

Unfortunately not so; finding a hash that good would be really, really
hard.  But I agree that we're in O(3) land.

>each hash bucket...or very little to search through.  Of course there are
>far fewer than 1 directory per file, so now if you change this so that all
>of the files for a directory are on one list, then you're going to have to
>do a lot more work to find it.  I think the average directory length on
>wcarchive is about 50 files or so, so this would be about 50 times more
>entries in the list compared to the old hash table scheme.

This is what I thought, but apparently not so.

Part of the problem seems to be that there are multiple names pointing at
the same directory ("foo" + N * "..") and that depletes your name cache.
With the current design it should probably be 1.5 .. 2 times bigger than
desiredvnodes.  I'm very reluctant to increase it when entries cost 64
bytes each, and when the data seems to indicate that 10% of them are stale
(but we don't know how to find them), so we keep recycling valid entries
instead.

Another thing that bothers me is the size.  The reason for the current
size of 64 bytes is the way malloc works: small allocations get rounded
up to power-of-two buckets, so anything between 33 and 64 bytes costs 64
bytes anyway.  In reality we would get very close to the same efficiency
from 48 bytes per entry.  I may look into changing the way we allocate
them; packed at 48 bytes we would fit 85 entries in a 4K page instead of
64, which would buy us 33% more entries in the same space.

David:  Can I get you to try one (mostly harmless) experiment on
wcarchive?  In vfs_cache.c, where it checks the number of namecache
entries against "desiredvnodes", could you try to use 2*desiredvnodes
(or something similar) instead, and measure the difference in cache hit
rate?
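Roughly this is what I mean -- a sketch only, the exact test in
cache_enter() may look a bit different in the tree you are running,
and the variable names here are from memory:

	/*
	 * Sketch: double the ceiling before we start recycling entries,
	 * so we can compare hit rates against the current limit.
	 */
	if (numcache < 2 * desiredvnodes) {	/* was: numcache < desiredvnodes */
		/* still room: allocate a fresh entry */
		ncp = (struct namecache *)
			malloc(sizeof(*ncp), M_CACHE, M_WAITOK);
		numcache++;
	} else {
		/* over the (doubled) limit: recycle from the LRU as before */
	}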
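And since I mentioned putting all the names of a directory on one list
above: the entry linkage I'm experimenting with has roughly the shape
below.  Again a sketch only; the field names are made up for illustration
and are not what will get committed.

	/* LIST_ENTRY() comes from <sys/queue.h> */
	struct namecache {
		LIST_ENTRY(namecache) nc_src;	/* hangs off the directory vnode */
		LIST_ENTRY(namecache) nc_dst;	/* hangs off the target vnode */
		struct	vnode *nc_dvp;		/* directory the name lives in */
		struct	vnode *nc_vp;		/* vnode it resolves to */
		u_char	nc_flag;
		u_char	nc_nlen;		/* length of the name */
		char	nc_name[1];		/* the name itself, variable length */
	};

A lookup then walks the (short) per-directory list instead of computing a
hash and walking a bucket, and a purge can unlink every name for a vnode
in one pass -- which is what should let us find and drop the ~10% stale
entries mentioned above.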
Poul-Henning

PS: I guess that I'm doing "the bad thing" here, I'm changing two
variables at the same time:

1. changing the linkage of namecache entries to permit complete removal
   (This increases the efficiency of the cache)
2. dropping the hash algorithm
   (This may make the cache faster/slower depending on POM)

But don't worry, until I have understood the impact, I'm not committing
anything :-)

--
Poul-Henning Kamp           | phk@FreeBSD.ORG         FreeBSD Core-team.
http://www.freebsd.org/~phk | phk@login.dknet.dk      Private mailbox.
whois: [PHK]                | phk@tfs.com             TRW Financial Systems, Inc.
Power and ignorance is a disgusting cocktail.