From: Nat Lanza
Date: 26 May 2001 16:06:59 -0400
To: Andrew Reilly
Cc: gjb@gbch.net, jandrese@mitre.org, float@firedrake.org, hackers@FreeBSD.ORG
Subject: Re: technical comparison

Andrew Reilly writes:

> Where in open(1) does it specify a limit on the number of files
> permissible in a directory? The closest that it comes, that I can
> see is:

Well, read(2) doesn't tell you not to do your IO one character at a
time, but that doesn't mean it's a good idea.

The point here is not interface definitions, it's efficiency. Nobody's
saying you shouldn't be _allowed_ to put thousands and thousands of
files in a directory if you like. They're just saying that you
shouldn't expect it to be fast. Similarly, you can read data one byte
at a time if you like, but you shouldn't expect that to be fast either.

Pointing to manpages and saying you weren't warned that a particular
approach is slow is a really weak defense. Do you expect cliffs to have
little "If you drive off this cliff, you will die" warning signs on
them?

If a documented part of the API simply did not work, then you'd have a
point. Instead, what we have is a case where a method of storing files
that most people reasonably expect to be slow is in fact slow.

The folks who've pointed out the /a/a/aardvark solution are right --
directory hashing is a well-known solution to this problem. It isn't a
hack at all. No matter what method you use for storing directories,
larger directories are going to be slower to use than smaller ones, and
hashing filenames fixes that.

--nat

-- 
nat lanza --------------------------- there are no whole truths;
magus@cs.cmu.edu -------------------- all truths are half-truths
http://www.cs.cmu.edu/~magus/ ------- -- alfred north whitehead
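
For readers unfamiliar with the /a/a/aardvark layout mentioned above, here is a minimal sketch of the idea. It is an illustration only, not code from this thread: the function names `hashed_path` and `store`, the two-level depth, and the `_` padding character are all assumptions made for the example. It uses the leading characters of the filename as the bucket, as in the aardvark example; a real deployment might bucket on a digest of the name instead to spread files more evenly.

```python
import os


def hashed_path(root, name, levels=2):
    """Map a filename to root/<c1>/<c2>/name, e.g. root/a/a/aardvark.

    Hypothetical helper: buckets on the first `levels` characters of the
    lowercased name, so no single directory has to hold every file.
    """
    # Pad short names so every file still gets `levels` subdirectories.
    key = name.lower().ljust(levels, "_")
    # Replace anything that is not a letter or digit so bucket names stay
    # ordinary directory names (no "." or other special entries).
    parts = [c if c.isalnum() else "_" for c in key[:levels]]
    return os.path.join(root, *parts, name)


def store(root, name, data):
    """Write `data` (bytes) under the bucketed path, creating buckets as needed."""
    path = hashed_path(root, name)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "wb") as f:
        f.write(data)
    return path
```

Calling `store("/var/spool/example", "aardvark", b"...")` would write to `/var/spool/example/a/a/aardvark` (the spool path is made up for the example). Lookups recompute the same path from the name, so each directory the filesystem has to search stays small.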