From: David Schultz
To: Robert Watson
Cc: Greg 'groggy' Lehey, freebsd-fs@FreeBSD.ORG, Darcy Buskermolen
Date: Thu, 22 Jan 2004 01:10:03 -0800
Subject: Re: 32k directory limit
Message-ID: <20040122091003.GA7231@VARK.homeunix.com>
References: <200401210832.52068.darcy@wavefire.com>

On Wed, Jan 21, 2004, Robert Watson wrote:
> On Wed, 21 Jan 2004, Darcy Buskermolen wrote:
>
> > Problem is some brain dead software (to which I don't have source)
> > creating these dirs all under one dir and not nesting them in a way to
> > ensure that the 32k number isn't broken.
>
> The largest number of files (not directories) I have in a single directory
> appears to be about 1.1 million.
> Other than the link count, there's no
> real reason there couldn't be more, although you might well bump into
> other scalability limits (I have to remember not to let ls sort the
> directory listing for that directory, for example).

The fact that UFS lacks a hash-based on-disk directory format
limits scalability.  Even though lookups are optimized via hashing
once the directory has been read into memory, it is still
necessary to transfer the entire directory from disk (or about
half of it if all lookups are successful ones).

If I recall correctly, some Linux folks addressed this problem in
ext2 with a hackish but backward-compatible trick, documented in
Proc. FREENIX ('98 or 2000, IIRC).  Several other filesystems use
a hash-based on-disk format natively.
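
To make the cost argument concrete, here is a toy model in Python.
It is only a sketch and not UFS code: the block size, the file names,
the use of CRC32 as the hash, and all four helper functions are
assumptions made up for illustration.  It counts how many directory
blocks a lookup has to read from disk when entries are stored in
creation order, versus a layout where the name's hash selects the
block(s) to read.

```python
import zlib

ENTRIES_PER_BLOCK = 32  # hypothetical number of entries per directory block


def build_linear(names):
    """Pack names into fixed-size blocks in creation order (unsorted directory)."""
    return [names[i:i + ENTRIES_PER_BLOCK]
            for i in range(0, len(names), ENTRIES_PER_BLOCK)]


def linear_lookup(blocks, name):
    """Scan blocks front to back; return (found, blocks_read)."""
    for blocks_read, block in enumerate(blocks, start=1):
        if name in block:
            return True, blocks_read
    # A failed lookup has to read every block of the directory.
    return False, len(blocks)


def build_hashed(names, nbuckets):
    """Toy hash-based layout: the hash of the name picks the bucket."""
    buckets = [[] for _ in range(nbuckets)]
    for name in names:
        buckets[zlib.crc32(name.encode()) % nbuckets].append(name)
    return buckets


def hashed_lookup(buckets, name):
    """Read only the selected bucket; return (found, blocks_read).

    The cost charged is the whole bucket (at least one block), which is
    an upper bound on what a real implementation would read.
    """
    bucket = buckets[zlib.crc32(name.encode()) % len(buckets)]
    blocks_read = max(1, -(-len(bucket) // ENTRIES_PER_BLOCK))  # ceiling division
    return name in bucket, blocks_read


if __name__ == "__main__":
    names = [f"file{i:05d}" for i in range(3200)]
    blocks = build_linear(names)
    buckets = build_hashed(names, len(blocks))

    avg = sum(linear_lookup(blocks, n)[1] for n in names) / len(names)
    print("linear, successful lookups, average blocks read:", avg)
    print("linear, failed lookup, blocks read:", linear_lookup(blocks, "nope")[1])
    print("hashed, failed lookup, blocks read:", hashed_lookup(buckets, "nope")[1])
```

In the linear layout a failed lookup reads every block and a
successful one reads about half of them on average; in the hashed
layout the cost depends on the size of one bucket rather than on the
size of the whole directory.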