Date: Tue, 7 Jun 2005 17:57:02 +0100 (BST)
From: Robert Watson
To: Eric Anderson
Cc: Randy Bush, freebsd-fs@freebsd.org, FreeBSD Current, Julian Elischer
Subject: Re: you are in an fs with millions of small files
List-Id: Discussions about the use of FreeBSD-current

On Tue, 7 Jun 2005, Eric Anderson wrote:

> Julian Elischer wrote:
>> what. all in one directory?
>>
>> I've only had up to 500,000 files in one directory on FreeBSD.
>
> The only problems I've had with a directory with millions of files is
> things like ls -al with attempt to sort the list, but the list doesn't
> fit into memory. Access to the files is of course very snappy.

Ditto.
I regularly use directories with tens and hundreds of thousands of
entries as a result of manipulating very large folders with the Cyrus
server.  I run into the following two classes of problems:

- Some applications behave poorly with large trees.  ls(1) is the classic
  example -- sorting 150,000 strings is expensive, and should be avoided.
  It also requires holding all the strings in memory rather than
  continuing the iteration.  fts is bad about this, so many applications
  that use fts suffer from this.  With the sort issue, -f makes a big
  difference.

- Some operations become more expensive -- as directories grow, the cost
  of adding new entries increases.  You'll notice this fairly
  substantially if you untar a tar file with many entries in the same
  directory -- early on, the cost of inserting a new item is very low,
  but it rapidly slows down from thousands of inserts per second to
  hundreds or fewer.  I notice this if I restore a large Cyrus directory
  from backup.

- UFS_DIRHASH really helps with large directory performance by reducing
  the cost of lookup, but at the cost of memory.  Make sure the box has
  lots of memory.

All this said -- FreeBSD works really well for me with large file counts,
and I rarely hit the edge cases where there is a problem.  Most problems
are with applications, and when you are using more extreme file system
layouts, you are typically using applications customized for that, and so
they do the right things.

Robert N M Watson
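
To illustrate the ls(1) point above, here is a minimal sketch (in Python, not taken from any of the tools mentioned) of walking a large directory one entry at a time instead of collecting and sorting every name first.  The function names and the demo helper are hypothetical, chosen only for this example; os.scandir is the standard-library call, and it is assumed here to hand back entries lazily in whatever order the file system returns them, roughly what ls -f shows.

```python
import os
import tempfile
from itertools import islice


def count_entries(path):
    """Count directory entries one at a time, without building or sorting a list."""
    count = 0
    with os.scandir(path) as entries:
        for _ in entries:
            count += 1
    return count


def first_entries(path, limit):
    """Return up to `limit` names in directory order (unsorted)."""
    with os.scandir(path) as entries:
        return [entry.name for entry in islice(entries, limit)]


def make_demo_dir(n):
    """Hypothetical demo helper: create a temporary directory with n empty files."""
    path = tempfile.mkdtemp()
    for i in range(n):
        open(os.path.join(path, f"msg{i}"), "w").close()
    return path


if __name__ == "__main__":
    demo = make_demo_dir(1000)
    print(count_entries(demo))
    print(first_entries(demo, 5))
```

Memory use in count_entries stays flat no matter how many entries the directory holds, whereas sorted(os.listdir(path)) has to hold every name before it can print the first one.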