From: Lowell Gilbert
To: freebsd-questions@freebsd.org
Subject: Re: About reading and writing to files
Date: 30 May 2003 09:15:58 -0400
Message-ID: <443ciw60dt.fsf@be-well.ilk.org>

Rich Morin writes:

> At 3:04 AM -0500 5/30/03, Bingrui Foo wrote:
> >I'm wondering in freeBSD, if I have a directory with 10,000 files, or
> >maybe even 100,000 files, each about 5 kb long.
> >Wondering will reading and writing to any one of these files in C be
> >affected by the sheer number of these files? Will the access time be
> >affected significantly?
> >
> >Just wondering because not sure whether I should put these data in a
> >database or just use files with unique names.
> >
> >Also will separating the files into many directories help?
>
> Looking up .../x/12/34/56 can be done in logarithmic time (i.e., look up
> .../x/12, then .../x/12/34, then .../x/12/34/56); looking up
> .../y/123456 (unless some optimization has been added) will require a
> linear scan through the directory. In short, don't go there...

An optimization *has* been added. If you have

  options 	UFS_DIRHASH		#Improve performance on big directories

in your kernel (it's been in GENERIC for at least several months) then
you should get (in the limit) logarithmic time on *each* lookup. And
there's a large extra term in the denominator, as well.

The size of the files doesn't matter, and the number of files
shouldn't matter in the range of 10,000 files. Whether it matters on
100,000 I can't guess offhand, but obviously it will depend on how
often the application is doing a lookup.
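
If the application does end up splitting the files across directories,
the .../x/12/34/56 layout Rich describes could be produced by something
like the following. This is only a minimal sketch: the function name
shard_path, its argument order, and the fixed two-character split are
assumptions made for illustration and do not come from the thread.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not from the thread): map a flat file name such
 * as "123456" under directory `root` to a nested path "root/12/34/56"
 * by splitting the name into two-character components.
 * Returns 0 on success, -1 if the name is empty or `out` is too small.
 */
int
shard_path(const char *root, const char *name, char *out, size_t outlen)
{
	size_t len, i, used;
	int n;

	assert(root != NULL && name != NULL && out != NULL);

	len = strlen(name);
	if (len == 0)
		return (-1);

	/* Start the path with the root directory. */
	n = snprintf(out, outlen, "%s", root);
	if (n < 0 || (size_t)n >= outlen)
		return (-1);
	used = (size_t)n;

	/* Append one "/xx" component per two characters of the name. */
	for (i = 0; i < len; i += 2) {
		size_t chunk = (len - i >= 2) ? 2 : 1;

		n = snprintf(out + used, outlen - used, "/%.*s",
		    (int)chunk, name + i);
		if (n < 0 || (size_t)n >= outlen - used)
			return (-1);
		used += (size_t)n;
	}
	return (0);
}
```

Calling shard_path("x", "123456", buf, sizeof(buf)) fills buf with
"x/12/34/56". The intermediate directories still have to be created
(for example with mkdir(2)) before the path can be opened. With
UFS_DIRHASH in the kernel, this extra layering is probably unnecessary
in the 10,000-file range discussed above.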