Date: 30 May 2003 09:15:58 -0400 From: Lowell Gilbert <freebsd-questions-local@be-well.no-ip.com> To: freebsd-questions@freebsd.org Subject: Re: About reading and writing to files Message-ID: <443ciw60dt.fsf@be-well.ilk.org> In-Reply-To: <p05200f25bafcc5432a6a@[192.168.254.205]> References: <Pine.SOL.4.51.0305300301180.4396@herald.cc.purdue.edu> <p05200f25bafcc5432a6a@[192.168.254.205]>
Rich Morin <rdm@cfcl.com> writes:

> At 3:04 AM -0500 5/30/03, Bingrui Foo wrote:
> > I'm wondering, in FreeBSD, if I have a directory with 10,000 files, or
> > maybe even 100,000 files, each about 5 kb long: will reading and
> > writing to any one of these files in C be affected by the sheer number
> > of these files?  Will the access time be affected significantly?
> >
> > Just wondering, because I'm not sure whether I should put these data
> > in a database or just use files with unique names.
> >
> > Also, will separating the files into many directories help?
>
> Looking up .../x/12/34/56 can be done in logarithmic time (i.e., look
> up .../x/12, then .../x/12/34, then .../x/12/34/56); looking up
> .../y/123456 (unless some optimization has been added) will require a
> linear scan through the directory.  In short, don't go there...

An optimization *has* been added.  If you have

    options UFS_DIRHASH    # Improve performance on big directories

in your kernel (it's been in GENERIC for at least several months), then
you should get (in the limit) logarithmic time on *each* lookup.  And
there's a large extra term in the denominator, as well.

The size of the files doesn't matter, and the number of files shouldn't
matter in the range of 10,000 files.  Whether it matters at 100,000 I
can't guess offhand, but obviously it will depend on how often the
application is doing a lookup.