Date:      Mon, 6 Nov 2000 18:14:41 -0600 (CST)
From:      Mike Meyer <mwm@mired.org>
To:        shashi@shift-f1.com
Cc:        Mike Meyer <mwm@mired.org>, questions@freebsd.org
Subject:   Re: filesystem question
Message-ID:  <14855.18801.662896.498884@guru.mired.org>
In-Reply-To: <20001106221740.16951.qmail@web6102.mail.yahoo.com>
References:  <20001106221740.16951.qmail@web6102.mail.yahoo.com>

Shashi Joshi <shashi_kant_joshi@yahoo.com> types:
> Thanks Mike for your reply.
> I think I made a slight oversight in my question.
> 
> --- Mike Meyer <mwm@mired.org> wrote:
> > Shashi Joshi <shashi_kant_joshi@yahoo.com> types:
> > > Hi,
> > > What is the limit on the number of files/dirs/subdirs on a filesystem?
> > 
> > Yes. Each one takes up an inode, and there are a limited number of
> > inodes in a filesystem. You can set it when the filesystem is created
> > if you need to. See the newfs man page.
> I meant, is there a limit on a DIRECTORY having certain number of
> files/sub dirs under it? If yes, how does one find it out?
> And, also, doing df should be an easy way to find out how many inodes
> available and max for a FS, right?

Just the limits of the media. And no, not df - try dumpfs.
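
As a rough illustration of checking inode totals from a program rather than
from a command-line tool, here is a minimal Python sketch built on the
standard os.statvfs call. The function name inode_usage is made up for this
example, and some filesystems report 0 for these fields, so treat the
numbers as a hint rather than a guarantee.

```python
import os


def inode_usage(path="."):
    """Return (total, free) inode counts for the filesystem holding path."""
    st = os.statvfs(path)
    # f_files is the total number of inodes, f_ffree the number still free.
    # Filesystems that do not track inodes this way may report 0 for both.
    return st.f_files, st.f_ffree


if __name__ == "__main__":
    total, free = inode_usage("/")
    print(f"inodes: {total} total, {free} free")
```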

> > > e.g. If I have a flat file database which may result in a few thousand
> > > files on a FS, (say 5 files per user), and will also result in creating
> > > deleting and of course read/write of files. On the other hand, if I
> > > install MySql or some other database, then I have only say 10-20 files
> > > (including index files) and now the traffic passes through the database
> > > thread. I mean instead of reading the OS files directly, the web page
> > > will cause a DB query, which will pass file contents (data) to it.
> > > 
> > > Which one is better/How do they compare?
> > 
> > Unless the files are completely unrelated, use the SQL server. If
> > there is some file written to by multiple transactions, you'll have
> 
> The files are related only by the fact that each user has a certain
> number of files, web page, uploaded files, mail etc. No apparent
> connection as such. So, should I use a top level dir per user and then
> all the files in it per user or use a database?
> In the flat file system, if a user has an average of 200 files (one per
> mail, one per page, and some uploaded files/pics), and I have 10,000
> users, that is 2,000,000 files + the dirs/subdirs!
> If I put stuff in DB, it might come down to 20-50 big fat files.
> I could get a 70G drive SCSI to handle the big files. I hope FreeBSD
> can take 70+ GB SCSI drives, can it?

One dir per user would be better than everything in one directory. I
don't know what the limit on drive size for FreeBSD is. For what
you're talking about, you probably want to look into using vinum.

> > If you use a server that supports transactions, you can make the same
> > magic apply over an entire interaction, and back things out by simply
> > aborting the transaction.
> MySql has no transactions :-(

PostgreSQL has transactions. However, the other advantages of using a
database (not having to worry about locking data, at least if you can
write one SQL statement to do the job) apply for MySQL.
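
To make "backing things out by aborting the transaction" concrete, here is
a small Python sketch. It uses the standard-library sqlite3 module purely
as a stand-in so the example runs anywhere; it is not PostgreSQL or MySQL,
and the table layout, names and simulated failure are all invented for
illustration.

```python
import sqlite3


def demo_rollback():
    """Insert a row inside a transaction, fail, and show it was undone."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE files (owner TEXT, name TEXT)")
    conn.execute("INSERT INTO files VALUES ('alice', 'index.html')")
    conn.commit()  # this first row is now permanent

    try:
        # Used as a context manager, the connection commits if the block
        # finishes normally and rolls back if an exception escapes it.
        with conn:
            conn.execute("INSERT INTO files VALUES ('alice', 'upload.jpg')")
            raise RuntimeError("simulated failure mid-interaction")
    except RuntimeError:
        pass  # the second insert has been rolled back

    rows = conn.execute("SELECT name FROM files ORDER BY name").fetchall()
    conn.close()
    return [name for (name,) in rows]


if __name__ == "__main__":
    print(demo_rollback())
```

Only the committed row survives; the insert made inside the failed
transaction never becomes visible.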

> Also, is it true that files are searched for in a dir sequentially?
> I mean if you have 5000 files in a dir, and do cat filename, will the system
> read the dir sequentially to find where the filename is, and then get
> its inode ... ? This would mean that the more files you have in a dir,
> the slower the access will become?

Yup, that's it exactly. There are some hacks to make common operations
perform better than that (e.g. ls is O(n), not O(n**2)), but that
performance hit is liable to be more of a problem for you than any
directory limits.
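
A toy model of that sequential search, in Python, may help show why the
cost grows with directory size. This is only an illustration: the
"directory" is a plain list of (name, inode) pairs, the inode numbers are
made up, and it is not how the kernel code is actually written.

```python
def linear_lookup(entries, name):
    """Scan (name, inode) pairs in order.

    Returns (inode, comparisons), or (None, comparisons) if name is absent.
    """
    comparisons = 0
    for entry_name, inode in entries:
        comparisons += 1
        if entry_name == name:
            return inode, comparisons
    return None, comparisons


# A pretend directory holding 5000 files with invented inode numbers.
entries = [(f"file{i:04d}", 1000 + i) for i in range(5000)]

if __name__ == "__main__":
    for target in ("file0000", "file4999", "missing"):
        inode, cost = linear_lookup(entries, target)
        print(f"{target}: inode={inode}, comparisons={cost}")
```

Finding the last entry, or discovering that a name is missing, costs one
comparison per entry in the directory.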

I'd say your problem is large enough that you'd be crazy not to use a
database server, with the user as an indexed key so you can get quick
access to it. If you really want to go with flat files, look into a
directory hashing scheme so that you have 100 users in each of 100
directories instead of one directory with 10,000 users in it.
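
One possible shape for such a hashing scheme, sketched in Python. The
function name user_dir, the root path /data/users and the choice of CRC32
are all assumptions made for this example; any hash that is stable across
runs would do. Python's built-in hash() is randomized per process for
strings, so it is not suitable here.

```python
import os
import zlib


def user_dir(root, username, buckets=100):
    """Map a username to root/<bucket>/<username>.

    The bucket is a stable hash of the name modulo the bucket count, so
    10,000 users spread over roughly 100 directories of about 100 each.
    """
    bucket = zlib.crc32(username.encode("utf-8")) % buckets
    width = len(str(buckets - 1))  # zero-pad so bucket names sort cleanly
    return os.path.join(root, f"{bucket:0{width}d}", username)


if __name__ == "__main__":
    for name in ("shashi", "mike", "alice"):
        print(user_dir("/data/users", name))
```

The same name always maps to the same bucket, so a lookup needs only two
short directory searches instead of one long one.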

	<mike





