Date:      Tue, 13 Feb 2007 09:35:36 +0100 (CET)
From:      Oliver Fromme <olli@lurza.secnetix.de>
To:        freebsd-fs@FreeBSD.ORG, aronesimi@yahoo.com
Subject:   Re: comments on newfs raw disk ?  Safe ? (7 terabyte array)
Message-ID:  <200702130835.l1D8Zams027346@lurza.secnetix.de>
In-Reply-To: <20070213055806.55404.qmail@web58601.mail.re3.yahoo.com>

Arone Silimantia wrote:
 > Oliver Fromme wrote:
 > > That "1 GB per TB" requirement is just a rule of thumb.
 > > I don't know how accurate it is.  Also note that it is
 > > desirable to avoid having fsck use swap, because it will
 > > be even slower then.  A lot slower.
 > 
 > Ok, understood.  But regardless of performance, fsck will use
 > BOTH physical and swap,

Basically yes.  fsck runs as a normal userland process, so
it can use memory (RAM + swap) like any other program,
but it is also subject to the usual limitations (e.g.
resource limits, address space limitations etc.).

 > so as far as fsck is concerned, I have 8 GB of
 > memory ?

Only if you run a 64bit operating system (FreeBSD/amd64,
/ia64 or /sparc64).  In 32bit operating systems the address
space is limited to 4 GB.  Also note that the kernel needs
some room from that address space, so the available space
will be even smaller, usually 3 GB or less, depending on
how your kernel is tuned.

 > > I suggest you test it before putting it into production,
 > > i.e. populate the file system with the expected number of
 > > files, then run fsck.
 > 
 > Well, here is what I am assuming, and I would like to get some
 > confirmation on these two points:
 > 
 > - The time it takes to fsck is not a function of how many inodes are
 > initialized from newfs, but how many you are _actually using_.
 > 
 > - But the amount of memory the fsck takes is a function of how many inodes
 > exist, regardless of how many you are actually using.
 > 
 > Are these two interpretations correct ?

The answer is yes and no.  :-)  I have to admit that I'm
not 100% sure here, so please someone correct me if I'm
wrong ...

However, fsck runs several passes which do different things
on the file system.  One of the passes involves reading all
directory information -- this pass is obviously dependent
on the number of directories and files that are actually
allocated on the file system.  In another pass fsck checks
for lost inodes -- this pass involves visiting _all_ inodes,
no matter if they're currently marked as allocated or not.
So you have both parameters in the equation, and they affect
both the memory requirements and the run time of fsck.

The exact function of inodes vs. memory/runtime is probably
not very simple.  That's why I suggested you try it yourself
under the expected conditions before putting the machine
into production.

 > > > I just need to know if my 4+4 GB of memory is enough, and if this
 > > > option in loader.conf:
 > > > 
 > > > kern.maxdsiz="2048000000"
 > > 
 > > That will limit the process size to 2 GB.  You might need
 > > to set it higher if fsck needs more than that.  (I assume
 > > you're running FreeBSD/amd64, or otherwise you'll run into
 > > process size limitations anyway.)
 > 
 > Well ... no, I am using normal x86 FreeBSD on an Intel based system.  I
 > have 4 GB of physical ram, and 4 GB of swap.  So I am tempted to just make
 > that number 4096000000 and be done with it ... 

See my comment about 32bit vs. 64bit above.  If you're
running a 32bit OS (such as FreeBSD/i386), you have a 4GB
address space limit, and it is shared between kernel and
userland processes.  Of course, every process has its own
(virtual) address space, but the kernel virtual memory
(KVM) is always mapped into it.  So, for example, if the
kernel uses 1 GB of KVM, then a single userland process
can only be as big as 3 GB.
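
The arithmetic above can be written down as a small Python
sketch.  The 1 GB kernel reservation is only the example
figure from this mail, not a measured value, and the
function name is made up for illustration.

```python
GIB = 1024 ** 3  # bytes in one binary gigabyte


def max_user_address_space(pointer_bits, kernel_reserved_bytes):
    """Upper bound on the virtual address space of one process.

    pointer_bits: width of a virtual address (32 on i386).
    kernel_reserved_bytes: part of the address space that the
    kernel maps into every process (assumed, depends on tuning).
    """
    total = 2 ** pointer_bits
    if not 0 <= kernel_reserved_bytes <= total:
        raise ValueError("kernel reservation out of range")
    return total - kernel_reserved_bytes


# Example figure from the text: the kernel uses 1 GB of KVM.
limit = max_user_address_space(32, 1 * GIB)
print("upper bound for one process:", limit / GIB, "GB")

# The value proposed for kern.maxdsiz in the quoted mail.
requested = 4096000000
print("requested limit fits:", requested <= limit)
```

With these numbers the requested value is larger than the
bound, so the process would hit the address space limit
before it ever reaches the resource limit.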

(By the way, the PAE option does _not_ change the limit of
the address space.  It's still only 4 GB even with PAE.)

 > if fsck doesn't need
 > that much memory, there is no harm to the system in simply having an
 > inflated limit like that, is there ?

Well, the process limits are useful for protection against
run-away processes that just keep growing (because of a bug,
an attack or other circumstances).  If there's no limit, a
single process can take the whole system down by using up
all of its resources.

However, there's a soft and a hard limit.  The kern.maxdsiz
tunable sets the maximum hard limit, so you can still
have a lower soft limit for certain processes or users.
You can modify the limits via /etc/login.conf.  (The soft
limit can only be increased up to the hard limit, and an
unprivileged process can never increase its hard limit.)
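
To check which soft and hard limits a process really ends
up with, a minimal Python sketch like the following can
help.  It uses only the standard "resource" module.  That
RLIMIT_DATA is the limit capped by kern.maxdsiz is an
assumption here, and the helper name is hypothetical.

```python
import resource


def describe(value):
    """Render a limit value, treating RLIM_INFINITY specially."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return "%d bytes" % value


# getrlimit() returns a (soft, hard) pair for the calling process.
soft, hard = resource.getrlimit(resource.RLIMIT_DATA)
print("data segment soft limit:", describe(soft))
print("data segment hard limit:", describe(hard))
```

Run it from the same login class and shell that will run
fsck, since the limits are inherited from the parent
process.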

 > BUT, why isn't it possible to compute fsck _memory needs_ ?  If I have a
 > filesystem of A size with X inodes init'd, and Y inodes used, shouldn't I
 > be able to compute how much memory fsck will need ?

Yes, in theory that should be possible.  Either by carefully
reading the fsck source code, or by running fsck on various
test file systems and trying to build a function from the
observed process sizes.  However, it isn't _that_ trivial,
because it also depends on the malloc implementation and
on the malloc flags in use (e.g. via /etc/malloc.conf).
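
The second approach could start out roughly like the Python
sketch below: a plain least-squares line through pairs of
(inodes in use, observed process size).  The straight-line
model is an assumption, it only makes sense while the
number of initialised inodes is held fixed, and the sample
values are placeholders, not measurements.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two (x, y) pairs")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("all x values are identical")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Placeholder data only: replace with inode counts and the
# process sizes observed while fsck runs on test file systems.
used_inodes = [1, 2, 3, 4]
process_size = [3, 5, 7, 9]

slope, intercept = fit_line(used_inodes, process_size)
print("size per used inode:", slope)
print("fixed overhead:", intercept)
```

If the observed points do not fall close to the fitted
line, the model is too simple and more parameters are
needed.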

By the way, I think fsck also records and checks the path
names of all files, so those must be taken into the
equation, too.  Short names will take less space.  I just
did a "find /usr/src | wc" for testing, and it showed
about 50,000 files, and the path names are 2 MB total.
If you have 25,000,000 files with the same average file
name length, those names alone will take 1 GB to store.

(I'm assuming here that fsck indeed stores all the path
names at the same time.  I don't know if it really does
that.  I haven't examined the source code closely.)
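
The path name estimate is simple proportional scaling.  As
a Python sketch, using the figures quoted above and decimal
units (1 MB = 10^6 bytes), with a made-up function name:

```python
MB = 10 ** 6
GB = 10 ** 9


def estimate_name_bytes(sample_files, sample_bytes, target_files):
    """Scale the average path name length of a sample to a target count."""
    if sample_files <= 0:
        raise ValueError("sample must contain at least one file")
    average = sample_bytes / sample_files  # bytes per path name
    return average * target_files


# Sample from the text: about 50,000 files, 2 MB of path names.
average = (2 * MB) / 50000
total = estimate_name_bytes(50000, 2 * MB, 25000000)
print("average path name length:", average, "bytes")  # 40.0
print("estimated total:", total / GB, "GB")           # 1.0
```

This assumes the target file system has the same average
path name length as the sample, which may well not hold.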

Best regards
   Oliver

-- 
Oliver Fromme, secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing b. M.
Handelsregister: Registergericht Muenchen, HRA 74606,  Geschäftsfuehrung:
secnetix Verwaltungsgesellsch. mbH, Handelsregister: Registergericht Mün-
chen, HRB 125758,  Geschäftsführer: Maik Bachmann, Olaf Erb, Ralf Gebhart
Any opinions expressed in this message are personal to the author and may
not necessarily reflect the opinions of secnetix GmbH & Co KG in any way.
FreeBSD-Dienstleistungen, -Produkte und mehr:  http://www.secnetix.de/bsd

"It combines all the worst aspects of C and Lisp:  a billion different
sublanguages in one monolithic executable.  It combines the power of C
with the readability of PostScript."
        -- Jamie Zawinski, when asked: "What's wrong with perl?"


