From owner-freebsd-stable  Wed Jan 30 16:13: 9 2002
Delivered-To: freebsd-stable@freebsd.org
Received: from netau1.alcanet.com.au (ntp.alcanet.com.au [203.62.196.27])
	by hub.freebsd.org (Postfix) with ESMTP id 8351A37B404
	for <freebsd-stable@FreeBSD.ORG>; Wed, 30 Jan 2002 16:12:52 -0800 (PST)
Received: from mfg1.cim.alcatel.com.au (mfg1.cim.alcatel.com.au [139.188.23.1])
	by netau1.alcanet.com.au (8.9.3 (PHNE_22672)/8.9.3) with ESMTP id LAA11567;
	Thu, 31 Jan 2002 11:12:31 +1100 (EDT)
Received: from gsmx07.alcatel.com.au by cim.alcatel.com.au
 (PMDF V5.2-32 #37640) with ESMTP id <01KDPW5MNR80VLS7XC@cim.alcatel.com.au>;
 Thu, 31 Jan 2002 11:12:29 +1100
Received: (from jeremyp@localhost)	by gsmx07.alcatel.com.au (8.11.6/8.11.6)
 id g0V0Br082680; Thu, 31 Jan 2002 11:11:53 +1100
Content-return: prohibited
Date: Thu, 31 Jan 2002 11:11:53 +1100
From: Peter Jeremy <peter.jeremy@alcatel.com.au>
Subject: Re: Strange lock-ups during backup over nfs after adding 1024M RAM
In-reply-to: <791310002584.20020130150111@ur.ru>; from sg@ur.ru on Wed, Jan 30,
 2002 at 03:01:11PM +0500
To: Sergey Gershtein <sg@ur.ru>
Cc: freebsd-stable@FreeBSD.ORG
Mail-Followup-To: Sergey Gershtein <sg@ur.ru>,
	freebsd-stable@FreeBSD.ORG
Message-id: <20020131111153.Y72285@gsmx07.alcatel.com.au>
MIME-version: 1.0
Content-type: text/plain; charset=us-ascii
Content-disposition: inline
User-Agent: Mutt/1.2.5i
References: <20020126204941.H17540-100000@resnet.uoregon.edu>
 <1931130530386.20020128130947@ur.ru>
 <20020130073449.B78919@gsmx07.alcatel.com.au>
 <791310002584.20020130150111@ur.ru>
Sender: owner-freebsd-stable@FreeBSD.ORG
Precedence: bulk
List-ID: <freebsd-stable.FreeBSD.ORG>
List-Archive: <http://docs.freebsd.org/mail/> (Web Archive)
List-Help: <mailto:majordomo@FreeBSD.ORG?subject=help> (List Instructions)
List-Subscribe: <mailto:majordomo@FreeBSD.ORG?subject=subscribe%20freebsd-stable>
List-Unsubscribe: <mailto:majordomo@FreeBSD.ORG?subject=unsubscribe%20freebsd-stable>
X-Loop: FreeBSD.ORG

On 2002-Jan-30 15:01:11 +0500, Sergey Gershtein <sg@ur.ru> wrote:
>  ps output showed that most processes were in
>'inode' state (wmesg title of ps output).  There were about a hundred
>of httpd processes and 2 nfsd in 'inode' state.  Another one nfsd process
>was in 'FFS node' state.
...
>Could you tell me what this 'inode' state means and what conclusions
>can be done from the situation?

The wmesg column provides a short text description indicating the
resource that a process is sleeping on:
"FFS node" is the memory pool used to allocate UFS inodes.
"inode" is the lock used to manage UFS inodes.
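
To see which resources processes are piling up on, it can help to
tally them by wmesg.  The Python sketch below assumes the
(pid, wmesg, command) triples have already been extracted from ps by
some means; that tuple form and the sample data are made up for
illustration (the counts mirror the ones you reported) and are not
real ps output.

```python
from collections import Counter

def tally_by_wmesg(procs):
    """Count sleeping processes per (wmesg, command) pair.

    procs: iterable of (pid, wmesg, command) tuples. This is a
    hypothetical pre-parsed form, not something ps emits directly.
    """
    return Counter((wmesg, command) for _pid, wmesg, command in procs)

# Made-up sample mirroring the report: about 100 httpd and 2 nfsd in
# "inode", plus one nfsd in "FFS node".
sample = [(1000 + i, "inode", "httpd") for i in range(100)]
sample += [(200, "inode", "nfsd"), (201, "inode", "nfsd"),
           (202, "FFS node", "nfsd")]

# Print the most crowded wait channels first.
for (wmesg, command), count in tally_by_wmesg(sample).most_common():
    print(f"{count:4d}  {wmesg:<8}  {command}")
```

A single process in "FFS node" with everything else in "inode" is the
pattern worth looking for: one sleeper on the memory pool, many
sleepers on locks.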

It looks like you've run out of kernel memory.  At a quick guess, one
of the nfsd processes is trying to open a file and can't allocate
space for another inode whilst holding locks on other inodes.  The
lockup is then either a direct result of the lack of kernel virtual
memory (KVM), or the result of inode locks migrating up towards root
and gathering more processes under their clutches until nothing can
run.
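
The way one blocked allocation can spread through the lock chain can
be modelled as a small wait-for computation.  This is only a toy
sketch in Python, not how the kernel tracks anything: the process
names, the lock names and the three-dictionary representation are all
assumptions made for illustration.

```python
def stuck_processes(waiting_on, held_by, exhausted):
    """Return the set of processes that can never make progress.

    waiting_on: process -> resource it is sleeping on
    held_by:    lock -> process currently holding it
    exhausted:  resources that will never become available
                (for example an empty memory pool)
    """
    # Start with the processes sleeping directly on an exhausted resource.
    stuck = {p for p, r in waiting_on.items() if r in exhausted}
    changed = True
    while changed:
        changed = False
        for proc, res in waiting_on.items():
            # A process is stuck if the lock it wants is held by a
            # process that is itself stuck.
            if proc not in stuck and held_by.get(res) in stuck:
                stuck.add(proc)
                changed = True
    return stuck

# Hypothetical scenario: nfsd1 holds a directory's inode lock and
# sleeps on the empty "FFS node" pool; the httpd processes queue up
# behind it, one level closer to root each time.
waiting_on = {
    "nfsd1": "FFS node",
    "httpd1": "lock:/export/www",
    "httpd2": "lock:/export",
    "sshd": "lock:tty",
}
held_by = {
    "lock:/export/www": "nfsd1",
    "lock:/export": "httpd1",
}
print(sorted(stuck_processes(waiting_on, held_by, {"FFS node"})))
```

In this model sshd stays runnable only because it never needs a lock
in the affected part of the tree; once the chain reaches root, nearly
every lookup joins it.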

If you monitor the memory usage with "vmstat -m", you should be
able to see the free memory drop to zero, possibly all eaten by
the "FFS node" pool.
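
One way to spot the culprit is to take two snapshots some minutes
apart and compare per-type usage.  The Python sketch below assumes
each snapshot has already been reduced to a dictionary of memory type
to kilobytes in use; that representation, the type names other than
"FFS node", and all the numbers are hypothetical, and parsing the
actual "vmstat -m" output is left out.

```python
def fastest_growing(before, after, top=3):
    """Return the `top` memory types with the largest growth.

    before, after: dicts mapping memory type name -> KB in use,
    taken from two snapshots. Types missing from `before` are
    treated as having used 0 KB.
    """
    growth = {name: after[name] - before.get(name, 0) for name in after}
    return sorted(growth.items(), key=lambda kv: kv[1], reverse=True)[:top]

# Made-up numbers, purely for illustration.
before = {"FFS node": 20000, "vnodes": 9000, "mbuf": 1200}
after = {"FFS node": 95000, "vnodes": 9500, "mbuf": 1100}

for name, delta in fastest_growing(before, after):
    print(f"{name:<10} {delta:+d} KB")
```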

>By the way, the file system that is being backuped has a lot (more
>than 1,000,000) of small files (less than 1Kb each).

That triggers a faint memory about a problem with doing this, but
I thought it was now fixed.  How old are your sources?

>  How all of this is related to the amount of system RAM (no lock-ups
>ever happened until we increased the amount of RAM from 1Gb to
>1,5Gb).

Increasing the amount of physical RAM increases the amount of KVM
required to manage the RAM, reducing the amount of memory available
for other things.  I didn't keep your original posting and I can't
remember what MAXUSERS is set to - from memory it is either 128
(which seems too small) or 1024 (which seems too large).  Try altering
maxusers to 400-500 and see if that helps.  If you still have
problems, I think you'll need one of the FS gurus.

Peter
