From: "Marc G. Fournier"
To: freebsd-stable@freebsd.org
Date: Thu, 12 Aug 2004 11:04:11 -0300 (ADT)
Message-ID: <20040812104827.Y62519@ganymede.hub.org>
Subject: Just locked up a file system ... but not the system ...

I have a server that is a wee bit on the loaded side right now, but still functional ... there are currently 84 jail'd environments running on her ... It's a Dual Xeon, 4G of RAM, and 6x72G drives in a RAID5 configuration ...

kernel/world is built as:

CFLAGS= -O -mpentium -pipe -g -DKVA_PAGES=512
COPTFLAGS= -O -mpentium -pipe -DKVA_PAGES=512

and the system is running:

4.10-STABLE FreeBSD 4.10-STABLE #7: Sat Aug 7 20:47:34 ADT 2004

And has been up for 4 days:

neptune# uptime
10:50AM up 4 days, 13:56, 11 users, load averages: 13.07, 21.89, 20.62

My first thought was vnodes, but that usually affects the whole system, and looking at sysctl, I have plenty left:

kern.maxvnodes: 512000
kern.minvnodes: 61907
debug.sizeof.vnode: 168
debug.numvnodes: 439246
debug.wantfreevnodes: 25
debug.freevnodes: 298396

There are no errors that I can see in /var/log/messages to indicate a problem with the system, at least for the past 6 hours or so ...

Up until about 30 minutes ago, I could do things on the file system itself (/vm), but couldn't do anything within one directory on that file system (/vm/323) ... then it 'widened' to affect anything I did on /vm ...

Looking at top, I see some processes in an 'inode' state, and looking at a ps auxl listing, there are 47 processes in such a state right now ...

I have a tech going down to reboot the machine, since I can't leave it down long ... is there something else that I should have looked at on this? Since there is no KVM/keyboard attached to this, and it wasn't booted with one in it, I don't have any way of getting to the DDB and trying to get a core :(

Based on what is left running after I've killed off all VM processes that were killable, the 'inode' that it's stuck on is the /vm/323 one, since the processes for that VM are the only ones currently running (and nfsd) :(

----
Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
Email: scrappy@hub.org           Yahoo!: yscrappy              ICQ: 7615664
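
For reference, the "plenty left" reading of the vnode counters quoted above can be checked with a little arithmetic. The sketch below is not part of the original report; it parses the sysctl lines exactly as posted and assumes that debug.numvnodes counts allocated vnodes and debug.freevnodes counts those sitting on the free list. The helper names (parse_sysctl, vnode_summary) are made up for illustration.

```python
# Sketch only: the sysctl text is copied from the message; the interpretation
# of the counters (allocated vs. free-list vnodes) is an assumption.
SYSCTL_OUTPUT = """\
kern.maxvnodes: 512000
kern.minvnodes: 61907
debug.sizeof.vnode: 168
debug.numvnodes: 439246
debug.wantfreevnodes: 25
debug.freevnodes: 298396
"""


def parse_sysctl(text):
    """Turn 'name: value' lines into a dict of integers (hypothetical helper)."""
    values = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep and value.strip().isdigit():
            values[name.strip()] = int(value)
    return values


def vnode_summary(values):
    """Derive in-use count and remaining headroom from the raw counters."""
    allocated = values["debug.numvnodes"]
    free = values["debug.freevnodes"]
    limit = values["kern.maxvnodes"]
    return {
        "in_use": allocated - free,        # allocated but not on the free list
        "headroom": limit - allocated,     # allocations left before the limit
        "percent_allocated": round(100.0 * allocated / limit, 1),
    }


summary = vnode_summary(parse_sysctl(SYSCTL_OUTPUT))
print(summary)
```

Under that assumption, well under a third of the limit is actively in use, which is consistent with vnode exhaustion not being the cause here.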