From owner-freebsd-stable@FreeBSD.ORG  Mon Jun 23 12:13:40 2003
Date: Mon, 23 Jun 2003 16:13:37 -0300 (ADT)
From: "Marc G. Fournier" <scrappy@hub.org>
To: David Schultz <das@freebsd.org>
In-Reply-To: <20030623190926.GA1049@HAL9000.homeunix.com>
Message-ID: <20030623161200.U10424@hub.org>
References: <20030623122939.N10424@hub.org>
	<20030623190926.GA1049@HAL9000.homeunix.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
cc: freebsd-stable@freebsd.org
cc: Ted Mittelstaedt <tedm@toybox.placo.com>
Subject: Re: Processes hung in 'inode' state ...


Can't this time, since we had to reboot (else a lot of pissed-off
postgresql.org users *grin*) ... but will add that to my debugging for
next time through ... the fun part is catching the system while it's only
one of the VMs that is stuck and not the whole server :(

On Mon, 23 Jun 2003, David Schultz wrote:

> On Mon, Jun 23, 2003, Marc G. Fournier wrote:
> >
> > Not sure what all to do, but doing a 'gdb -k kernel.debug /dev/mem', a
> > backtrace on one of the processes shows ... server has been up 12 days
> > now, and running a June 8th kernel ...
> >
> > #0  0x20f4f0 in ?? ()
> > (kgdb) proc 67258
> > (kgdb) bt
> > #0  mi_switch () at machine/globals.h:119
> > #1  0x8014a1f9 in tsleep (ident=0x8a3d2200, priority=8, wmesg=0x80263d4a "inode", timo=0) at /usr/src/sys/kern/kern_synch.c:479
> > #2  0x80141507 in acquire (lkp=0x8a3d2200, extflags=16777280, wanted=1536) at /usr/src/sys/kern/kern_lock.c:147
> > #3  0x8014179c in lockmgr (lkp=0x8a3d2200, flags=16973826, interlkp=0xbdc27b6c, p=0xc26fa9c0) at /usr/src/sys/kern/kern_lock.c:355
>
> Ooh, a deadlock.  I'm guessing unionfs is responsible again.  ;-)
> Can you type in the following?
>
> 	print *(struct lock *)0x8a3d2200
>
> Then get a backtrace of the process with pid equal to lk_lockholder.
>
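
A side note for reading the backtrace quoted above: gdb prints the lockmgr
flag arguments (extflags, wanted, flags) in decimal, which makes them hard
to compare against the bit definitions in the kernel's lock header. The
small sketch below only converts the three values from frames #2 and #3 to
hex and lists which bits are set. The helper name set_bits is made up for
illustration, and the script does not assign flag names: those should be
looked up in the lock header of the source tree that matches the running
kernel.

```python
def set_bits(value):
    """Return each set bit of value as a hex string, lowest bit first."""
    return [hex(1 << i) for i in range(value.bit_length()) if (value >> i) & 1]


# Decimal arguments copied from frames #2 (acquire) and #3 (lockmgr).
args = {"extflags": 16777280, "wanted": 1536, "flags": 16973826}

for name, value in args.items():
    # Show the full value in hex, then the individual bits that make it up.
    print(f"{name:8s} = {value:#010x}  bits: {', '.join(set_bits(value))}")
```

Note that the low bits of the lockmgr() flags argument may encode a request
type rather than independent flags, so treat the per-bit listing for that
value as a starting point only.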