From owner-freebsd-stable@FreeBSD.ORG  Sun Mar 25 23:09:03 2007
Return-Path: <owner-freebsd-stable@FreeBSD.ORG>
X-Original-To: freebsd-stable@freebsd.org
Delivered-To: freebsd-stable@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id A046916A401
	for <freebsd-stable@freebsd.org>; Sun, 25 Mar 2007 23:09:03 +0000 (UTC)
	(envelope-from oleg@vsi.ru)
Received: from serv2.vsi.ru (serv2.vsi.ru [80.82.32.11])
	by mx1.freebsd.org (Postfix) with ESMTP id 223A913C448
	for <freebsd-stable@freebsd.org>; Sun, 25 Mar 2007 23:09:02 +0000 (UTC)
	(envelope-from oleg@vsi.ru)
Received: from serv2.vsi.ru (localhost [127.0.0.1])
	by serv2.vsi.ru (8.13.8/8.13.8) with ESMTP id l2PN8w3p060422
	for <freebsd-stable@freebsd.org>; Mon, 26 Mar 2007 03:08:58 +0400 (MSD)
	(envelope-from oleg@vsi.ru)
Received: (from nobody@localhost)
	by serv2.vsi.ru (8.13.8/8.13.8/Submit) id l2PN7LF3060392
	for freebsd-stable@freebsd.org; Mon, 26 Mar 2007 03:07:21 +0400 (MSD)
	(envelope-from oleg@vsi.ru)
X-Authentication-Warning: serv2.vsi.ru: nobody set sender to oleg@vsi.ru using
	-f
To: freebsd-stable@freebsd.org
Message-ID: <1174864041.460700a97e1eb@webmail.vsi.ru>
Date: Mon, 26 Mar 2007 03:07:21 +0400 (MSD)
From: Oleg Derevenetz <oleg@vsi.ru>
References: <007d01c7605f$79fa60c0$3b215250@NOTEBOOK>
	<20070307030448.GQ10453@deviant.kiev.zoral.com.ua>
	<001101c7625c$c9997860$c8c55358@delloleg>
In-Reply-To: <001101c7625c$c9997860$c8c55358@delloleg>
MIME-Version: 1.0
Content-Type: text/plain; charset=koi8-r
Content-Transfer-Encoding: 8bit
User-Agent: IMP/PHP IMAP webmail program 2.2.8
X-Originating-IP: 80.82.33.59
Subject: Re: Processes get stuck in "ufs" state
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Production branch of FreeBSD source code <freebsd-stable.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-stable>, 
	<mailto:freebsd-stable-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-stable>
List-Post: <mailto:freebsd-stable@freebsd.org>
List-Help: <mailto:freebsd-stable-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-stable>,
	<mailto:freebsd-stable-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 25 Mar 2007 23:09:03 -0000

Quoting Oleg Derevenetz <oleg@vsi.ru>:

> On Wed, Mar 07, 2007 at 05:22:38AM +0300, Oleg Derevenetz wrote:
> 
> >> Sometimes (once a week approximately) I have a problem with the same
> >> symptoms described here on SMP FreeBSD 6.2-STABLE with dual AMD
> >> Opteron(tm) Processor 850:
> >>
> >> http://www.freebsd.org/cgi/query-pr.cgi?pr=104406&cat=
> >>
> >> Sometimes (apparently when CPU load suddenly goes up) all processes that
> >> interact with disk get stuck in "ufs" state, but in my case
> >> SIGSTOP/SIGCONT seemingly does not help.
> >
> > See developer handbook, Deadlock Debugging chapter for instruction what
> > information shall be gathered to debug the problem.
> 
> OK, I built a kernel with debug options and will wait for it to get stuck.
> By the way, with debug options turned on, I see this message on every
> boot while nullfs is being mounted:
> 
> acquiring duplicate lock of same type: "vnode interlock"
>  1st vnode interlock @ /usr/src/sys/kern/vfs_vnops.c:806
>  2nd vnode interlock @ /usr/src/sys/kern/vfs_subr.c:2040
> KDB: stack backtrace:
> kdb_backtrace(3,cfc60300,c05926d0,c05926d0,c05542c4,...) at kdb_backtrace+0x29
> witness_checkorder(cfd5c4dc,9,c051cf1e,7f8) at witness_checkorder+0x578
> _mtx_lock_flags(cfd5c4dc,0,c051cf1e,7f8,cfb28b90,...) at _mtx_lock_flags+0x78
> vrefcnt(cfd5c414) at vrefcnt+0x20
> null_checkvp(cff5eae0,c050c4a6,215) at null_checkvp+0x56
> null_lock(f02f1a68) at null_lock+0x66
> VOP_LOCK_APV(c054d540,f02f1a68) at VOP_LOCK_APV+0x87
> vn_lock(cff5eae0,1002,cfc60300,cff5eae0,cff5ed04,...) at vn_lock+0xac
> nullfs_root(cff76b90,2,f02f1ae0,cfc60300,0,8,0,c05cfca0,0,c051c79c,407) at nullfs_root+0x26
> vfs_domount(cfc60300,cfe3d340,cfe3d130,d,cfe3d3f0,c05817e0,0,c051c79c,2bf) at vfs_domount+0x975
> vfs_donmount(cfc60300,d,cfe73080,cfe73080,0,...) at vfs_donmount+0x3f9
> nmount(cfc60300,f02f1d04) at nmount+0x8b
> syscall(3b,3b,3b,bf7fe5f5,bf7feea0,...) at syscall+0x25b
> Xint0x80_syscall() at Xint0x80_syscall+0x1f
> --- syscall (378, FreeBSD ELF32, nmount), eip = 0x280bc0e7, esp = 0xbf7fe5bc, ebp = 0xbf7fee38 ---
> 
> This host has nullfs filesystems. Can this be related to the deadlock?

FYI: after replacing the nullfs filesystems with unionfs (using the new
unionfs implementation):

http://people.freebsd.org/~daichi/unionfs/

all deadlocks are gone. This looks like a problem in the current nullfs
implementation, but I can't debug it properly: the deadlocks are relatively
rare, and the machine that uses nullfs is heavily loaded, so the WITNESS and
DEBUG options lead to an unacceptable performance penalty.
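
For anyone who wants to try reproducing this on a less loaded machine, the
debug options in question are roughly the ones the Deadlock Debugging chapter
of the developer handbook asks for. The fragment below is a sketch based on
that chapter, not a copy of the config used on this host, so treat the exact
option set as an assumption:

```
# Kernel config additions for deadlock debugging (assumed set, per the
# developer handbook; not the exact config from the affected machine).
makeoptions	DEBUG=-g		# build the kernel with debug symbols
options 	KDB			# kernel debugger framework
options 	DDB			# interactive in-kernel debugger
options 	INVARIANTS		# extra run-time consistency checks
options 	INVARIANT_SUPPORT	# support code required by INVARIANTS
options 	WITNESS			# lock order verification (slow)
options 	WITNESS_SKIPSPIN	# skip spin mutexes to reduce overhead
options 	DEBUG_LOCKS		# record where lockmgr locks were taken
options 	DEBUG_VFS_LOCKS		# assert vnode locking rules
options 	DIAGNOSTIC		# additional diagnostic checks
```

WITNESS is the most expensive of these, which is why it is hard to keep
enabled on a production box under load.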