From: Oleg Derevenetz
To: freebsd-stable@freebsd.org
Date: Mon, 26 Mar 2007 03:07:21 +0400 (MSD)
Subject: Re: Processes get stuck in "ufs" state
Message-ID: <1174864041.460700a97e1eb@webmail.vsi.ru>
In-Reply-To: <001101c7625c$c9997860$c8c55358@delloleg>
References: <007d01c7605f$79fa60c0$3b215250@NOTEBOOK>
 <20070307030448.GQ10453@deviant.kiev.zoral.com.ua>
 <001101c7625c$c9997860$c8c55358@delloleg>

Quoting Oleg Derevenetz:

> On Wed, Mar 07, 2007 at 05:22:38AM +0300, Oleg Derevenetz wrote:
>
> >> Sometimes (once a week approximately) I have a problem with the same
> >> symptoms described here on SMP FreeBSD 6.2-STABLE with dual AMD
> >> Opteron(tm) Processor 850:
> >>
> >> http://www.freebsd.org/cgi/query-pr.cgi?pr=104406&cat=
> >>
> >> Sometimes (apparently when CPU load suddenly goes up) all processes that
> >> interacts with disk gets stuck in "ufs" state, but in my case
> >> SIGSTOP/SIGCONT seemingly does not help.
> >
> > See developer handbook, Deadlock Debugging chapter for instruction what
> > information shall be gathered to debug the problem.
>
> OK, I built kernel with debug options and will wait for stuck. By the
> way, when debug options turned on, I see this message on every
> boot when nullfs mounting in progress:
>
> acquiring duplicate lock of same type: "vnode interlock"
>  1st vnode interlock @ /usr/src/sys/kern/vfs_vnops.c:806
>  2nd vnode interlock @ /usr/src/sys/kern/vfs_subr.c:2040
> KDB: stack backtrace:
> kdb_backtrace(3,cfc60300,c05926d0,c05926d0,c05542c4,...) at kdb_backtrace+0x29
> witness_checkorder(cfd5c4dc,9,c051cf1e,7f8) at witness_checkorder+0x578
> _mtx_lock_flags(cfd5c4dc,0,c051cf1e,7f8,cfb28b90,...) at _mtx_lock_flags+0x78
> vrefcnt(cfd5c414) at vrefcnt+0x20
> null_checkvp(cff5eae0,c050c4a6,215) at null_checkvp+0x56
> null_lock(f02f1a68) at null_lock+0x66
> VOP_LOCK_APV(c054d540,f02f1a68) at VOP_LOCK_APV+0x87
> vn_lock(cff5eae0,1002,cfc60300,cff5eae0,cff5ed04,...) at vn_lock+0xac
> nullfs_root(cff76b90,2,f02f1ae0,cfc60300,0,8,0,c05cfca0,0,c051c79c,407) at nullfs_root+0x26
> vfs_domount(cfc60300,cfe3d340,cfe3d130,d,cfe3d3f0,c05817e0,0,c051c79c,2bf) at vfs_domount+0x975
> vfs_donmount(cfc60300,d,cfe73080,cfe73080,0,...) at vfs_donmount+0x3f9
> nmount(cfc60300,f02f1d04) at nmount+0x8b
> syscall(3b,3b,3b,bf7fe5f5,bf7feea0,...) at syscall+0x25b
> Xint0x80_syscall() at Xint0x80_syscall+0x1f
> --- syscall (378, FreeBSD ELF32, nmount), eip = 0x280bc0e7, esp = 0xbf7fe5bc, ebp = 0xbf7fee38 ---
>
> This host have nullfs filesystems. Is this can be related to deadlock ?

FYI: after replacing the nullfs filesystems with unionfs (using the new
unionfs implementation):

http://people.freebsd.org/~daichi/unionfs/

all deadlocks are gone. It seems to be a problem in the current nullfs
implementation, but I can't debug it properly: the deadlocks are
relatively rare, and the machine that uses nullfs is heavily loaded, so
the WITNESS and DEBUG options lead to an unacceptable performance
penalty.