From: Eric Anderson
Date: Thu, 17 May 2007 23:38:20 -0500
To: Kostik Belousov
Cc: freebsd-fs@freebsd.org
Subject: Re: Ufs dead-locks on freebsd 6.2
Message-ID: <464D2DBC.4040303@freebsd.org>
In-Reply-To: <20070517174735.GB41395@deviant.kiev.zoral.com.ua>

On 05/17/07 12:47, Kostik Belousov wrote:
> On Thu, May 17, 2007 at 01:03:37PM -0400, Andrew Edwards wrote:
>> Here it is.
>>
>> db> show vnode 0xccd47984
>> vnode 0xccd47984: tag ufs, type VDIR
>> usecount 5135, writecount 0, refcount 5137 mountedhere 0
>> flags (VV_ROOT)
>> v_object 0xcd02518c ref 0 pages 1
>> #0 0xc0593f0d at lockmgr+0x4ed
>> #1 0xc06b8e0e at ffs_lock+0x76
>> #2 0xc0739787 at VOP_LOCK_APV+0x87
>> #3 0xc0601c28 at vn_lock+0xac
>> #4 0xc05ee832 at lookup+0xde
>> #5 0xc05ee4b2 at namei+0x39a
>> #6 0xc05e2ab0 at unp_connect+0xf0
>> #7 0xc05e1a6a at uipc_connect+0x66
>> #8 0xc05d9992 at soconnect+0x4e
>> #9 0xc05dec60 at kern_connect+0x74
>> #10 0xc05debdf at connect+0x2f
>> #11 0xc0723e2b at syscall+0x25b
>> #12 0xc070ee0f at Xint0x80_syscall+0x1f
>>
>> ino 2, on dev amrd0s1a
> It seems to be the sort of things that cannot happen. VOP_LOCK()
> returned 0, but vnode was not really locked.
>
> Although claiming that kernel code cannot have such bug is too optimistic,
> I would first make sure that:
> 1. You checked the memory of the machine.
> 2. Your kernel is built from pristine sources.

This looks precisely like a lock I was seeing on one of my NFS servers.
Only one of the filesystems would cause it, and it was the same one each
time, not necessarily under any kind of load. Things like mountd would
get wedged in state 'ufs', and other things would get stuck in one of
the lock states (I can't recall which).

After lots of hair pulling, I unmounted the file system, forcefully
fsck'ed that one filesystem, and rebooted the box. I haven't had the
issue since, and it's been up for about 100 days.

Eric