Message-Id: <200601122331.k0CNVXv4046419@gw.catspoiler.org>
Date: Thu, 12 Jan 2006 15:31:33 -0800 (PST)
From: Don Lewis
To: dsh@vlink.ru
Cc: freebsd-stable@FreeBSD.org
In-Reply-To: <200601122319.k0CNJp9G046391@gw.catspoiler.org>
Subject: Re: Recurring problem: processes block accessing UFS file system
List-Id: Production branch of FreeBSD source code

On 12 Jan, Don Lewis wrote:
> On 11 Jan, Denis Shaposhnikov wrote:
>> Hi!
>>
>>>>>>> "Don" == Don Lewis writes:
>>
>> Don> Are you using any unusual file systems, such as nullfs or
>> Don> unionfs?
>>
>> >> Yes, I'm use a lots of nullfs. This is a host system for about 20
>> >> jails with nullfs mounted ro system:
>>
>> Don> That would be my guess as to the cause of the problem. Hopefully
>> Don> DEBUG_VFS_LOCKS will help pinpoint the bug.
>>
>> I've got the problem again. Now I have debug kernel and crash
>> dump. That is an output from the kdb.
>> Do you need any additional information which I can get from the
>> crash dump?
>
> Process 33016 is executing rmdir(). While doing the lookup, it is
> holding a lock on vnode 0xc6c07bf4 and attempting to lock vnode
> c6bed3fc. Vnode 0xc6c07bf4 should be the parent directory of c6bed3fc.
>
> Process 546 is executing open(). While doing the lookup, it is holding
> a lock on vnode 0xc6bed3fc while attempting to lock vnode c6c07bf4.
> Vnode 0xc6bed3fc should be the parent directory of c6c07bf4, but this is
> inconsistent with the previous paragraph.
>
> This situation should not be possible. Using kgdb on your saved crash
> dump, print "fmode" and "*ndp" in the vn_open_cred() stack frame of
> process 546, and "*nd" in the kern_rmdir() stack frame of process 33016.
> The path names being looked up may be helpful.
>
> Are there any symbolic links in the path names? If so, what are the
> link contents?
>
> Are either of these processes jailed? If so, same or different jails?
>
> What are inodes 2072767 and 2072795 on ad4s1g?
>
> Are you using snapshots?

I just thought of another possible cause for this problem. Is it
possible that you have any hard links to directories in the file system
on ad4s1g? That could put a loop in the directory tree and mess up the
normal parent-child relationship that we rely on to avoid deadlocks.
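
One rough way to check for hard-linked directories is to walk the file
system and look for any directory inode that is reachable by more than
one path. The Python sketch below is only an illustration of that idea,
not a tested diagnostic for this system. The function name
find_multiply_linked_dirs is made up for this example, and it assumes
that staying on a single st_dev is enough to keep the walk out of
nullfs mounts and other file systems.

```python
import os
import stat


def find_multiply_linked_dirs(root):
    """Return (first_path, other_path) pairs for directories that are
    reachable through more than one path under root.

    Hypothetical helper for illustration only; it stays on the device
    that root lives on and never follows symbolic links.
    """
    seen = {}        # (st_dev, st_ino) -> first path where the directory was found
    duplicates = []
    root_dev = os.lstat(root).st_dev
    stack = [root]

    while stack:
        path = stack.pop()
        try:
            st = os.lstat(path)
        except OSError:
            continue  # entry vanished or is unreadable; skip it

        # Ignore anything that is not a directory on the starting device.
        if not stat.S_ISDIR(st.st_mode) or st.st_dev != root_dev:
            continue

        key = (st.st_dev, st.st_ino)
        if key in seen:
            # Same directory inode reached by a second path.
            duplicates.append((seen[key], path))
            continue  # do not descend again, so a loop cannot trap the walk
        seen[key] = path

        try:
            with os.scandir(path) as entries:
                for entry in entries:
                    if entry.is_dir(follow_symlinks=False):
                        stack.append(entry.path)
        except OSError:
            continue  # permission problem or similar; skip this directory

    return duplicates
```

Calling find_multiply_linked_dirs() on the mount point of ad4s1g (run as
root so every directory is readable) should return an empty list on a
healthy file system. Any pair it does return names two paths that lead
to the same directory inode, which is the kind of loop described above.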