From owner-freebsd-current@FreeBSD.ORG Tue Jun 17 18:35:19 2003
Delivered-To: freebsd-current@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 2A9D437B401;
	Tue, 17 Jun 2003 18:35:19 -0700 (PDT)
Received: from Shenton.org (23.ebbed1.client.atlantech.net [209.190.235.35])
	by mx1.FreeBSD.org (Postfix) with SMTP id E95F443FB1;
	Tue, 17 Jun 2003 18:35:17 -0700 (PDT)
	(envelope-from chris@mail.hq.nasa.gov)
Received: (qmail 11501 invoked by uid 1000); 18 Jun 2003 01:34:47 -0000
To: Don Lewis
References: <200306162057.h5GKvCM7049856@gw.catspoiler.org>
From: Chris Shenton
Date: 17 Jun 2003 21:34:47 -0400
In-Reply-To: <200306162057.h5GKvCM7049856@gw.catspoiler.org>
Message-ID: <8765n4b22w.fsf@PECTOPAH.shenton.org>
Lines: 56
User-Agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.3
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
cc: current@FreeBSD.org
Subject: Re: 5.1-CURRENT hangs on disk i/o? sysctl_old_user() non-sleepable locks
X-BeenThere: freebsd-current@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussions about the use of FreeBSD-current
X-List-Received-Date: Wed, 18 Jun 2003 01:35:19 -0000

Don Lewis writes:

> I doubt it.  I checked in a fix for this problem today so you should get
> the fix when you next cvsup.

Yup, many thanks.

> Can you break into ddb and do a ps to find out what state all the
> processes are in?

I'm a newbie to ddb.  Was able to get a ps from a hung system but
didn't know how to capture it to send to you.  Any hints?

> You might want to try adding the DEBUG_VFS_LOCKS options to your
> kernel config to see if that turns up anything.

Oh, man, I'm getting killed here now.  Rebuilt the kernel with that
option (not found in GENERIC or other examples in
/usr/src/sys/i386/conf/).
Now the system is dropping into ddb every minute or so with complaints
like the following on the screen, and in /var/log/messages:

  Jun 17 21:06:08 PECTOPAH kernel: VOP_GETVOBJECT: 0xc584eb68 is not locked but should be
  Jun 17 21:08:04 PECTOPAH last message repeated 3 times
  ...
  Jun 17 21:18:55 PECTOPAH kernel: VOP_GETVOBJECT: 0xc59346d8 is not locked but should be
  Jun 17 21:18:59 PECTOPAH last message repeated 5 times

Lots 'n' lots of 'em, with a few of the same hex value then another
set for a different hex value.

> There is also ddb command to list the locked vnodes "show
> lockedvnods".

After I type "cont" at ddb a few times the system runs for a while
again, only to repeat.  When it drops to ddb again that show command
doesn't list anything.

I may have to remove that option from my kernel just to get it to run
a bit, even tho eventually the system will hang.  It's (of course) my
main box which the other systems NFS off, mail server, etc. :-(

> Are you using nullfs or unionfs which are a bit fragile?

Nope.

I'd be happy to mail you my kernel config if you want.  I've posted it
to http://chris.shenton.org/PECTOPAH but if the system's hung again,
naturally it won't be available :-(

Thanks for your help.  Any other things I might try?

Dunno if this matters, but I'm using a DELL CERC ATA RAID card with
disks showing up as amrd*.  Was flawless at 5.0-{CURRENT,RELEASE}.
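
For anyone wanting to summarize the "is not locked but should be" complaints quoted above, here is a minimal sketch in Python that tallies them per vnode address, folding in syslogd's "last message repeated N times" lines. The function name, the regular expressions, and the sample text are illustrative assumptions based only on the log excerpt in this message; real /var/log/messages output may differ.

```python
import re
from collections import Counter

# Assumed message shape, taken from the excerpt in this mail.
VNODE_RE = re.compile(
    r"kernel: (VOP_\w+): (0x[0-9a-f]+) is not locked but should be"
)
# syslogd's repeat-compression line.
REPEAT_RE = re.compile(r"last message repeated (\d+) times")


def tally_unlocked_vnodes(lines):
    """Count complaints per (VOP name, vnode address).

    A "last message repeated N times" line is credited to the complaint
    seen immediately before it; any other line resets that association.
    """
    counts = Counter()
    last_key = None
    for line in lines:
        match = VNODE_RE.search(line)
        if match:
            last_key = (match.group(1), match.group(2))
            counts[last_key] += 1
            continue
        repeat = REPEAT_RE.search(line)
        if repeat and last_key is not None:
            counts[last_key] += int(repeat.group(1))
        else:
            last_key = None
    return counts


# Sample input copied from the log lines quoted in this message.
SAMPLE = """\
Jun 17 21:06:08 PECTOPAH kernel: VOP_GETVOBJECT: 0xc584eb68 is not locked but should be
Jun 17 21:08:04 PECTOPAH last message repeated 3 times
Jun 17 21:18:55 PECTOPAH kernel: VOP_GETVOBJECT: 0xc59346d8 is not locked but should be
Jun 17 21:18:59 PECTOPAH last message repeated 5 times
"""

if __name__ == "__main__":
    for (vop, addr), n in tally_unlocked_vnodes(SAMPLE.splitlines()).most_common():
        print(f"{vop} {addr}: {n}")
```

To run it against a real log, pass an open file instead of the sample, e.g. `tally_unlocked_vnodes(open("/var/log/messages"))`.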