From: Chris Shenton
To: Don Lewis
Cc: current@FreeBSD.org
Subject: Re: 5.1-CURRENT hangs on disk i/o? sysctl_old_user() non-sleepable locks
Date: 17 Jun 2003 23:00:12 -0400
Message-ID: <87smq8jdj7.fsf@PECTOPAH.shenton.org>
In-Reply-To: <200306180233.h5I2WxM7053350@gw.catspoiler.org>
References: <200306180233.h5I2WxM7053350@gw.catspoiler.org>
List-Id: Discussions about the use of FreeBSD-current

Don Lewis writes:

> If you have another machine and a null modem cable you can redirect the
> system console of the machine to be debugged to a serial port and run
> some comm software on the other machine so that you can capture all the
> output from ddb.

OK, I'll give that a shot, probably tomorrow.

> At the ddb prompt, you can do a "tr" command to get a stack trace,
> which is likely to be very helpful in pointing out the offending
> code.

Just saw it again, did a tr.  From chicken-scratch notes, the last
bits are:

  VOP_GETVOBJECT(...)
  do_sendfile(...)
  sendfile(...)
  syscall(...)
  Xint0x80_syscall...
  --- syscall (393, FreeBSD ELF32, sendfile) ...

The next time it dropped into ddb, same "sendfile" thing.

The main services I'm running are qmail, apache, and NFS.  Also tftp,
rarpd, lpd, sshd, bootparamd ... oh, well, I guess I'm running a bunch
of stuff here. :-(  Not sure which one, if any, this would be.  Unless
sendfile() is something in the OS?

I'll have to dig up a null modem cable and grab console output.  I
realise I'm not giving enough detailed info to be very helpful here.

> If you are running the NFS *client* code on this machine, there is one
> lock assertion that is easy to trigger.

In my kernel config I have this, because a diskless box uses the same
kernel, but my /etc/fstab doesn't mount anyone else's NFS exports.

  options         NFSCLIENT               #Network Filesystem Client

  chris@PECTOPAH<101> ps -axww|grep nfs
     42  ??  IL     0:00.00  (nfsiod 0)
     43  ??  IL     0:00.00  (nfsiod 1)
     44  ??  IL     0:00.00  (nfsiod 2)
     45  ??  IL     0:00.00  (nfsiod 3)
    428  ??  Is     0:00.03 nfsd: master (nfsd)
    429  ??  I      0:00.09 nfsd: server (nfsd)
    430  ??  I      0:00.00 nfsd: server (nfsd)
    431  ??  I      0:00.00 nfsd: server (nfsd)
    432  ??  I      0:00.00 nfsd: server (nfsd)
  35366  p0  R+     0:00.00 grep nfs

> At the ddb prompt you should be able to use the write command tweak a
> couple of variables to modify this behavior.  If you set the
> vfs_badlock_panic variable to zero, the kernel will no longer drop into
> DDB when one of these lock violations occurs.  If you set the
> vfs_badlock_print variable to zero, the kernel will stop printing the
> warnings.

OK, I've done an

  examine vfs_badlock_panic

which shows it zero, then

  write vfs_badlock_panic 0

at least for now.

Thanks again.
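
For the archive, a rough sketch of the ddb steps discussed in this
message, typed at the ddb prompt.  It assumes the standard ddb(4)
commands "trace" (the "tr" above), "examine", "write" and "continue";
the "db>" prompt is shown only for orientation, and no output is
included because none was captured here.  The last "write" is optional
and only silences the warnings, as Don describes.

  db> trace
  db> examine vfs_badlock_panic
  db> write vfs_badlock_panic 0
  db> write vfs_badlock_print 0
  db> continue

Values set this way are a change to the running kernel only, so they
would presumably need to be set again after a reboot.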