From owner-freebsd-current@FreeBSD.ORG  Tue Jun 17 20:20:53 2003
Return-Path: <owner-freebsd-current@FreeBSD.ORG>
Delivered-To: freebsd-current@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id D6D5437B401
	for <current@FreeBSD.org>; Tue, 17 Jun 2003 20:20:53 -0700 (PDT)
Received: from gw.catspoiler.org (217-ip-163.nccn.net [209.79.217.163])
	by mx1.FreeBSD.org (Postfix) with ESMTP id D616E43FAF
	for <current@FreeBSD.org>; Tue, 17 Jun 2003 20:20:52 -0700 (PDT)
	(envelope-from truckman@FreeBSD.org)
Received: from FreeBSD.org (mousie.catspoiler.org [192.168.101.2])
	by gw.catspoiler.org (8.12.9/8.12.9) with ESMTP id h5I3KjM7053484;
	Tue, 17 Jun 2003 20:20:49 -0700 (PDT)
	(envelope-from truckman@FreeBSD.org)
Message-Id: <200306180320.h5I3KjM7053484@gw.catspoiler.org>
Date: Tue, 17 Jun 2003 20:20:45 -0700 (PDT)
From: Don Lewis <truckman@FreeBSD.org>
To: chris@Shenton.Org
In-Reply-To: <87smq8jdj7.fsf@PECTOPAH.shenton.org>
MIME-Version: 1.0
Content-Type: TEXT/plain; charset=us-ascii
cc: current@FreeBSD.org
Subject: Re: 5.1-CURRENT hangs on disk i/o? sysctl_old_user() non-sleepable
 locks
X-List-Received-Date: Wed, 18 Jun 2003 03:20:54 -0000

On 17 Jun, Chris Shenton wrote:
> Don Lewis <truckman@FreeBSD.org> writes:
> 
>> If you have another machine and a null modem cable you can redirect the
>> system console of the machine to be debugged to a serial port and run
>> some comm software on the other machine so that you can capture all the
>> output from ddb.
> 
> OK, I'll give that a shot, probably tomorrow.
> 
> 
>> At the ddb prompt, you can do a "tr" command to get a stack trace,
>> which is likely to be very helpful in pointing out the offending
>> code.
> 
> Just saw it again, did a tr.  From chicken-scratch notes, the last
> bits are:
> 
>   VOP_GETVOBJECT(...)
>   do_sendfile(...)
>   sendfile(...)
>   syscall(...)
>   Xint0x80_syscall...
>   --- syscall( 393, FreeBSD ELF32, sendfile) ...
> 
> The next time it dropped into ddb, same "sendfile" thing.

Try the very untested patch below ...

> The main services I'm running are qmail, apache, and NFS.  Also 
> tftp, rarpd, lpd, sshd, bootparamd ...  oh, well, I guess I'm running
> a bunch of stuff here. :-(  Not sure which one, if any, this would be.
> 
> Unless sendfile() is something in the OS?

It's a system call, and I believe Apache uses it.

> 
> I'll have to dig up a nullmodem and grab console output.  I realise
> I'm not giving enough detailed info to be very helpful here.

It's good enough to squash one bug.  I don't know if it will solve your
problem, though.


>> If you are running the NFS *client* code on this machine, there is one
>> lock assertion that is easy to trigger. 
> 
> In my kernel config I have this, because a diskless box uses the same
> kernel, but my /etc/fstab doesn't mount anyone else's NFS exports.

You won't trigger the lock violation in the NFS client code unless you
mount a file system from another machine using NFS and actually do some
I/O on it.

Here's the patch:

Index: uipc_syscalls.c
===================================================================
RCS file: /home/ncvs/src/sys/kern/uipc_syscalls.c,v
retrieving revision 1.150
diff -u -r1.150 uipc_syscalls.c
--- uipc_syscalls.c	12 Jun 2003 05:52:09 -0000	1.150
+++ uipc_syscalls.c	18 Jun 2003 03:14:42 -0000
@@ -1775,10 +1775,13 @@
 	 */
 	if ((error = fgetvp_read(td, uap->fd, &vp)) != 0)
 		goto done;
+	vn_lock(vp, LK_EXCLUSIVE | LK_RETRY, td);
 	if (vp->v_type != VREG || VOP_GETVOBJECT(vp, &obj) != 0) {
 		error = EINVAL;
+		VOP_UNLOCK(vp, 0, td);
 		goto done;
 	}
+	VOP_UNLOCK(vp, 0, td);
 	if ((error = fgetsock(td, uap->s, &so, NULL)) != 0)
 		goto done;
 	if (so->so_type != SOCK_STREAM) {
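
The trace above ends in VOP_GETVOBJECT, so presumably the call is being
made without the vnode lock held. The patch takes the lock around the
check and drops it again on both the error and the success path. The
userland sketch below models only that shape. It is not kernel code:
toy_vnode, toy_lock, toy_unlock, toy_getvobject and toy_check_file are
all hypothetical stand-ins, and the sketch uses a single unlock after
the check where the patch needs two because of its goto.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Toy stand-ins for the kernel types; every name here is hypothetical. */
enum toy_vtype { TOY_VREG, TOY_VDIR };

struct toy_vnode {
	enum toy_vtype type;	/* models vp->v_type */
	int locked;		/* 1 while the toy "vnode lock" is held */
	int violations;		/* calls made without the lock held */
	void *object;		/* models the backing object, may be NULL */
};

static void
toy_lock(struct toy_vnode *vp)
{
	assert(!vp->locked);
	vp->locked = 1;
}

static void
toy_unlock(struct toy_vnode *vp)
{
	assert(vp->locked);
	vp->locked = 0;
}

/* Models an operation that expects the lock; counts unlocked calls. */
static int
toy_getvobject(struct toy_vnode *vp, void **objp)
{
	if (!vp->locked)
		vp->violations++;
	if (vp->object == NULL)
		return (EINVAL);
	*objp = vp->object;
	return (0);
}

/* Same shape as the patched check: lock, test, unlock on every path. */
static int
toy_check_file(struct toy_vnode *vp, void **objp)
{
	int error = 0;

	toy_lock(vp);
	if (vp->type != TOY_VREG || toy_getvobject(vp, objp) != 0)
		error = EINVAL;
	toy_unlock(vp);
	return (error);
}
```

Whatever the result of the check, the lock is released before the
function returns, and toy_getvobject is never entered without it.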