From owner-freebsd-stable@FreeBSD.ORG  Fri Sep 26 01:29:00 2003
Return-Path: <owner-freebsd-stable@FreeBSD.ORG>
Delivered-To: freebsd-stable@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 1A82D16A4B3
	for <stable@freebsd.org>; Fri, 26 Sep 2003 01:29:00 -0700 (PDT)
Received: from salmon.maths.tcd.ie (salmon.maths.tcd.ie [134.226.81.11])
	by mx1.FreeBSD.org (Postfix) with SMTP id EE31A44022
	for <stable@freebsd.org>; Fri, 26 Sep 2003 01:28:57 -0700 (PDT)
	(envelope-from iedowse@maths.tcd.ie)
Received: from walton.maths.tcd.ie by salmon.maths.tcd.ie with SMTP
          id <aa56539@salmon>; 26 Sep 2003 09:28:57 +0100 (BST)
To: Brandon Fosdick <bfoz@terrandev.com>
In-Reply-To: Your message of "Thu, 25 Sep 2003 20:37:11 PDT."
             <3F73B467.5080500@terrandev.com> 
Date: Fri, 26 Sep 2003 09:28:56 +0100
From: Ian Dowse <iedowse@maths.tcd.ie>
Message-ID: <200309260928.aa56539@salmon.maths.tcd.ie>
cc: stable@freebsd.org
cc: Andrew Atrens <atrens@nortelnetworks.com>
Subject: Re: fix/workaround for usb probe lockups on nForce2 mbs 
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Production branch of FreeBSD source code
	<freebsd-stable.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-stable>,
	<mailto:freebsd-stable-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-stable>
List-Post: <mailto:freebsd-stable@freebsd.org>
List-Help: <mailto:freebsd-stable-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-stable>,
	<mailto:freebsd-stable-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 26 Sep 2003 08:29:00 -0000

In message <3F73B467.5080500@terrandev.com>, Brandon Fosdick writes:
>Ian Dowse wrote:
>> Great, thanks for tracking it down! It sounds very similar to a
>> problem people were having before with FAST_IPSEC where interrupts
>> were being enabled and confusing the USB code. You have a "uhci"
>> rather than an "ohci" controller I assume? If so, could you try
>> the following patch instead? Unfortunately I don't have anything
>> to test this on, but in theory it should work around this class
>> of problems by backporting from -CURRENT some logic for avoiding
>> interrupts in polling mode.
>
>My A7N8X-Deluxe has an ohci, not a uhci. I haven't tried either patch 
>yet, but I will in a minute.
>

In that case it won't help, sorry. I guess that means it is unlikely
to help on uhci controllers either. The hang is probably an interrupt
storm, not the driver getting stuck in a loop.

So the more important question is what could be lowering the cpl
during boot, allowing interrupts to be delivered? If someone
has the time to experiment, I'd like to see what happens if you
change sys/kern/subr_bus.c to add the following line just before
the `return error;' in device_probe_and_attach():

	device_printf(dev, "error %d, cpl %08x\n", error, cpl);

Actually, here's one more idea. Try the following patch on its own
to see if the hangs go away. It looks like kthread_create can
accidentally lower the cpl via fork1(), which calls spl0() after
splhigh() instead of restoring the previous level.
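
To illustrate the suspected problem, here is a rough userland sketch,
not kernel code. Every sim_* name is a hypothetical stand-in invented
for this example; the real splhigh(), spl0() and splx() work on the
kernel's own interrupt mask. It only models the difference between
dropping to level 0 unconditionally and restoring the saved level.

```c
#include <assert.h>

/* Hypothetical stand-in for the kernel interrupt mask; NOT the real cpl. */
typedef unsigned int sim_intrmask_t;
#define SIM_ALL_BLOCKED 0xffffffffu

sim_intrmask_t sim_cpl = 0;

/* Block everything and return the previous level, like splhigh(). */
sim_intrmask_t sim_splhigh(void)
{
	sim_intrmask_t old = sim_cpl;

	sim_cpl = SIM_ALL_BLOCKED;
	return (old);
}

/* Unblock everything regardless of the previous level, like spl0(). */
void sim_spl0(void)
{
	sim_cpl = 0;
}

/* Restore a level saved earlier, like splx(). */
void sim_splx(sim_intrmask_t s)
{
	sim_cpl = s;
}

/* Models the unpatched code path: the caller's level is thrown away. */
void sim_fork_old(void)
{
	(void)sim_splhigh();
	/* ... put the new process on the run queue ... */
	sim_spl0();
}

/* Models the patched code path: the caller's level is preserved. */
void sim_fork_patched(void)
{
	sim_intrmask_t s = sim_splhigh();

	/* ... put the new process on the run queue ... */
	sim_splx(s);
}

/*
 * Pretend the caller is running with interrupts blocked (as assumed
 * for a boot-time probe), call the given fork variant, and report
 * the level the caller is left with afterwards.
 */
sim_intrmask_t sim_level_after(void (*forkfn)(void))
{
	sim_cpl = SIM_ALL_BLOCKED;
	forkfn();
	return (sim_cpl);
}

/* Self-check: the old path leaks a lowered level, the patched one does not. */
void sim_demo(void)
{
	assert(sim_level_after(sim_fork_old) == 0);
	assert(sim_level_after(sim_fork_patched) == SIM_ALL_BLOCKED);
}
```

In this model the old variant leaves the caller at level 0 even though
it entered with everything blocked, while the patched variant leaves
the caller's level unchanged.
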

Ian

Index: kern_fork.c
===================================================================
RCS file: /home/iedowse/CVS/src/sys/kern/kern_fork.c,v
retrieving revision 1.72.2.14
diff -u -r1.72.2.14 kern_fork.c
--- kern_fork.c	26 Jun 2003 04:15:10 -0000	1.72.2.14
+++ kern_fork.c	26 Sep 2003 08:26:31 -0000
@@ -183,7 +183,7 @@
 	struct proc *p2, *pptr;
 	uid_t uid;
 	struct proc *newproc;
-	int ok;
+	int ok, s;
 	static int curfail = 0, pidchecked = 0;
 	static struct timeval lastfail;
 	struct forklist *ep;
@@ -544,10 +544,10 @@
 	 */
 	microtime(&(p2->p_stats->p_start));
 	p2->p_acflag = AFORK;
-	(void) splhigh();
+	s = splhigh();
 	p2->p_stat = SRUN;
 	setrunqueue(p2);
-	(void) spl0();
+	splx(s);
 
 	/*
 	 * Now can be swapped.