From owner-freebsd-stable@FreeBSD.ORG Sat Sep 27 06:05:53 2003
Delivered-To: freebsd-stable@freebsd.org
To: freebsd-stable@freebsd.org
Date: Sat, 27 Sep 2003 14:05:51 +0100
From: Ian Dowse
Message-ID: <200309271405.aa30238@salmon.maths.tcd.ie>
Subject: Patch for boot-time USB hangs in 4.9-PRERELEASE
List-Id: Production branch of FreeBSD source code

Could people who are experiencing boot-time hangs in 4.9-PRERELEASE
try the following patch to see if it helps? I've had one positive
report so far, but it would be helpful to get more feedback to
determine if this is the right fix to be committed.

The problem was that interrupts were getting unmasked too early in
the boot process, causing an interrupt storm that usually occurred
while USB devices were being probed. The bug is in fork1(), which
is used by kthread_create() to create kernel threads; the code there
assumed that all interrupts would be unmasked when called, so it
didn't bother saving and restoring the interrupt mask. This assumption
is reasonable for normal fork() calls, but not for the creation of
kernel threads early in the boot process.

The appearance of the problem was linked to some recent changes in
sys/kern/subr_taskqueue.c around September 10th.
Those changes include a call to kthread_create() that happens just
before probing all the devices at boot time, so the result was that
interrupts were being left unmasked at a time when they are supposed
to be disabled. To actually get an interrupt storm hang, you need
some IRQ number to be configured by a device driver, and then for
some (possibly other) device sharing that IRQ line to generate an
interrupt before the driver for that device is prepared to handle
it. As well as the subr_taskqueue.c case, there is also a call to
kthread_create() in the USB code, so that might be related too.

Apply the patch in /usr/src/sys/kern, then rebuild the kernel and
reboot. In case whitespace problems prevent the patch from applying,
it is also available at:

  http://people.freebsd.org/~iedowse/fork.diff

Ian

Index: kern_fork.c
===================================================================
RCS file: /home/iedowse/CVS/src/sys/kern/kern_fork.c,v
retrieving revision 1.72.2.14
diff -u -r1.72.2.14 kern_fork.c
--- kern_fork.c	26 Jun 2003 04:15:10 -0000	1.72.2.14
+++ kern_fork.c	26 Sep 2003 08:26:31 -0000
@@ -183,7 +183,7 @@
 	struct proc *p2, *pptr;
 	uid_t uid;
 	struct proc *newproc;
-	int ok;
+	int ok, s;
 	static int curfail = 0, pidchecked = 0;
 	static struct timeval lastfail;
 	struct forklist *ep;
@@ -544,10 +544,10 @@
 	 */
 	microtime(&(p2->p_stats->p_start));
 	p2->p_acflag = AFORK;
-	(void) splhigh();
+	s = splhigh();
 	p2->p_stat = SRUN;
 	setrunqueue(p2);
-	(void) spl0();
+	splx(s);
 
 	/*
 	 * Now can be swapped.
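
For anyone who wants to see the effect of the change without booting
a test kernel, the following is a rough userland model of the two
patterns in the diff above. It is only a sketch and not kernel code:
the cpl variable, the model_* functions, the enqueue_* and level_after
helpers and the numeric levels are all made up for illustration, and
the real spl implementation works on interrupt masks rather than a
single integer.

```c
#include <assert.h>

/*
 * Hypothetical model: cpl stands in for the current interrupt
 * priority level. SPL_NONE means nothing is masked, SPL_HIGH means
 * everything is masked. These values are illustrative only.
 */
enum { SPL_NONE = 0, SPL_HIGH = 7 };

int cpl = SPL_NONE;

/* Raise to the highest level and return the previous level. */
int
model_splhigh(void)
{
	int old = cpl;

	cpl = SPL_HIGH;
	return (old);
}

/* Unconditionally unmask everything. */
void
model_spl0(void)
{
	cpl = SPL_NONE;
}

/* Restore a level previously returned by model_splhigh(). */
void
model_splx(int s)
{
	assert(s >= SPL_NONE && s <= SPL_HIGH);
	cpl = s;
}

/* Old pattern: the saved level is thrown away, exit always unmasks. */
void
enqueue_old(void)
{
	(void) model_splhigh();
	/* ... the run queue would be modified here ... */
	model_spl0();
}

/* Patched pattern: the caller's level is saved and put back. */
void
enqueue_new(void)
{
	int s;

	s = model_splhigh();
	/* ... the run queue would be modified here ... */
	model_splx(s);
}

/* Enter fn at entry_level and report the level it leaves behind. */
int
level_after(void (*fn)(void), int entry_level)
{
	cpl = entry_level;
	fn();
	return (cpl);
}
```

In this model, level_after(enqueue_old, SPL_HIGH) comes back as
SPL_NONE, which corresponds to interrupts being left unmasked during
device probing, while level_after(enqueue_new, SPL_HIGH) comes back as
SPL_HIGH. When entered at SPL_NONE, as for a normal fork(), both
variants leave the level at SPL_NONE, which fits with the old code
having been harmless for ordinary fork() calls.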