From: Stephan Uphoff
To: "i386@freebsd.org", "current@freebsd.org"
Date: Mon, 15 Nov 2004 16:27:00 -0500
Subject: patch for "Previous IPI is stuck" - please test
Message-Id: <1100554020.90130.241.camel@palm.tree.com>

There have been several complaints about "Previous IPI is stuck" panics on i386-based multiprocessor systems. In general, all affected systems seem to have four or more real (non-HTT) processors.

Probable cause: The local APIC used for IPIs can only queue two interrupts per interrupt priority class (interrupt vector / 16).
Since all IPIs share the same interrupt priority class, more than two IPIs pending to the same processor will fill the interrupt FIFO for the IPI priority class. I believe the "Previous IPI is stuck" panic is a deadlock: one CPU sends an AST IPI while holding the sched lock, and the target CPU is trying to acquire the sched lock with interrupts disabled and a full interrupt FIFO.

Unfortunately I cannot reproduce the problem on my dual Xeon with HTT enabled :-(

To test the theory I wrote a patch that replaces the multiple IPI interrupt handlers with a single handler and uses a bitmap to avoid sending redundant IPI interrupt requests to the interrupt FIFO. The patch is a proof of concept and therefore not optimized (to put it mildly ;-). It probably increases the cost of TLB shootdown IPIs substantially.

You can download the patch at:
http://people.freebsd.org/~ups/ipi4_patch

Please make sure that exception.o is rebuilt. (The Makefile seems to be missing the dependency on apic_vector.s.)

Any feedback is appreciated.

Stephan
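
For anyone who wants a picture of the bitmap idea before reading the patch, here is a minimal user-space sketch in C. It is not taken from ipi4_patch. The names (ipi_post, ipi_handler, ipi_pending, the enum values, MAX_CPUS) are all hypothetical, and the counter hw_ipis_sent merely stands in for the real write to the APIC. The assumption is that each CPU has one pending bitmap with one bit per IPI type, and that a sender requests a hardware interrupt only when the bitmap was empty before it set its bit.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define MAX_CPUS 8   /* hypothetical limit for this sketch */

/* Hypothetical IPI types; each one owns a bit in the per-CPU bitmap. */
enum ipi_type { IPI_AST = 0, IPI_RENDEZVOUS = 1, IPI_INVLTLB = 2, IPI_TYPES = 3 };

static _Atomic uint32_t ipi_pending[MAX_CPUS];  /* pending IPI types per CPU */
static unsigned hw_ipis_sent[MAX_CPUS];         /* stand-in for the APIC write */
static unsigned handled[MAX_CPUS][IPI_TYPES];   /* per-type work done so far */

/*
 * Post an IPI of the given type to a CPU.
 * Returns 1 if a hardware interrupt was requested, 0 if it was coalesced
 * into a request that the target has not drained yet.
 */
static int ipi_post(unsigned cpu, enum ipi_type type)
{
    assert(cpu < MAX_CPUS);
    uint32_t old = atomic_fetch_or(&ipi_pending[cpu], UINT32_C(1) << type);
    if (old != 0)
        return 0;        /* an earlier sender already requested the interrupt */
    hw_ipis_sent[cpu]++; /* a real kernel would send the single IPI vector here */
    return 1;
}

/*
 * The single handler: atomically take all pending bits and dispatch each one.
 * Returns the bitmap that was drained.
 */
static uint32_t ipi_handler(unsigned cpu)
{
    assert(cpu < MAX_CPUS);
    uint32_t work = atomic_exchange(&ipi_pending[cpu], 0);
    for (int t = 0; t < IPI_TYPES; t++)
        if (work & (UINT32_C(1) << t))
            handled[cpu][t]++;   /* placeholder for the per-type handler */
    return work;
}
```

In this sketch, three posts to the same CPU before the handler runs produce only one interrupt request, so senders no longer pile several requests into the same priority class. The extra atomic operation and the dispatch loop in the handler are also where the added cost for TLB shootdowns would show up.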