From: John Baldwin
To: freebsd-current@FreeBSD.org
Date: Wed, 23 Jun 2004 10:40:21 -0400
Message-Id: <200406231040.21032.jhb@FreeBSD.org>
cc: Gerrit Nagelhout
cc: Julian Elischer
cc: kris@FreeBSD.org
Subject: Re: STI, HLT in acpi_cpu_idle_c1

On Tuesday 22 June 2004 09:01 pm, Gerrit Nagelhout wrote:
> Thanks for the detailed info on this.  It looks like CPU1 is trying
> to service the interrupt because PPR = 0xf0, and TPR = 0x00.  It is
> also the only CPU that has a bit set in ISR.
> In this case, CPU 3 was initiating the IPI (although I don't know
> why its icr_lo is 0xc00f6, because I was expecting it to be 0xc00f3,
> as it was in previous lockups).  I still have no idea why CPU1 is
> not handling this interrupt though.  I am still getting used to this
> emulator, but I think the values I am reading are believable:
>
> P3>dumpAllLocalApic
> CPU 0
> ID: 0x6000000
> TPR: 0x0
> PPR: 0x0
> icr_lo:0xf3
> ISR0: 0x0
> ISR1: 0x0
> ISR2: 0x0
> ISR3: 0x0
> ISR4: 0x0
> ISR5: 0x0
> ISR6: 0x0
> ISR7: 0x0
> CPU 1
> ID: 0x7000000
> TPR: 0x0
> PPR: 0xf0
> icr_lo:0xf3
> ISR0: 0x0
> ISR1: 0x0
> ISR2: 0x0
> ISR3: 0x0
> ISR4: 0x0
> ISR5: 0x0
> ISR6: 0x0
> ISR7: 0x80000

Bit 19 of ISR7 is set, so the vector is 224 + 19 = 243.

#define APIC_LOCAL_INTS 240
#define APIC_IPI_INTS   (APIC_LOCAL_INTS + 3)
#define IPI_AST         APIC_IPI_INTS           /* Generate software trap. */

So it's an IPI_AST, which is EOI'd before we do anything:

IDTVEC(cpuast)
        PUSH_FRAME
        movl    $KDSEL, %eax
        movl    %eax, %ds               /* use KERNEL data segment */
        movl    %eax, %es
        movl    $KPSEL, %eax
        movl    %eax, %fs

        movl    lapic, %edx
        movl    $0, LA_EOI(%edx)        /* End Of Interrupt to APIC */

        FAKE_MCOUNT(TF_EIP(%esp))

        MEXITCOUNT
        jmp     doreti

Hmm, nothing in the kernel sends an IPI to all-but-self with IPI_AST.
The MI code only does that with IPI_RENDEZVOUS.

> CPU 2
> ID: 0x0
> TPR: 0x0
> PPR: 0x0
> icr_lo:0xfb
> ISR0: 0x0
> ISR1: 0x0
> ISR2: 0x0
> ISR3: 0x0
> ISR4: 0x0
> ISR5: 0x0
> ISR6: 0x0
> ISR7: 0x0
> CPU 3
> ID: 0x1000000
> TPR: 0x0
> PPR: 0x0
> icr_lo:0xc00f6

0xf6 is vector 246:

#define IPI_INVLRNG     (APIC_IPI_INTS + 3)

That is an IPI that is sent via all_but_self.  *sigh*  And the TLB
shootdown code does sit and spin in a loop with interrupts disabled
after sending the IPI.  Hmm, I do see one possible bug.  It's only safe
to spin like that if the same lock protects all such spin cases.  For
the lazypmap stuff a different lock is used.  You can try this patch to
see if it helps any.  Kris Kennaway, you might want to try this too, on
the box with the lazyfix timeouts.
Index: pmap.c
===================================================================
RCS file: /usr/cvs/src/sys/i386/i386/pmap.c,v
retrieving revision 1.473
diff -u -r1.473 pmap.c
--- pmap.c	17 Jun 2004 06:16:57 -0000	1.473
+++ pmap.c	23 Jun 2004 14:39:32 -0000
@@ -1292,7 +1296,7 @@
 	while ((mask = pmap->pm_active) != 0) {
 		spins = 50000000;
 		mask = mask & -mask;	/* Find least significant set bit */
-		mtx_lock_spin(&lazypmap_lock);
+		mtx_lock_spin(&smp_tlb_mtx);
 #ifdef PAE
 		lazyptd = vtophys(pmap->pm_pdpt);
 #else
@@ -1312,7 +1316,7 @@
 					break;
 			}
 		}
-		mtx_unlock_spin(&lazypmap_lock);
+		mtx_unlock_spin(&smp_tlb_mtx);
 		if (spins == 0)
 			printf("pmap_lazyfix: spun for 50000000\n");
 	}

-- 
John Baldwin  <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve"  =  http://www.FreeBSD.org
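
For anyone who wants to re-check the register arithmetic in this message against their own dumps, here is a rough sketch of the decoding in C. The helper names are made up for illustration and are not from the FreeBSD tree. The bit layout assumed is the usual local APIC one: eight 32-bit ISR words covering vectors 0-255, the vector in bits 0-7 of the ICR low word, the destination shorthand in bits 18-19, and PPR taking the higher of the TPR and in-service priority classes.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helpers for reading a local APIC dump by hand.
 * None of these names exist in the FreeBSD sources.
 */

/* ISRn covers vectors n*32 .. n*32+31, so ISR7 starts at vector 224. */
static unsigned
isr_bit_to_vector(unsigned isr_index, unsigned bit)
{
	assert(isr_index < 8 && bit < 32);
	return (isr_index * 32 + bit);
}

/* Highest in-service vector in one ISR word, or -1 if the word is zero. */
static int
isr_highest_vector(unsigned isr_index, uint32_t word)
{
	int bit;

	assert(isr_index < 8);
	for (bit = 31; bit >= 0; bit--)
		if (word & (UINT32_C(1) << bit))
			return ((int)isr_bit_to_vector(isr_index, (unsigned)bit));
	return (-1);
}

/* Vector field: bits 0-7 of the ICR low word. */
static unsigned
icr_vector(uint32_t icr_lo)
{
	return (icr_lo & 0xff);
}

/*
 * Destination shorthand: bits 18-19 of the ICR low word.
 * 0 = none, 1 = self, 2 = all including self, 3 = all excluding self.
 */
static unsigned
icr_dest_shorthand(uint32_t icr_lo)
{
	return ((icr_lo >> 18) & 0x3);
}

/*
 * PPR as derived from TPR and the highest in-service vector: if the TPR
 * priority class is at least the in-service class, PPR mirrors TPR;
 * otherwise PPR is the in-service class with a zero sub-class.
 */
static unsigned
apic_ppr(unsigned tpr, unsigned in_service_vector)
{
	unsigned tpr_class = tpr & 0xf0;
	unsigned isr_class = in_service_vector & 0xf0;

	if (tpr_class >= isr_class)
		return (tpr & 0xff);
	return (isr_class);
}
```

Fed the values from the dump, isr_highest_vector(7, 0x80000) gives 243 (IPI_AST), icr_vector(0xc00f6) gives 246 (IPI_INVLRNG) with a destination shorthand of 3 (all excluding self), and apic_ppr(0x0, 243) gives 0xf0, which matches what CPU 1 reports.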