From: Andreas Longwitz
Date: Sat, 01 Dec 2012 00:50:20 +0100
To: Andriy Gapon
Cc: freebsd-stable@FreeBSD.org, John Baldwin
Subject: Re: page fault on verbose boot

Hi,

>> ioapic1: routing intpin 15 (PCI IRQ 31) to lapic 0 vector 54
>> kernel trap 12 with interrupts disabled
> [snip]
>> db> bt
>> Tracing pid 0 tid 100000 td 0xc0a35350
>> intr_execute_handlers(0,c1020cb4,3,c1020cf8,c08e4625,...) at
>> intr_execute_handlers+0x15
>> lapic_handle_intr(36,c1020cb4) at lapic_handle_intr+0x4c
>> Xapic_isr1() at Xapic_isr1+0x35
>> --- interrupt, eip = 0xc08ee8fb, esp = 0xc1020cf4, ebp = 0xc1020cf8 ---
>> spinlock_exit(c09a1e2e,0,36,3,c1020d38,...) at spinlock_exit+0x2b
>> ioapic_assign_cpu(c4d1565c,0,0,0,c08f3d29,...) at ioapic_assign_cpu+0x2b0
>> intr_shuffle_irqs(0,101ec00,101ec00,101e000,1025000,...) at
>> intr_shuffle_irqs+0xba
>> mi_startup() at mi_startup+0xac
>> begin() at begin+0x2c
>
> Thank you for all the additional information, which proved to be very useful.
> Something struck me now that I didn't realize before: your BSP has an APIC ID of 3
> while the other CPU has an ID of zero. So, lapic3 is the BSP's lapic and lapic0 is
> the other one.
> Hence, it looks like IRQ31 (a signal on pin 15 of ioapic1) was delivered to the new
> vector of 54 (as opposed to vector 48) but to the old LAPIC/CPU 3 (as opposed to
> the newly configured lapic0).
> So it looks like the interrupt was handled at the IO-APIC in the middle of pin
> reconfiguration.
>
> Looking at the code in ioapic_program_intpin(), this does indeed seem possible:
>
> /* Write the values to the APIC. */
> intpin->io_lowreg = low;
> ioapic_write(io->io_addr, IOAPIC_REDTBL_LO(intpin->io_intpin), low);
>
> The line above reprograms the vector number AND _unmasks_ the pin (which was
> specifically masked before reprogramming in ioapic_assign_cpu).
> The lines below reprogram the destination LAPIC/CPU:
>
> value = ioapic_read(io->io_addr, IOAPIC_REDTBL_HI(intpin->io_intpin));
> value &= ~IOART_DEST;
> value |= high;
> ioapic_write(io->io_addr, IOAPIC_REDTBL_HI(intpin->io_intpin), value);
>
> So a pending interrupt would be happily delivered to the wrong destination (new
> vector + old lapic).
> I am not sure if just swapping these two blocks of lines would fix the issue, but
> I hope that it would. Could you please try that?

Yes, I did, and the first bootverbose run with your block switching patch was OK.
I will do some more extensive tests next week.

--
Andreas Longwitz
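
For reference, the ordering problem discussed in this message can be reproduced in a small user-space model. This is only a sketch under stated assumptions, not the FreeBSD code and not the patch that was tested: struct sim_pin and every sim_* and program_* name are hypothetical, and the model assumes a pending interrupt is delivered as soon as a register write leaves the pin unmasked. It uses the numbers from the report (old vector 48 on lapic 3, new vector 54 on lapic 0).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of one IO-APIC redirection entry; NOT FreeBSD code. */
#define SIM_INTMASK    0x00010000u  /* mask bit in the low word */
#define SIM_VECTOR     0x000000ffu  /* vector field in the low word */
#define SIM_DEST       0xff000000u  /* destination field in the high word */
#define SIM_DEST_SHIFT 24

struct sim_pin {
    uint32_t lo;        /* vector and mask bit */
    uint32_t hi;        /* destination APIC ID */
    int pending;        /* interrupt asserted while the pin was masked */
    int hit_vector;     /* vector of the delivered interrupt, -1 if none */
    int hit_dest;       /* APIC ID that received it, -1 if none */
};

/* Assumption: delivery happens the moment the pin is found unmasked. */
static void sim_deliver(struct sim_pin *p)
{
    if (p->pending && (p->lo & SIM_INTMASK) == 0) {
        p->hit_vector = (int)(p->lo & SIM_VECTOR);
        p->hit_dest = (int)((p->hi & SIM_DEST) >> SIM_DEST_SHIFT);
        p->pending = 0;
    }
}

static void sim_write_lo(struct sim_pin *p, uint32_t v)
{
    p->lo = v;
    sim_deliver(p);
}

static void sim_write_hi(struct sim_pin *p, uint32_t v)
{
    p->hi = (p->hi & ~SIM_DEST) | (v & SIM_DEST);
    sim_deliver(p);
}

/* Start masked on the old vector/destination with one interrupt pending. */
void sim_init(struct sim_pin *p, unsigned old_vector, unsigned old_dest)
{
    assert(old_vector <= 0xff && old_dest <= 0xff);
    p->lo = SIM_INTMASK | old_vector;
    p->hi = (uint32_t)old_dest << SIM_DEST_SHIFT;
    p->pending = 1;
    p->hit_vector = -1;
    p->hit_dest = -1;
}

/* Order described in the quoted code: low word (vector + unmask) first. */
void program_lo_first(struct sim_pin *p, unsigned vector, unsigned dest)
{
    sim_write_lo(p, vector);                            /* unmasks here */
    sim_write_hi(p, (uint32_t)dest << SIM_DEST_SHIFT);  /* too late */
}

/* Swapped order: destination first, then vector + unmask. */
void program_hi_first(struct sim_pin *p, unsigned vector, unsigned dest)
{
    sim_write_hi(p, (uint32_t)dest << SIM_DEST_SHIFT);  /* still masked */
    sim_write_lo(p, vector);                            /* unmasks here */
}
```

Calling sim_init(&pin, 48, 3) followed by program_lo_first(&pin, 54, 0) leaves the model with the interrupt delivered on vector 54 to APIC ID 3, which matches the mismatch seen in the backtrace. With program_hi_first the same pending interrupt arrives on vector 54 at APIC ID 0. The model says nothing about other races on real hardware.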