From: John Baldwin
To: harti@FreeBSD.org
cc: current@FreeBSD.org
Date: Wed, 05 Nov 2003 17:43:18 -0500 (EST)
Subject: RE: New interrupt stuff breaks ASUS 2 CPU system
In-Reply-To: <20031105110813.J72398@beagle.fokus.fraunhofer.de>
List-Id: Discussions about the use of FreeBSD-current

On 05-Nov-2003 Harti Brandt wrote:
> On Tue, 4 Nov 2003, John Baldwin wrote:
>
> JB>
> JB>On 04-Nov-2003 Harti Brandt wrote:
> JB>> On Tue, 4 Nov 2003, Harti Brandt wrote:
> JB>>
> JB>> HB>On Tue, 4 Nov 2003, John Baldwin wrote:
> JB>> HB>
> JB>> HB>JB>
> JB>> HB>JB>On 04-Nov-2003 Harti Brandt wrote:
> JB>> HB>JB>>
> JB>> HB>JB>> Hi,
> JB>> HB>JB>>
> JB>> HB>JB>> I have an
> JB>> HB>JB>> ASUS system with 2 CPUs that I need to run at HZ=10000. This
> JB>> HB>JB>> worked until yesterday, but with the new interrupt code it doesn't boot
> JB>> HB>JB>> anymore. It works for the standard HZ, but if I set HZ=1000 I get a double
> JB>> HB>JB>> fault. I suspect a race condition in the interrupt handling. My config
> JB>> HB>JB>> file has
> JB>> HB>JB>>
> JB>> HB>JB>> options SMP
> JB>> HB>JB>> device apic
> JB>> HB>JB>> options HZ=1000
> JB>> HB>JB>
> JB>> HB>JB>Ok, I can try to reproduce.
> JB>> HB>JB>
> JB>> HB>JB>> Device configuration finished.
> JB>> HB>JB>> Timecounter "TSC" frequency 1380009492 Hz quality -100
> JB>> HB>JB>> Timecounters cpuid = 0; apic id = 00
> JB>> HB>JB>> instruction pointer = 0x8:0xc048995d
> JB>> HB>JB>> stack pointer = 0x10:0xc0821bf4
> JB>> HB>JB>> frame pointer cpuid = 0; apic id = 00
> JB>> HB>JB>>
> JB>> HB>JB>> 0xc048995d is in critical_exit. It is the jmp after the popf from
> JB>> HB>JB>> cpu_critical_exit.
> JB>> HB>JB>
> JB>> HB>JB>This is where interrupts are re-enabled, so you are getting an interrupt.
> JB>> HB>JB>It might be helpful to figure out what type of fault you are actually getting.
> JB>> HB>
> JB>> HB>tf_err is 0, tf_trapno is 30 (decimal).
> JB>>
> JB>> More information:
> JB>>
> JB>> I have replaced all the reserved vectors with individual ones, that set
> JB>> tf_err to the index (vector number). It appears that the vector number is
> JB>> 39 decimal. What does that mean?
> JB>
> JB>IRQ 7.
> JB>Can you post a verbose dmesg? Also, can you try both with and without
> JB>ACPI?
>
> Attached are both dmesgs.
>
> More datapoints:
>
> I had the parallel port (irq7) and the second sio disabled in the BIOS.
> After enabling both I now get a panic in lapic_handle_intr: Couldn't get
> vector from ISR! After fetching the relevant docs from Intel I checked the
> registers of the apic pointed to by lapic. The interrupt taken is
> Xapic_irq1. isr1 is zero, but irr1 is 0x100 (that was without ACPI). How
> may that happen?
> As I understand ISR are the interrupts that have been
> delivered to the CPU so if it is interrupted a bit should be set, correct?

I think I have figured out what is happening. You are getting a spurious
interrupt from the 8259A PIC (which comes in on IRQ 7). The IRR register
lists pending interrupts still waiting to be serviced. Try using
'options NO_MIXED_MODE' to stop using the 8259A's for the clock and see if
the spurious IRQ 7 interrupts go away.

> A question while reading the code: what does the global lapic variable
> refer to? As I understand every CPU has its local APIC. Does it point to
> one of those two? To which?

Every CPU reaches its own local APIC at the same physical address. Thus,
CPU A can only get to its own local APIC, and not to any other CPU's. The
'lapic' variable holds a virtual address mapped to the physical address of
the local APIC.

-- 
John Baldwin <>< http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve!" - http://www.FreeBSD.org/
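
The vector arithmetic in this thread (vector 39 being IRQ 7, and what a value
of 0x100 in irr1 refers to) can be sketched as below. This is only an
illustration, not kernel code. It assumes that the ISA IRQs start at IDT
vector 32, which is consistent with "39 decimal" mapping to IRQ 7, and that
irr1 is the second 32-bit word of the local APIC's IRR, covering vectors 32
through 63. The function and constant names are hypothetical.

```python
# Sketch only: these constants are assumptions, not taken from the kernel source.
ISA_VECTOR_BASE = 32      # assumed IDT vector of ISA IRQ 0 (vectors 0-31 are CPU exceptions)
NUM_ISA_IRQS = 16         # two cascaded 8259A PICs provide IRQ 0 through 15
BITS_PER_APIC_WORD = 32   # each ISR/IRR word of the local APIC covers 32 vectors


def vector_to_isa_irq(vector):
    """Map an IDT vector to an ISA IRQ number, or None if it is not an ISA IRQ."""
    irq = vector - ISA_VECTOR_BASE
    return irq if 0 <= irq < NUM_ISA_IRQS else None


def pending_vectors(word_index, word_value):
    """List the vectors whose bits are set in one 32-bit ISR or IRR word."""
    return [
        word_index * BITS_PER_APIC_WORD + bit
        for bit in range(BITS_PER_APIC_WORD)
        if word_value & (1 << bit)
    ]


if __name__ == "__main__":
    print(vector_to_isa_irq(39))       # vector reported in tf_err
    print(pending_vectors(1, 0x100))   # irr1 = 0x100
    print(pending_vectors(1, 0))       # isr1 = 0
```

Under those assumptions, vector 39 maps to IRQ 7, irr1 = 0x100 marks a single
vector as pending, and isr1 = 0 yields an empty list, which is the condition
behind the "Couldn't get vector from ISR!" panic described above.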