Date: Wed, 24 Nov 2004 00:11:22 -0500
From: Stephan Uphoff <ups@tree.com>
To: Doug White <dwhite@gumbysoft.com>
Cc: stable@freebsd.org
Subject: Re: panic: APIC: Previous IPI is stuck
Message-ID: <1101273082.48967.57.camel@palm.tree.com>
In-Reply-To: <20041123191634.K90740@carver.gumbysoft.com>
References: <20041115045912.A79200@titus.hanley.stade.co.uk> <20041123191634.K90740@carver.gumbysoft.com>

On Tue, 2004-11-23 at 22:19, Doug White wrote:
> On Mon, 15 Nov 2004, Adrian Wontroba wrote:
>
> > At work, I've just taken an old cast off NT server and used it as
> > a replacement for an equally elderly low end PC which performs an
> > important monitoring task.
> >
> > I took the opportunity to upgrade to 5.3 (5.3-RC2 now, yesterday's
> > 5.3-STABLE when I get to work again) rather than stay on 4.10-RELEASE.
> >
> > The rationale was this would be a nice resilient machine, demonstrating
> > how FreeBSD can extend the useful working life of aging hardware.
> >
> > The practice is that it has now crashed three times in a couple of
> > days with "panic: APIC: Previous IPI is stuck", the most recent one
> > dragging me out from home early on a Monday morning.
>
> Welcome to the club. This is a known problem which affects older, true 4
> proc machines. Stephan Uphoff (ups@tree.com) has posted a patch to
> -current that seems to help. I have a Dell PE6500 (4x500MHz) I'm trying to
> get to duplicate the problem (and compile world without resetting) before
> I try the patch. (Replacing a CPU has made it happy again, thankfully)
>
> Dual proc hyperthreaded machines don't seem to be affected, or at least
> not as frequently.
>
> I'd suggest trying the patch and see if that helps for you. It doesn't
> seem to be making things worse for people :)

The patch has a few testers and no "APIC: Previous IPI is stuck" panics
have been reported.
Hopefully I will be able to get a new, optimized patch out in the next
few days.
Once the new patch has had some testing it will go into current.
(And hopefully I can MFC it later)

> > Over in current there are a couple of threads starting in late September
> > where a few people are suffering this problem. Like them, I'm using an
> > old (1997) Pentium Pro multiprocessor, in my case a 4 way Fujitsu M700.
> >
> > The machine is running with the SMP kernel (ie GENERIC + SMP), 4BSD
> > scheduler, without preemption.
> >
> > I've set kern.sched.ipiwakeup.enabled=0 and crossed my fingers.
> >
> > I'm a SMP novice. Would the machine become stable if I switched to a
> > non-SMP kernel? Reliability is more important than speed in this case,
> > and the opportunity for experimentation close to zero. Creditability
> > has already been damaged by the gvinum RAID5 experience (8-(
> >
> > I'm not knocking 5.3 - in all other respects it seems wonderful.
> >
> > "me too" diagnostics:
> >
> > kern.sched.name: 4BSD
> > kern.sched.quantum: 100000
> > kern.sched.ipiwakeup.enabled: 1
> > kern.sched.ipiwakeup.requested: 858129
> > kern.sched.ipiwakeup.delivered: 858129
> > kern.sched.ipiwakeup.usemask: 1
> > kern.sched.ipiwakeup.useloop: 0
> > kern.sched.ipiwakeup.onecpu: 0
> > kern.sched.ipiwakeup.htt2: 0
> > kern.sched.followon: 0
> > kern.sched.pfollowons: 0
> > kern.sched.kgfollowons: 0
> > kern.sched.runq_fuzz: 1
> >
> > ============================================================================
> >
> > MPTable, version 2.0.15
> >
> >  looking for EBDA pointer @ 0x040e, found, searching EBDA @ 0x0008f000
> >  searching CMOS 'top of mem' @ 0x0008ec00 (571K)
> >  searching default 'top of mem' @ 0x0009fc00 (639K)
> >  searching BIOS @ 0x000f0000
> >
> >  MP FPS found in BIOS @ physical addr: 0x000fdc30
> >
> > ----------------------------------------------------------------------------
> >
> > MP Floating Pointer Structure:
> >
> >   location:                     BIOS
> >   physical address:             0x000fdc30
> >   signature:                    '_MP_'
> >   length:                       16 bytes
> >   version:                      1.4
> >   checksum:                     0x56
> >   mode:                         Virtual Wire
> >
> > ----------------------------------------------------------------------------
> >
> > MP Config Table Header:
> >
> >   physical address:             0x0008f151
> >   signature:                    'PCMP'
> >   base table length:            332
> >   version:                      1.4
> >   checksum:                     0x05
> >   OEM ID:                       'Fujitsu '
> >   Product ID:                   'Pro Server '
> >   OEM table pointer:            0x00000000
> >   OEM table size:               0
> >   entry count:                  30
> >   local APIC address:           0xfee00000
> >   extended table length:        0
> >   extended table checksum:      0
> >
> > ----------------------------------------------------------------------------
> >
> > MP Config Base Table Entries:
> >
> > --
> > Processors:     APIC ID  Version  State        Family  Model  Step  Flags
> >                 3        0x11     BSP, usable  6       1      9     0xfbff
> >                 0        0x11     AP, usable   6       1      9     0xfbff
> >                 1        0x11     AP, usable   6       1      9     0xfbff
> >                 2        0x11     AP, usable   6       1      9     0xfbff
> > --
> > Bus:            Bus ID  Type
> >                 0       PCI
> >                 1       PCI
> >                 2       EISA
> > --
> > I/O APICs:      APIC ID  Version  State   Address
> >                 8        0x11     usable  0xfec00000
> >                 9        0x11     usable  0xfec0c000
> > --
> > I/O Ints:       Type    Polarity   Trigger   Bus ID  IRQ  APIC ID  PIN#
> >                 ExtINT  active-hi  edge      2       0    8        0
> >                 INT     conforms   conforms  2       1    8        1
> >                 INT     conforms   conforms  2       2    8        2
> >                 INT     conforms   conforms  2       3    8        3
> >                 INT     conforms   conforms  2       4    8        4
> >                 INT     conforms   conforms  2       5    8        5
> >                 INT     conforms   conforms  2       6    8        6
> >                 INT     conforms   conforms  2       7    8        7
> >                 INT     conforms   conforms  2       8    8        8
> >                 INT     conforms   conforms  2       9    8        9
> >                 INT     conforms   conforms  2       10   8        10
> >                 INT     conforms   conforms  2       11   8        11
> >                 INT     conforms   conforms  2       12   8        12
> >                 INT     conforms   conforms  2       13   8        13
> >                 INT     conforms   conforms  2       14   8        14
> >                 INT     conforms   conforms  2       15   8        15
> >                 INT     active-lo  level     0       1:A  9        11
> >                 INT     active-lo  level     1       1:A  9        12
> >                 INT     active-lo  level     1       2:A  9        12
> > --
> > Local Ints:     Type    Polarity   Trigger   Bus ID  IRQ  APIC ID  PIN#
> >                 ExtINT  active-hi  edge      0       0:A  255      0
> >                 NMI     active-hi  edge      0       0:A  255      1
> >
> > ----------------------------------------------------------------------------
> >
> > dmesg output:
> >
> > Copyright (c) 1992-2004 The FreeBSD Project.
> > Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
> >         The Regents of the University of California. All rights reserved.
> > FreeBSD 5.3-RC2 #0: Thu Nov 4 03:48:56 GMT 2004
> >     toor@xjamesfriis.<CENSORED>:/usr/src/sys/i386/compile/JAMESFRIIS
> > MPTable: <Fujitsu Pro Server >
> > Timecounter "i8254" frequency 1193182 Hz quality 0
> > CPU: Pentium Pro (199.84-MHz 686-class CPU)
> >   Origin = "GenuineIntel" Id = 0x619 Stepping = 9
> >   Features=0xfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV>
> > real memory = 2147483648 (2048 MB)
> > avail memory = 2095947776 (1998 MB)
> > FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
> >  cpu0 (BSP): APIC ID: 3
> >  cpu1 (AP): APIC ID: 0
> >  cpu2 (AP): APIC ID: 1
> >  cpu3 (AP): APIC ID: 2
> > ioapic0: Assuming intbase of 0
> > ioapic1: Assuming intbase of 16
> > ioapic0 <Version 1.1> irqs 0-15 on motherboard
> > ioapic1 <Version 1.1> irqs 16-31 on motherboard
> > npx0: [FAST]
> > npx0: <math processor> on motherboard
> > npx0: INT 16 interface
> > pcib0: <MPTable Host-PCI bridge> pcibus 0 on motherboard
> > pci0: <PCI bus> on pcib0
> > fxp0: <Intel 82557 Pro/100 Ethernet> port 0xfce0-0xfcff mem 0xfe900000-0xfe9fffff,0xfe8ff000-0xfe8fffff irq 27 at device 1.0 on pci0
> > miibus0: <MII bus> on fxp0
> > ukphy0: <Generic IEEE 802.3u media interface> on miibus0
> > ukphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
> > fxp0: Ethernet address: 00:10:a8:00:10:d6
> > pci0: <display, VGA> at device 2.0 (no driver attached)
> > eisab0: <PCI-EISA bridge> at device 3.0 on pci0
> > eisa0: <EISA bus> on eisab0
> > mainboard0: <FUJc081 (System Board)> on eisa0 slot 0
> > isa0: <ISA bus> on eisab0
> > pcib1: <MPTable Host-PCI bridge> pcibus 1 on motherboard
> > pci1: <PCI bus> on pcib1
> > ahc0: <Adaptec aic7880 Ultra SCSI adapter> port 0xf800-0xf8ff mem 0xfceef000-0xfceeffff irq 28 at device 1.0 on pci1
> > ahc0: [GIANT-LOCKED]
> > aic7880: Ultra Wide Channel A, SCSI Id=7, 16/253 SCBs
> > ahc1: <Adaptec aic7880 Ultra SCSI adapter> port 0xf400-0xf4ff mem 0xfceee000-0xfceeefff irq 28 at device 2.0 on pci1
> > ahc1: [GIANT-LOCKED]
> > aic7880: Ultra Wide Channel A, SCSI Id=7, 16/253 SCBs
> > pci1: <base peripheral> at device 3.0 (no driver attached)
> > cpu0 on motherboard
> > cpu1 on motherboard
> > cpu2 on motherboard
> > cpu3 on motherboard
> > orm0: <ISA Option ROM> at iomem 0xc0000-0xc7fff on isa0
> > pmtimer0 on isa0
> > ata0 at port 0x3f6,0x1f0-0x1f7 irq 14 on isa0
> > ata1 at port 0x376,0x170-0x177 irq 15 on isa0
> > atkbdc0: <Keyboard controller (i8042)> at port 0x64,0x60 on isa0
> > atkbd0: <AT Keyboard> irq 1 on atkbdc0
> > kbd0 at atkbd0
> > atkbd0: [GIANT-LOCKED]
> > psm0: <PS/2 Mouse> irq 12 on atkbdc0
> > psm0: [GIANT-LOCKED]
> > psm0: model MouseMan+, device ID 0
> > fdc0: <Enhanced floppy controller> at port 0x3f0-0x3f5 irq 6 drq 2 on isa0
> > fdc0: [FAST]
> > fd0: <1440-KB 3.5" drive> on fdc0 drive 0
> > ppc0: <Parallel port> at port 0x378-0x37f irq 7 on isa0
> > ppc0: Generic chipset (NIBBLE-only) in COMPATIBLE mode
> > ppbus0: <Parallel port bus> on ppc0
> > plip0: <PLIP network interface> on ppbus0
> > lpt0: <Printer> on ppbus0
> > lpt0: Interrupt-driven port
> > ppi0: <Parallel I/O> on ppbus0
> > sc0: <System console> at flags 0x100 on isa0
> > sc0: VGA <16 virtual consoles, flags=0x300>
> > sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
> > sio0: type 16550A
> > sio1 at port 0x2f8-0x2ff irq 3 on isa0
> > sio1: type 16550A
> > vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
> > unknown: <IBM Enhanced (101/102-key) KC> can't assign resources (port)
> > unknown: <PNP0501> can't assign resources (port)
> > unknown: <PNP0501> can't assign resources (port)
> > unknown: <PNP0401> can't assign resources (port)
> > unknown: <PNP0700> can't assign resources (port)
> > Timecounters tick every 10.000 msec
> > Waiting 15 seconds for SCSI devices to settle
> > (probe6:ahc0:0:6:0): AutoSense Failed
> > (probe5:ahc0:0:6:1): AutoSense Failed
> > (probe0:ahc0:0:6:2): AutoSense Failed
> > (probe5:ahc0:0:6:3): AutoSense Failed
> > (probe5:ahc0:0:6:4): AutoSense Failed
> > (probe0:ahc0:0:6:5): AutoSense Failed
> > (probe0:ahc0:0:6:6): AutoSense Failed
> > (probe0:ahc0:0:6:7): AutoSense Failed
> > (probe21:ahc1:0:6:0): AutoSense Failed
> > (probe1:ahc1:0:6:1): AutoSense Failed
> > (probe1:ahc1:0:6:2): AutoSense Failed
> > (probe1:ahc1:0:6:3): AutoSense Failed
> > (probe1:ahc1:0:6:4): AutoSense Failed
> > (probe1:ahc1:0:6:5): AutoSense Failed
> > (probe1:ahc1:0:6:6): AutoSense Failed
> > (probe1:ahc1:0:6:7): AutoSense Failed
> > sa0 at ahc0 bus 0 target 4 lun 0
> > sa0: <WangDAT Model 3400DX 04j0> Removable Sequential Access SCSI-2 device
> > sa0: 10.000MB/s transfers (10.000MHz, offset 15)
> > ses0 at ahc0 bus 0 target 6 lun 0
> > ses0: <FUJITSU SAF-TE PROCESSOR 1.00> Fixed Processor SCSI-2 device
> > ses0: 3.300MB/s transfers
> > ses0: SAF-TE Compliant Device
> > ses1 at ahc1 bus 0 target 6 lun 0
> > ses1: <FUJITSU SAF-TE PROCESSOR 1.00> Fixed Processor SCSI-2 device
> > ses1: 3.300MB/s transfers
> > ses1: SAF-TE Compliant Device
> > da0 at ahc0 bus 0 target 0 lun 0
> > da0: <FUJITSU M2954E-512 0162> Fixed Direct Access SCSI-2 device
> > da0: 40.000MB/s transfers (20.000MHz, offset 8, 16bit), Tagged Queueing Enabled
> > da0: 4149MB (8498506 512 byte sectors: 255H 63S/T 529C)
> > da1 at ahc0 bus 0 target 1 lun 0
> > da1: <FUJITSU M2954E-512 0162> Fixed Direct Access SCSI-2 device
> > da1: 40.000MB/s transfers (20.000MHz, offset 8, 16bit), Tagged Queueing Enabled
> > da1: 4149MB (8498506 512 byte sectors: 255H 63S/T 529C)
> > da2 at ahc0 bus 0 target 2 lun 0
> > da2: <FUJITSU M2954E-512 0162> Fixed Direct Access SCSI-2 device
> > da2: 40.000MB/s transfers (20.000MHz, offset 8, 16bit), Tagged Queueing Enabled
> > da2: 4149MB (8498506 512 byte sectors: 255H 63S/T 529C)
> > da3 at ahc1 bus 0 target 0 lun 0
> > da3: <FUJITSU M2954E-512 0162> Fixed Direct Access SCSI-2 device
> > da3: 40.000MB/s transfers (20.000MHz, offset 8, 16bit), Tagged Queueing Enabled
> > da3: 4149MB (8498506 512 byte sectors: 255H 63S/T 529C)
> > da4 at ahc1 bus 0 target 1 lun 0
> > da4: <FUJITSU M2954E-512 0162> Fixed Direct Access SCSI-2 device
> > da4: 40.000MB/s transfers (20.000MHz, offset 8, 16bit), Tagged Queueing Enabled
> > da4: 4149MB (8498506 512 byte sectors: 255H 63S/T 529C)
> > da5 at ahc1 bus 0 target 2 lun 0
> > da5: <SEAGATE ST39102LC 0006> Fixed Direct Access SCSI-2 device
> > da5: 40.000MB/s transfers (20.000MHz, offset 8, 16bit), Tagged Queueing Enabled
> > da5: 8683MB (17783240 512 byte sectors: 255H 63S/T 1106C)
> > cd0 at ahc0 bus 0 target 5 lun 0
> > cd0: <MATSHITA CD-ROM CR-508 XS03> Removable CD-ROM SCSI-2 device
> > cd0: 10.000MB/s transfers (10.000MHz, offset 15)
> > cd0: Attempt to query device size failed: NOT READY, Medium not present
> > GEOM_MIRROR: Device mirror0 created (id=138753045).
> > GEOM_MIRROR: Device mirror0: provider da0 detected.
> > GEOM_CONCAT: Device usr2 created (id=1051984440).
> > GEOM_CONCAT: Disk da1 attached to usr2.
> > GEOM_CONCAT: Disk da2 attached to usr2.
> > GEOM_MIRROR: Device mirror0: provider da3 detected.
> > GEOM_MIRROR: Device mirror0: provider da3 activated.
> > GEOM_MIRROR: Device mirror0: provider mirror/mirror0 launched.
> > GEOM_MIRROR: Device mirror0: rebuilding provider da0.
> > GEOM_CONCAT: Disk da4 attached to usr2.
> > GEOM_CONCAT: Disk da5 attached to usr2.
> > GEOM_CONCAT: Device usr2 activated.
> > SMP: AP CPU #3 Launched!
> > SMP: AP CPU #1 Launched!
> > SMP: AP CPU #2 Launched!
> > Mounting root from ufs:/dev/mirror/mirror0a
> > WARNING: / was not properly dismounted
> > WARNING: /var was not properly dismounted
> > WARNING: /usr was not properly dismounted
> > /usr: mount pending error: blocks 4 files 2
> > WARNING: /usr2 was not properly dismounted
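
[Editorial aside, not part of the original exchange.] The "me too" sysctl listing quoted above can be checked mechanically rather than by eye. What follows is only a minimal sketch: the helper names `parse_sysctl` and `ipi_wakeup_summary` are hypothetical, the sample values are copied from the quoted diagnostics, and a gap between the requested and delivered counters is not claimed to be a diagnosis of the panic.

```python
def parse_sysctl(text):
    """Parse 'name: value' lines, as printed by sysctl, into a dict."""
    values = {}
    for line in text.splitlines():
        # Drop leading e-mail quote markers such as "> > ".
        line = line.lstrip("> ").strip()
        if ": " not in line:
            continue
        name, value = line.split(": ", 1)
        values[name] = value
    return values


def ipi_wakeup_summary(values):
    """Summarise the kern.sched.ipiwakeup counters (hypothetical helper)."""
    requested = int(values.get("kern.sched.ipiwakeup.requested", 0))
    delivered = int(values.get("kern.sched.ipiwakeup.delivered", 0))
    return {
        "enabled": values.get("kern.sched.ipiwakeup.enabled") == "1",
        "requested": requested,
        "delivered": delivered,
        "undelivered": requested - delivered,
    }


# Sample values copied from the quoted diagnostics.
SAMPLE = """\
> > kern.sched.name: 4BSD
> > kern.sched.ipiwakeup.enabled: 1
> > kern.sched.ipiwakeup.requested: 858129
> > kern.sched.ipiwakeup.delivered: 858129
"""

summary = ipi_wakeup_summary(parse_sysctl(SAMPLE))
print(summary)
```

On a live system the same parser could be fed the output of `sysctl kern.sched` instead of the pasted sample.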