Date: Wed, 31 May 2000 11:38:15 -0700 (PDT)
From: dh@digitalbrain.com
To: freebsd-gnats-submit@FreeBSD.org
Subject: kern/18919: Dell PowerEdge 2450/733 SMP panics under heavy disk IO load
Message-ID: <20000531183815.D80FB37B8DF@hub.freebsd.org>
>Number: 18919
>Category: kern
>Synopsis: Dell PowerEdge 2450/733 SMP panics under heavy disk IO load
>Confidential: no
>Severity: critical
>Priority: high
>Responsible: freebsd-bugs
>State: open
>Quarter:
>Keywords:
>Date-Required:
>Class: sw-bug
>Submitter-Id: current-users
>Arrival-Date: Wed May 31 11:40:01 PDT 2000
>Closed-Date:
>Last-Modified:
>Originator: David Hanney
>Release: 4.0-STABLE (cvsupped yesterday)
>Organization:
digitalbrain.com ltd
>Environment:
dual CPU PowerEdge 2450/733
5 SCSI disks (3 on channel A, 2 on channel B)
-------------------------------------------------------------
Copyright (c) 1992-2000 The FreeBSD Project.
Copyright (c) 1982, 1986, 1989, 1991, 1993
The Regents of the University of California. All rights reserved.
FreeBSD 4.0-STABLE #0: Wed May 31 19:00:15 BST 2000
dave@fast.workgroup:/usr/src/sys/compile/TEST
Timecounter "i8254" frequency 1193182 Hz
CPU: Pentium III/Pentium III Xeon (728.44-MHz 686-class CPU)
Origin = "GenuineIntel" Id = 0x683 Stepping = 3
Features=0x383fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,XMM>
real memory = 1073733632 (1048568K bytes)
avail memory = 1040756736 (1016364K bytes)
Programming 16 pins in IOAPIC #0
Programming 16 pins in IOAPIC #1
IOAPIC #1 intpin 0 -> irq 2
IOAPIC #1 intpin 1 -> irq 11
IOAPIC #1 intpin 2 -> irq 13
IOAPIC #1 intpin 4 -> irq 16
IOAPIC #1 intpin 5 -> irq 17
IOAPIC #1 intpin 6 -> irq 18
IOAPIC #1 intpin 7 -> irq 19
IOAPIC #1 intpin 14 -> irq 10
IOAPIC #1 intpin 15 -> irq 5
FreeBSD/SMP: Multiprocessor motherboard
cpu0 (BSP): apic id: 1, version: 0x00040011, at 0xfee00000
cpu1 (AP): apic id: 0, version: 0x00040011, at 0xfee00000
io0 (APIC): apic id: 2, version: 0x000f0011, at 0xfec00000
io1 (APIC): apic id: 3, version: 0x000f0011, at 0xfec01000
Preloaded elf kernel "kernel" at 0xc03ee000.
Pentium Pro MTRR support enabled
md0: Malloc disk
npx0: <math processor> on motherboard
npx0: INT 16 interface
pcib0: <RCC LE host to PCI bridge> on motherboard
pci0: <PCI bus> on pcib0
pci0: <ATI model 4759 graphics accelerator> at 14.0
isab0: <PCI to ISA bridge (vendor=1166 device=0200)> at device 15.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <Unknown PCI ATA controller (generic mode)> port 0x8b0-0x8bf at device 15.1 on pci0
ata0: at 0x1f0 irq 14 on atapci0
pci0: <OHCI USB controller> at 15.2 irq 11
pcib1: <RCC LE host to PCI bridge> on motherboard
pci1: <PCI bus> on pcib1
pcib2: <PCI to PCI bridge (vendor=8086 device=0962)> at device 2.0 on pci1
pci2: <PCI bus> on pcib2
ahc0: <Adaptec aic7899 Ultra160 SCSI adapter> port 0xdc00-0xdcff mem 0xf8fff000-0xf8ffffff irq 5 at device 4.0 on pci2
ahc0: aic7899 Wide Channel A, SCSI Id=7, 16/255 SCBs
ahc1: <Adaptec aic7899 Ultra160 SCSI adapter> port 0xd800-0xd8ff mem 0xf8ffe000-0xf8ffefff irq 10 at device 4.1 on pci2
ahc1: aic7899 Wide Channel B, SCSI Id=7, 16/255 SCBs
fxp0: <Intel EtherExpress Pro 10/100B Ethernet> port 0xccc0-0xccff mem 0xfa000000-0xfa0fffff,0xfa100000-0xfa100fff irq 2 at device 8.0 on pci1
fxp0: Ethernet address 00:b0:d0:20:ee:74
fxp0: supplying EUI64: 00:b0:d0:ff:fe:20:ee:74
fdc0: <NEC 72065B or clone> at port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on isa0
fdc0: FIFO enabled, 8 bytes threshold
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
atkbdc0: <Keyboard controller (i8042)> at port 0x60,0x64 on isa0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
sc0: <System console> on isa0
sc0: VGA <16 virtual consoles, flags=0x200>
sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
sio0: type 16550A
sio1 at port 0x2f8-0x2ff irq 3 on isa0
sio1: type 16550A
ppc0: <Parallel port> at port 0x378-0x37f irq 7 on isa0
ppc0: Generic chipset (ECP/PS2/NIBBLE) in COMPATIBLE mode
ppc0: FIFO with 16/16/8 bytes threshold
ppi0: <Parallel I/O> on ppbus0
lpt0: <Printer> on ppbus0
lpt0: Interrupt-driven port
plip0: <PLIP network interface> on ppbus0
Vendor Specific Word = ffff
APIC_IO: routing 8254 via 8259 and IOAPIC #0 intpin 0
SMP: AP CPU #1 Launched!
acd0: CDROM <SAMSUNG CD-ROM SN-124> at ata0-master using PIO4
Waiting 15 seconds for SCSI devices to settle
pass3 at ahc0 bus 0 target 6 lun 0
pass3: <DELL 2x3 U2W SCSI BP 1.15> Fixed Processor SCSI-2 device
pass3: 3.300MB/s transfers
pass6 at ahc1 bus 0 target 6 lun 0
pass6: <DELL 2x2 U2W SCSI BP 1.15> Fixed Processor SCSI-2 device
pass6: 3.300MB/s transfers
da0 at ahc0 bus 0 target 2 lun 0
da0: <QUANTUM ATLAS 10K 18SCA UCIE> Fixed Direct Access SCSI-3 device
da0: 80.000MB/s transfers (40.000MHz, offset 31, 16bit), Tagged Queueing Enabled
da0: 17366MB (35566499 512 byte sectors: 255H 63S/T 2213C)
da3 at ahc1 bus 0 target 0 lun 0
da3: <QUANTUM ATLAS 10K 18SCA UCIE> Fixed Direct Access SCSI-3 device
da3: 80.000MB/s transfers (40.000MHz, offset 31, 16bit), Tagged Queueing Enabled
da3: 17366MB (35566499 512 byte sectors: 255H 63S/T 2213C)
da1 at ahc0 bus 0 target 3 lun 0
da1: <QUANTUM ATLAS 10K 18SCA UCIE> Fixed Direct Access SCSI-3 device
da1: 80.000MB/s transfers (40.000MHz, offset 31, 16bit), Tagged Queueing Enabled
da1: 17366MB (35566499 512 byte sectors: 255H 63S/T 2213C)
da4 at ahc1 bus 0 target 1 lun 0
da4: <QUANTUM ATLAS 10K 18SCA UCIE> Fixed Direct Access SCSI-3 device
da4: 80.000MB/s transfers (40.000MHz, offset 31, 16bit), Tagged Queueing Enabled
da4: 17366MB (35566499 512 byte sectors: 255H 63S/T 2213C)
da2 at ahc0 bus 0 target 4 lun 0
da2: <QUANTUM ATLAS 10K 18SCA UCIE> Fixed Direct Access SCSI-3 device
da2: 80.000MB/s transfers (40.000MHz, offset 31, 16bit), Tagged Queueing Enabled
da2: 17366MB (35566499 512 byte sectors: 255H 63S/T 2213C)
Mounting root from ufs:/dev/da0s1a
vinum: loaded
vinum: reading configuration from /dev/da4e
vinum: updating configuration from /dev/da3e
vinum: updating configuration from /dev/da2e
vinum: updating configuration from /dev/da1e
fxp0: starting DAD for fe80:0001::02b0:d0ff:fe20:ee74
fxp0: DAD complete for fe80:0001::02b0:d0ff:fe20:ee74 - no duplicates found
-------------------------------------------------------------
===============================================================================
MPTable, version 2.0.15
-------------------------------------------------------------------------------
MP Floating Pointer Structure:
location: BIOS
physical address: 0x000fe710
signature: '_MP_'
length: 16 bytes
version: 1.4
checksum: 0x91
mode: Virtual Wire
-------------------------------------------------------------------------------
MP Config Table Header:
physical address: 0x000f0000
signature: 'PCMP'
base table length: 372
version: 1.4
checksum: 0xd6
OEM ID: 'DELL '
Product ID: 'POWEREDGE A6'
OEM table pointer: 0x00000000
OEM table size: 0
entry count: 38
local APIC address: 0xfee00000
extended table length: 128
extended table checksum: 0
-------------------------------------------------------------------------------
MP Config Base Table Entries:
--
Processors: APIC ID Version State Family Model Step Flags
1 0x11 BSP, usable 6 8 3 0x383fbff
0 0x11 AP, usable 6 8 3 0x383fbff
--
Bus: Bus ID Type
0 PCI
1 PCI
2 PCI
3 ISA
--
I/O APICs: APIC ID Version State Address
2 0x11 usable 0xfec00000
3 0x11 usable 0xfec01000
--
I/O Ints: Type Polarity Trigger Bus ID IRQ APIC ID PIN#
ExtINT active-hi edge 3 0 2 0
INT conforms conforms 3 1 2 1
INT conforms conforms 3 3 2 3
INT conforms conforms 3 4 2 4
INT conforms conforms 3 6 2 6
INT conforms conforms 3 7 2 7
INT conforms conforms 3 8 2 8
INT conforms conforms 3 9 2 9
INT conforms conforms 3 12 2 12
INT conforms conforms 3 14 2 14
INT conforms conforms 3 15 2 15
INT conforms conforms 1 8:A 3 0
INT conforms conforms 2 4:A 3 15
INT conforms conforms 2 4:B 3 14
INT conforms conforms 0 4:A 3 1
INT conforms conforms 0 4:C 3 1
INT conforms conforms 0 4:B 3 2
INT conforms conforms 0 4:D 3 2
INT conforms conforms 0 2:A 3 4
INT conforms conforms 0 2:C 3 4
INT conforms conforms 0 2:B 3 5
INT conforms conforms 0 2:D 3 5
INT conforms conforms 0 8:A 3 6
INT conforms conforms 0 8:C 3 6
INT conforms conforms 0 8:B 3 7
INT conforms conforms 0 8:D 3 7
INT conforms conforms 1 2:B 3 14
INT conforms conforms 1 2:A 3 15
--
Local Ints: Type Polarity Trigger Bus ID IRQ APIC ID PIN#
ExtINT active-hi edge 3 0 255 0
NMI active-hi edge 3 0 255 1
-------------------------------------------------------------------------------
MP Config Extended Table Entries:
--
bus ID: 0 address type: I/O address
address base: 0xe000
address range: 0x1000
--
bus ID: 0 address type: memory address
address base: 0xa0000
address range: 0x20000
--
bus ID: 0 address type: I/O address
address base: 0x0
address range: 0x1000
--
bus ID: 0 address type: memory address
address base: 0xfb000000
address range: 0x3010000
--
bus ID: 1 address type: I/O address
address base: 0xc000
address range: 0x2000
--
bus ID: 1 address type: memory address
address base: 0xf4000000
address range: 0x6110000
--
bus ID: 3 bus info: 0x01 parent bus ID: 0
-------------------------------------------------------------------------------
# SMP kernel config file options:
# Required:
options SMP # Symmetric MultiProcessor Kernel
options APIC_IO # Symmetric (APIC) I/O
# Optional (built-in defaults will work in most cases):
#options NCPU=2 # number of CPUs
#options NBUS=4 # number of busses
#options NAPIC=2 # number of IO APICs
#options NINTR=28 # number of INTs
===============================================================================
kernel differs from GENERIC like this:
< cpu I386_CPU
< cpu I486_CPU
> options DDB
> options DIAGNOSTIC
> options INVARIANTS
> options INVARIANT_SUPPORT
> options NAPIC=2 # number of IO APICs
> options NBUS=4 # number of busses
> options NCPU=2 # number of CPUs
> options NINTR=28 # number of INTs
> options APIC_IO # Symmetric (APIC) I/O
> options SMP # Symmetric MultiProcessor Kernel
>Description:
Heavy disk IO causes this panic:
mp_lock 01000001 cpuid 1 lapic.id = 0
ip = 0x8:0xc02ca46f
eflags = interrupt enabled, iopl=0
current process = idle
int mask = none <- SMP: XXX
kernel: type 29 trap, code=0
I looked that ip up in the kernel symbol table:
it falls inside 'idle_loop'.
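The symbol lookup described above can be sketched as follows. This is a minimal illustration, not the tool the submitter used: it assumes the symbol table is available as `nm -n`-style text ("address type name" per line), and the addresses in SAMPLE are hypothetical, chosen only so that the reported ip lands inside idle_loop. The helper names parse_nm and enclosing_symbol are made up for this example.

```python
import bisect


def parse_nm(text):
    """Parse `nm -n`-style lines ("address type name") into a sorted list."""
    symbols = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank lines and undefined symbols (no address)
        addr, _kind, name = parts
        try:
            symbols.append((int(addr, 16), name))
        except ValueError:
            continue  # first field was not a hex address
    symbols.sort()
    return symbols


def enclosing_symbol(symbols, ip):
    """Return (name, offset) for the last symbol at or below ip, else None."""
    addrs = [addr for addr, _name in symbols]
    i = bisect.bisect_right(addrs, ip) - 1
    if i < 0:
        return None  # ip is below every known symbol
    addr, name = symbols[i]
    return name, ip - addr


# Hypothetical addresses: NOT taken from the reporter's kernel.
SAMPLE = """\
c02ca3f0 T hypothetical_prev
c02ca450 T idle_loop
c02ca4a0 T hypothetical_next
"""

if __name__ == "__main__":
    table = parse_nm(SAMPLE)
    # ip value from the panic message in this report
    print(enclosing_symbol(table, 0xC02CA46F))
```

On a real system the input would come from running nm against the kernel image that actually panicked; a kernel built from different sources or options will not have matching addresses.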
>How-To-Repeat:
Run sustained heavy disk IO on the SMP kernel.
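The report does not say which workload was used, so the following is only a sketch of one way to generate concurrent disk IO. It assumes that several threads writing, syncing, and re-reading files counts as "heavy" load; whether it triggers the panic on this hardware is unknown. The function names, file names, and default sizes are all hypothetical.

```python
import os
import tempfile
import threading


def churn(directory, worker_id, rounds, block_size, blocks, totals):
    """Write, fsync, and read back one file per round; record bytes moved."""
    path = os.path.join(directory, f"load-{worker_id}.dat")
    block = os.urandom(block_size)
    moved = 0
    for _ in range(rounds):
        with open(path, "wb") as f:
            for _ in range(blocks):
                f.write(block)
            f.flush()
            os.fsync(f.fileno())  # push the data out to the disk
        moved += block_size * blocks
        with open(path, "rb") as f:
            while True:
                chunk = f.read(block_size)
                if not chunk:
                    break
                moved += len(chunk)
    os.remove(path)
    totals[worker_id] = moved


def run_load(directory, workers=4, rounds=2, block_size=65536, blocks=16):
    """Run `workers` churn threads in `directory`; return total bytes moved."""
    totals = {}
    threads = [
        threading.Thread(
            target=churn,
            args=(directory, i, rounds, block_size, blocks, totals),
        )
        for i in range(workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(totals.values())


if __name__ == "__main__":
    # A temporary directory keeps the example harmless; point it at a
    # filesystem on the disks under test to exercise them instead.
    with tempfile.TemporaryDirectory() as d:
        print(run_load(d))
```

The defaults are deliberately tiny. To load the five SCSI disks described above, the directory would need to be on the vinum volume and the sizes raised well past the 1 GB of RAM so that reads are not served from the buffer cache.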
>Fix:
Workaround only: run a uniprocessor (UP) kernel :(
>Release-Note:
>Audit-Trail:
>Unformatted:
