Date: Fri, 4 Aug 2000 16:39:32 -0600
From: Charles Randall <crandall@matchlogic.com>
To: freebsd-smp@freebsd.org
Subject: 4.0-R panic on Dell PowerEdge 2450
Message-ID: <5FE9B713CCCDD311A03400508B8B301301C78A17@bdr-xcln.is.matchlogic.com>
I've run into the following panic under heavy I/O on a Dell PowerEdge 2450
(2x866 MHz P-III, 1 GB RAM, etc) running 4.0-R. There's a lot of information
here...

I can reproduce this in a few hours or less by running multiple concurrent
"sort" processes in a "while /usr/bin/true" loop on a very large file (the
disk I/O is for the temporary files in the directory pointed to by -T).

The disk controller is,

ahc0: <Adaptec aic7899 Ultra160 SCSI adapter> port 0xdc00-0xdcff mem 0xf8fff000-0xf8ffffff irq 5 at device 4.0 on pci2

and the disks are,

da0: <SEAGATE ST318404LC 0005> Fixed Direct Access SCSI-3 device
da0: 80.000MB/s transfers (40.000MHz, offset 63, 16bit), Tagged Queueing Enabled
da0: 17366MB (35566478 512 byte sectors: 255H 63S/T 2213C)
da1 at ahc0 bus 0 target 1 lun 0
da1: <SEAGATE ST318404LC 0005> Fixed Direct Access SCSI-3 device
da1: 80.000MB/s transfers (40.000MHz, offset 63, 16bit), Tagged Queueing Enabled
da1: 17366MB (35566478 512 byte sectors: 255H 63S/T 2213C)

Here's the panic info (I had to copy this from the console so there may be a
mistake but I did triple-check it),

--- snip ---
mp_lock = 01000001; cpuid = 1 lapic.id = 00000000
instruction pointer     = 0x8:0xc02bf983
stack pointer           = 0x10:0xff80dffc
frame pointer           = 0x10:0x0
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, IOPL = 0
current process         = Idle
interrupt mask          = none <- SMP: XXX
trap number             = 29
panic: unknown/reserved trap
mp_lock = 01000001; cpuid = 1 lapic.id = 00000000
boot() called on cpu#1

syncing disks... Timedout SCB handled by another timeout
--- snip ---

The machine never rebooted, it just locked up.

I was running a debug SMP kernel configured with "config -g" at the time.
I've been unable to reproduce this with the GENERIC kernel (I ran the sort
test above for more than 24 hours without problem -- that doesn't prove that
it doesn't happen, just that it doesn't happen as often).
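
For anyone who wants to try the same kind of load, the sort test described above can be sketched roughly as below. This is only an illustration of the approach, not the exact script used for this report: the function name sort_stress, its arguments, and the optional pass limit are all hypothetical, and the input file and -T directory have to be supplied by whoever runs it.

```shell
#!/bin/sh
# Hypothetical sketch of the sort-based I/O load described in the report.
# Nothing here is taken from the original test script.

# sort_stress FILE TMPDIR JOBS [PASSES]
#   FILE    large input file to sort (assumed to exist already)
#   TMPDIR  directory passed to sort -T for its temporary files
#   JOBS    number of concurrent sort loops
#   PASSES  sorts per loop; 0 or omitted means loop until killed,
#           like a "while /usr/bin/true" loop
sort_stress() {
    file=$1 tmpdir=$2 jobs=$3 passes=${4:-0}
    j=0
    while [ "$j" -lt "$jobs" ]; do
        (
            n=0
            while [ "$passes" -eq 0 ] || [ "$n" -lt "$passes" ]; do
                # The sorted output is discarded; only the temp-file
                # traffic under TMPDIR matters for this test.
                sort -T "$tmpdir" "$file" > /dev/null || exit 1
                n=$((n + 1))
            done
        ) &
        j=$((j + 1))
    done
    # Block until every background loop has finished.
    wait
}
```

Something like "sort_stress /path/to/bigfile /path/to/tmpdir 4" (both paths are placeholders) would then run four loops until interrupted. The pass limit exists only so the sketch can be tried briefly on a small file first.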

Here's a diff between GENERIC and my CUSTOM kernel,

--- snip ---
--- GENERIC     Thu Mar  9 16:32:55 2000
+++ CUSTOM      Fri Aug  4 13:36:02 2000
@@ -22,8 +22,8 @@
 cpu            I486_CPU
 cpu            I586_CPU
 cpu            I686_CPU
-ident          GENERIC
-maxusers       32
+ident          CUSTOM
+maxusers       128
 
 #makeoptions   DEBUG=-g                #Build kernel with gdb(1) debug symbols
 
@@ -54,12 +54,12 @@
 options        ICMP_BANDLIM            #Rate limit bad replies
 
 # To make an SMP kernel, the next two are needed
-#options       SMP                     # Symmetric MultiProcessor Kernel
-#options       APIC_IO                 # Symmetric (APIC) I/O
+options        SMP                     # Symmetric MultiProcessor Kernel
+options        APIC_IO                 # Symmetric (APIC) I/O
 # Optionally these may need tweaked, (defaults shown):
 #options       NCPU=2                  # number of CPUs
 #options       NBUS=4                  # number of busses
-#options       NAPIC=1                 # number of IO APICs
+options        NAPIC=2                 # number of IO APICs
 #options       NINTR=24                # number of INTs
 
 device         isa
--- snip ---

Mptable returns the following on this system,

--- snip ---
===============================================================================

MPTable, version 2.0.15

-------------------------------------------------------------------------------

MP Floating Pointer Structure:

  location:                     BIOS
  physical address:             0x000fe710
  signature:                    '_MP_'
  length:                       16 bytes
  version:                      1.4
  checksum:                     0x91
  mode:                         Virtual Wire

-------------------------------------------------------------------------------

MP Config Table Header:

  physical address:             0x000f0000
  signature:                    'PCMP'
  base table length:            372
  version:                      1.4
  checksum:                     0xd6
  OEM ID:                       'DELL '
  Product ID:                   'POWEREDGE A6'
  OEM table pointer:            0x00000000
  OEM table size:               0
  entry count:                  38
  local APIC address:           0xfee00000
  extended table length:        128
  extended table checksum:      0

-------------------------------------------------------------------------------

MP Config Base Table Entries:

--
Processors:     APIC ID Version State           Family  Model   Step    Flags
                 1       0x11    BSP, usable     6       8       3       0x383fbff
                 0       0x11    AP, usable      6       8       3       0x383fbff
--
Bus:            Bus ID  Type
                 0       PCI
                 1       PCI
                 2       PCI
                 3       ISA
--
I/O APICs:      APIC ID Version State           Address
                 2       0x11    usable          0xfec00000
                 3       0x11    usable          0xfec01000
--
I/O Ints:       Type    Polarity    Trigger     Bus ID   IRQ    APIC ID PIN#
                ExtINT  active-hi   edge             3     0          2    0
                INT     conforms    conforms         3     1          2    1
                INT     conforms    conforms         3     3          2    3
                INT     conforms    conforms         3     4          2    4
                INT     conforms    conforms         3     6          2    6
                INT     conforms    conforms         3     7          2    7
                INT     conforms    conforms         3     8          2    8
                INT     conforms    conforms         3     9          2    9
                INT     conforms    conforms         3    12          2   12
                INT     conforms    conforms         3    14          2   14
                INT     conforms    conforms         3    15          2   15
                INT     conforms    conforms         1   8:A          3    0
                INT     conforms    conforms         2   4:A          3   15
                INT     conforms    conforms         2   4:B          3   14
                INT     conforms    conforms         0   4:A          3    1
                INT     conforms    conforms         0   4:C          3    1
                INT     conforms    conforms         0   4:B          3    2
                INT     conforms    conforms         0   4:D          3    2
                INT     conforms    conforms         0   2:A          3    4
                INT     conforms    conforms         0   2:C          3    4
                INT     conforms    conforms         0   2:B          3    5
                INT     conforms    conforms         0   2:D          3    5
                INT     conforms    conforms         0   8:A          3    6
                INT     conforms    conforms         0   8:C          3    6
                INT     conforms    conforms         0   8:B          3    7
                INT     conforms    conforms         0   8:D          3    7
                INT     conforms    conforms         1   2:B          3   14
                INT     conforms    conforms         1   2:A          3   15
--
Local Ints:     Type    Polarity    Trigger     Bus ID   IRQ    APIC ID PIN#
                ExtINT  active-hi   edge             3     0        255    0
                NMI     active-hi   edge             3     0        255    1

-------------------------------------------------------------------------------

MP Config Extended Table Entries:

--
  bus ID: 0 address type: I/O address
  address base: 0xe000
  address range: 0x1000
--
  bus ID: 0 address type: memory address
  address base: 0xa0000
  address range: 0x20000
--
  bus ID: 0 address type: I/O address
  address base: 0x0
  address range: 0x1000
--
  bus ID: 0 address type: memory address
  address base: 0xfb000000
  address range: 0x3010000
--
  bus ID: 1 address type: I/O address
  address base: 0xc000
  address range: 0x2000
--
  bus ID: 1 address type: memory address
  address base: 0xf4000000
  address range: 0x6110000
--
  bus ID: 3 bus info: 0x01 parent bus ID: 0

-------------------------------------------------------------------------------

# SMP kernel config file options:

# Required:
options         SMP                     # Symmetric MultiProcessor Kernel
options         APIC_IO                 # Symmetric (APIC) I/O

# Optional (built-in defaults will work in most cases):
#options        NCPU=2                  # number of CPUs
#options        NBUS=4                  # number of busses
#options        NAPIC=2                 # number of IO APICs
#options        NINTR=28                # number of INTs

===============================================================================
--- snip ---

Note that NAPIC isn't specified. However, the kernel won't boot without
NAPIC=2 as I've specified.

Finally, I'm running Luoqi's patch for multiple APIC support based on
"diff -p -u -r1.250.2.2 -r1.250.2.3". I've confirmed with him that this patch
is correct,

--- snip ---
--- ./backup/pmap.c     Tue Jul 25 18:32:03 2000
+++ pmap.c      Tue Jul 25 18:33:06 2000
@@ -426,9 +426,10 @@
                for (j = 0; j < mp_napics; j++) {
                        /* same page frame as a previous IO apic? */
                        if (((vm_offset_t)SMPpt[NPTEPG-2-j] & PG_FRAME) ==
-                           (io_apic_address[0] & PG_FRAME)) {
+                           (io_apic_address[i] & PG_FRAME)) {
                                ioapic[i] = (ioapic_t *)((u_int)SMP_prvspace
-                                       + (NPTEPG-2-j)*PAGE_SIZE);
+                                       + (NPTEPG-2-j) * PAGE_SIZE
+                                       + (io_apic_address[i] & PAGE_MASK));
                                break;
                        }
                        /* use this slot if available */
@@ -436,7 +437,8 @@
                                SMPpt[NPTEPG-2-j] = (pt_entry_t)(PG_V | PG_RW |
                                    pgeflag | (io_apic_address[i] & PG_FRAME));
                                ioapic[i] = (ioapic_t *)((u_int)SMP_prvspace
-                                       + (NPTEPG-2-j)*PAGE_SIZE);
+                                       + (NPTEPG-2-j) * PAGE_SIZE
+                                       + (io_apic_address[i] & PAGE_MASK));
                                break;
                        }
                }
--- snip ---

There are no clues in the system log.

Have any other 2450 users seen this?

I'm going to try 4.1-R now.

Thanks,
Charles

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-smp" in the body of the message