Date:      Sat, 04 Jun 2005 03:48:04 +0200
From:      Palle Girgensohn <girgen@pingpong.net>
To:        freebsd-stable@freebsd.org
Cc:        Brendan White <bmwt@caida.org>
Subject:   Repeatable crash with 5.4-p1-RELEASE and SMP
Message-ID:  <2032FF2A928A89651F1C7843@rambutan.pingpong.net>

Hi!

This is very similar to Brendan White's problem, just reported here. My guess
is that it is the very same problem. I've reported the same problem on a few
occasions before (although I use amd64, so my postings are to
amd64@freebsd.org).

My system is also a Dell 2850 with dual CPUs and 3GB RAM, running amd64 FreeBSD
5.4-p1. It is quite stable (but slow) when running without SMP. When SMP is
on, it crashes within a few hours. The load is high, around 4. See my postings on
amd64@ for many more details.

Anyway, I have managed to get an automatic reboot and a core dump. Giant
leap for mankind :-). It looks partly overwritten, though.
According to the Developer's Handbook, the core should be saved *before*
the swap partition is added to the system. I can easily verify that this
is not the case: the swap is "mounted" first. I once again raise the
question of whether PR conf/73834 shouldn't be addressed. Or perhaps my core dump is
quite normal? It doesn't look like it. In rc.conf, I have:

# kernel crash dumps
dumpdev="/dev/amrd0s2b"
dumpdir="/misc/crash"


Here's the dump. If there is anything else I should extract, please just ask.

# kgdb kernel.debug /misc/crash/vmcore.11
[GDB will not be able to debug user-mode threads: /usr/lib/libthread_db.so: Undefined symbol "ps_pglobal_lookup"]
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd".
#0  doadump () at pcpu.h:167
167             __asm __volatile("movq %%gs:0,%0" : "=r" (td));
(kgdb) backtrace
#0  doadump () at pcpu.h:167
#1  0x0000000000000000 in ?? ()
#2  0xffffffff80341267 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:410
#3  0xffffffff80341ac6 in panic (fmt=0xffffff007b76d000 "\240\253x{") at /usr/src/sys/kern/kern_shutdown.c:566
#4  0xffffffff804f0f52 in trap_fatal (frame=0xc, eva=18446742976269307904)
    at /usr/src/sys/amd64/amd64/trap.c:639
#5  0xffffffff804f11ef in trap_pfault (frame=0xffffffffb1d229b0, usermode=0)
    at /usr/src/sys/amd64/amd64/trap.c:562
#6  0xffffffff804f1457 in trap (frame=
      {tf_rdi = -1097427517200, tf_rsi = -1097440243712, tf_rdx = 1056,
       tf_rcx = 0, tf_r8 = 0, tf_r9 = 0, tf_rax = 1056, tf_rbx = 0,
       tf_rbp = -1098069766144, tf_r10 = 4503599627366400, tf_r11 = 3392,
       tf_r12 = 4, tf_r13 = 4, tf_r14 = -1099313881192,
       tf_r15 = -1097364452848, tf_trapno = 12, tf_addr = 136,
       tf_flags = -1099313881192, tf_err = 0, tf_rip = -2144020582,
       tf_cs = 8, tf_rflags = 66050, tf_rsp = -1311626640, tf_ss = 0})
    at /usr/src/sys/amd64/amd64/trap.c:341
#7  0xffffffff804deb0b in calltrap () at /usr/src/sys/amd64/amd64/exception.S:171
#8  0xffffff007c3900f0 in ?? ()
#9  0xffffff007b76d000 in ?? ()
#10 0x0000000000000420 in ?? ()
#11 0x0000000000000000 in ?? ()
#12 0x0000000000000000 in ?? ()
#13 0x0000000000000000 in ?? ()
#14 0x0000000000000420 in ?? ()
#15 0x0000000000000000 in ?? ()
#16 0xffffff0055f11000 in ?? ()
#17 0x000ffffffffff000 in ?? ()
#18 0x0000000000000d40 in ?? ()
#19 0x0000000000000004 in ?? ()
#20 0x0000000000000004 in ?? ()
#21 0xffffff000bc95f98 in ?? ()
#22 0xffffff007ffb4a10 in ?? ()
#23 0x000000000000000c in ?? ()
#24 0x0000000000000088 in ?? ()
#25 0xffffff000bc95f98 in ?? ()
#26 0x0000000000000000 in ?? ()
#27 0xffffffff8034d79a in thread_fini (mem=0x0, size=0) at /usr/src/sys/kern/kern_thread.c:271
#28 0x0000000000000000 in ?? ()
#29 0x0000000000000001 in ?? ()
#30 0xffffff007ffb4a00 in ?? ()
#31 0xffffff0055f11f98 in ?? ()
#32 0xffffffff804d46ff in zone_drain (zone=0x8) at /usr/src/sys/vm/uma_core.c:749
#33 0xffffffff804d22b6 in zone_foreach (zfunc=0xffffffff804d4530 <zone_drain>)
    at /usr/src/sys/vm/uma_core.c:1494
#34 0xffffffff804d5ec9 in uma_reclaim () at /usr/src/sys/vm/uma_core.c:2623
#35 0xffffffff804cfcac in vm_pageout () at /usr/src/sys/vm/vm_pageout.c:674
#36 0xffffffff8032805c in fork_exit (callout=0xffffffff804cf6b0 <vm_pageout>, arg=0x0,
    frame=0xffffffffb1d22c50) at /usr/src/sys/kern/kern_fork.c:791
#37 0xffffffff804ded0e in fork_trampoline () at /usr/src/sys/amd64/amd64/exception.S:296
#38 0x0000000000000000 in ?? ()
#39 0x0000000000000000 in ?? ()
#40 0x0000000000000001 in ?? ()
#41 0x0000000000000000 in ?? ()
#42 0x0000000000000000 in ?? ()
#43 0x0000000000000000 in ?? ()
#44 0x0000000000000000 in ?? ()
#45 0x0000000000000000 in ?? ()
#46 0x0000000000000000 in ?? ()
#47 0x0000000000000000 in ?? ()
#48 0x0000000000000000 in ?? ()
---Type <return> to continue, or q <return> to quit---
#49 0x0000000000000000 in ?? ()
#50 0x0000000000000000 in ?? ()
#51 0x0000000000000000 in ?? ()
#52 0x0000000000000000 in ?? ()
#53 0x0000000000000000 in ?? ()
#54 0x0000000000000000 in ?? ()
#55 0x0000000000000000 in ?? ()
#56 0x0000000000000000 in ?? ()
#57 0x0000000000000000 in ?? ()
#58 0x0000000000000000 in ?? ()
#59 0x0000000000000000 in ?? ()
#60 0x0000000000000000 in ?? ()
#61 0x0000000000000000 in ?? ()
#62 0x0000000000000000 in ?? ()
#63 0x0000000000000000 in ?? ()
#64 0x0000000000000000 in ?? ()
#65 0x0000000000000000 in ?? ()
#66 0x0000000000000000 in ?? ()
#67 0x0000000000000000 in ?? ()
#68 0x0000000000000000 in ?? ()
#69 0x0000000000000000 in ?? ()
#70 0x000000000095d000 in ?? ()
#71 0xffffffffb1d229b0 in ?? ()
#72 0x0000000000000104 in ?? ()
#73 0x0000000000000000 in ?? ()
#74 0xffffff007b78aba0 in ?? ()
#75 0xffffff007b7af280 in ?? ()
#76 0xffffffffb1d226e8 in ?? ()
#77 0xffffff007b76d000 in ?? ()
#78 0xffffffff80355d5c in sched_switch (td=0x0, newtd=0x0, flags=1) at /usr/src/sys/kern/sched_4bsd.c:881
#79 0x0000000000000000 in ?? ()
#80 0x0000000000000000 in ?? ()
#81 0x0000000000000000 in ?? ()
#82 0x0000000000000000 in ?? ()
#83 0x0000000000000000 in ?? ()
#84 0x0000000000000000 in ?? ()
#85 0x0000000000000000 in ?? ()
#86 0x0000000000000000 in ?? ()
#87 0x0000000000000000 in ?? ()
#88 0x0000000000000000 in ?? ()
#89 0x0000000000000000 in ?? ()
#90 0x0000000000000000 in ?? ()
#91 0x0000000000000000 in ?? ()
#92 0x0000000000000000 in ?? ()
#93 0x0000000000000000 in ?? ()
#94 0x0000000000000000 in ?? ()
#95 0x0000000000000000 in ?? ()
#96 0x0000000000000000 in ?? ()
#97 0x0000000000000000 in ?? ()
#98 0x0000000000000000 in ?? ()
#99 0x0000000000000000 in ?? ()
#100 0x0000000000000000 in ?? ()
#101 0x0000000000000000 in ?? ()
#102 0x0000000000000000 in ?? ()
#103 0x0000000000000000 in ?? ()
#104 0x0000000000000000 in ?? ()
#105 0x0000000000000000 in ?? ()
#106 0x0000000000000000 in ?? ()
---Type <return> to continue, or q <return> to quit---
#107 0x0000000000000000 in ?? ()
#108 0x0000000000000000 in ?? ()
#109 0x0000000000000000 in ?? ()
#110 0x0000000000000000 in ?? ()
#111 0x0000000000000000 in ?? ()
#112 0x0000000000000000 in ?? ()
#113 0x0000000000000000 in ?? ()
#114 0x0000000000000000 in ?? ()
#115 0x0000000000000000 in ?? ()
#116 0x0000000000000000 in ?? ()
#117 0x0000000000000000 in ?? ()
#118 0x0000000000000000 in ?? ()
#119 0x0000000000000000 in ?? ()
#120 0x0000000000000000 in ?? ()
#121 0x0000000000000000 in ?? ()
#122 0x0000000000000000 in ?? ()
#123 0x0000000000000000 in ?? ()
#124 0x0000000000000000 in ?? ()
#125 0x0000000000000000 in ?? ()
#126 0x0000000000000000 in ?? ()
#127 0x0000000000000000 in ?? ()
#128 0x0000000000000000 in ?? ()
#129 0x0000000000000000 in ?? ()
#130 0x0000000000000000 in ?? ()
#131 0x0000000000000000 in ?? ()
#132 0x0000000000000000 in ?? ()
#133 0x0000000000000000 in ?? ()
#134 0x0000000000000000 in ?? ()
#135 0x0000000000000000 in ?? ()
#136 0x0000000000000000 in ?? ()
#137 0x0000000000000000 in ?? ()
#138 0x0000000000000000 in ?? ()
#139 0x0000000000000000 in ?? ()
#140 0x0000000000000000 in ?? ()
#141 0x0000000000000000 in ?? ()
#142 0x0000000000000000 in ?? ()
#143 0x0000000000000000 in ?? ()
#144 0x0000000000000000 in ?? ()
#145 0x0000000000000000 in ?? ()
#146 0x0000000000000000 in ?? ()
#147 0x0000000000000000 in ?? ()
#148 0x0000000000000000 in ?? ()
#149 0x0000000000000000 in ?? ()
#150 0x0000000000000000 in ?? ()
Cannot access memory at address 0xffffffffb1d23000
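
A note for anyone reading the trap frame in frame #6: kgdb prints the tf_* fields as signed decimals, which makes them hard to compare with the hexadecimal addresses elsewhere in the backtrace. The following is only a minimal Python sketch for converting them to unsigned 64-bit hex. The helper name to_hex64 is made up for this illustration and is not part of gdb or FreeBSD; the values are copied from the frame above.

```python
# Convert gdb's signed-decimal register values to unsigned 64-bit hex.
# to_hex64 is a hypothetical helper name, not a gdb or FreeBSD facility.
MASK64 = (1 << 64) - 1


def to_hex64(value: int) -> str:
    """Return value as a zero-padded unsigned 64-bit hex string."""
    return f"0x{value & MASK64:016x}"


# Values copied from frame #6 of the backtrace above.
trap_frame = {
    "tf_rip": -2144020582,
    "tf_rsi": -1097440243712,
    "tf_addr": 136,
    "tf_trapno": 12,
}

for name, value in trap_frame.items():
    print(f"{name:10s} {to_hex64(value)}")
```

Read this way, tf_rip comes out as 0xffffffff8034d79a, the same address kgdb shows for thread_fini in frame #27, and tf_rsi comes out as 0xffffff007b76d000, the same value as the fmt pointer passed to panic in frame #3. tf_trapno = 12 is the page fault trap number, and tf_addr is 0x88. Whether that low address points to a NULL pointer dereference is only a guess, given that the dump looks partly overwritten.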


$ dmesg
Copyright (c) 1992-2005 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD 5.4-RELEASE-p1 #9: Fri Jun  3 22:26:49 CEST 2005
    girgen@melon.pingpong.net:/usr/obj/usr/src/sys/MELON
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) Xeon(TM) CPU 2.80GHz (2793.01-MHz K8-class CPU)
  Origin = "GenuineIntel"  Id = 0xf41  Stepping = 1
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0x641d<SSE3,RSVD2>,MON,DS_CPL,CNTX-ID,CX16,<b14>>
  AMD Features=0x20100800<SYSCALL,NX,LM>
real memory  = 2147221504 (2047 MB)
avail memory = 2061885440 (1966 MB)
ACPI APIC Table: <DELL   PE BKC  >
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
 cpu0 (BSP): APIC ID:  0
 cpu1 (AP): APIC ID:  6
ioapic0: Changing APIC ID to 7
ioapic1: Changing APIC ID to 8
ioapic1: WARNING: intbase 32 != expected base 24
ioapic2: Changing APIC ID to 9
ioapic2: WARNING: intbase 64 != expected base 56
ioapic0 <Version 2.0> irqs 0-23 on motherboard
ioapic1 <Version 2.0> irqs 32-55 on motherboard
ioapic2 <Version 2.0> irqs 64-87 on motherboard
acpi0: <DELL PE BKC> on motherboard
acpi0: Power Button (fixed)
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0
cpu0: <ACPI CPU> on acpi0
cpu1: <ACPI CPU> on acpi0
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
pcib1: <ACPI PCI-PCI bridge> at device 2.0 on pci0
pci1: <ACPI PCI bus> on pcib1
pcib2: <ACPI PCI-PCI bridge> at device 0.0 on pci1
pci2: <ACPI PCI bus> on pcib2
amr0: <LSILogic MegaRAID 1.51> mem 0xdfdc0000-0xdfdfffff,0xd80f0000-0xd80fffff irq 46 at device 14.0 on pci2
amr0: <LSILogic PERC 4e/Di> Firmware 516A, BIOS H418, 256MB RAM
pcib3: <ACPI PCI-PCI bridge> at device 0.2 on pci1
pci3: <ACPI PCI bus> on pcib3
pcib4: <ACPI PCI-PCI bridge> at device 4.0 on pci0
pci4: <ACPI PCI bus> on pcib4
pcib5: <ACPI PCI-PCI bridge> at device 5.0 on pci0
pci5: <ACPI PCI bus> on pcib5
pcib6: <ACPI PCI-PCI bridge> at device 0.0 on pci5
pci6: <ACPI PCI bus> on pcib6
em0: <Intel(R) PRO/1000 Network Connection, Version - 1.7.35> port 0xecc0-0xecff mem 0xdfae0000-0xdfafffff irq 64 at device 7.0 on pci6
em0: Ethernet address: 00:11:43:37:a4:9e
em0:  Speed:N/A  Duplex:N/A
pcib7: <ACPI PCI-PCI bridge> at device 0.2 on pci5
pci7: <ACPI PCI bus> on pcib7
em1: <Intel(R) PRO/1000 Network Connection, Version - 1.7.35> port 0xdcc0-0xdcff mem 0xdf8e0000-0xdf8fffff irq 65 at device 8.0 on pci7
em1: Ethernet address: 00:11:43:37:a4:9f
em1:  Speed:N/A  Duplex:N/A
pcib8: <ACPI PCI-PCI bridge> at device 6.0 on pci0
pci8: <ACPI PCI bus> on pcib8
pci0: <serial bus, USB> at device 29.0 (no driver attached)
pci0: <serial bus, USB> at device 29.1 (no driver attached)
pci0: <serial bus, USB> at device 29.2 (no driver attached)
pci0: <serial bus, USB> at device 29.7 (no driver attached)
pcib9: <ACPI PCI-PCI bridge> at device 30.0 on pci0
pci9: <ACPI PCI bus> on pcib9
pci9: <display, VGA> at device 13.0 (no driver attached)
isab0: <PCI-ISA bridge> at device 31.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <Intel ICH5 UDMA100 controller> port 0xfc00-0xfc0f,0x376,0x170-0x177,0x3f6,0x1f0-0x1f7 at device 31.1 on pci0
ata0: channel #0 on atapci0
ata1: channel #1 on atapci0
fdc0: <floppy drive controller> port 0x3f7,0x3f0-0x3f5 irq 6 drq 2 on acpi0
atkbdc0: <Keyboard controller (i8042)> port 0x64,0x60 irq 1 on acpi0
atkbd0: <AT Keyboard> flags 0x1 irq 1 on atkbdc0
kbd0 at atkbd0
sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
sio0: type 16550A
orm0: <ISA Option ROMs> at iomem 0xec000-0xeffff,0xce800-0xcf7ff,0xcb000-0xcbfff,0xc0000-0xcafff on isa0
ppc0: cannot reserve I/O port range
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
sio1: configured irq 3 not in bitmap of probed irqs 0
sio1: port may not be enabled
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
Timecounters tick every 1.000 msec
acd0: CDROM <TEAC CD-ROM CD-224E/K.9A> at ata0-master PIO4
amrd0: <LSILogic MegaRAID logical drive> on amr0
amrd0: 139760MB (286228480 sectors) RAID 5 (optimal)
ses0 at amr0 bus 0 target 6 lun 0
ses0: <PE/PV 1x6 SCSI BP 1.0> Fixed Processor SCSI-2 device
ses0: SAF-TE Compliant Device
SMP: AP CPU #1 Launched!
Mounting root from ufs:/dev/amrd0s2a
WARNING: / was not properly dismounted
WARNING: /misc was not properly dismounted
/misc: mount pending error: blocks 7368 files 5
WARNING: /usr was not properly dismounted
WARNING: /usr/local was not properly dismounted
/usr/local: mount pending error: blocks 204 files 1
WARNING: /var was not properly dismounted
/var: mount pending error: blocks 1344 files 86
WARNING: /var/spool/imap was not properly dismounted
em1: Link is up 100 Mbps Half Duplex
em0: Link is up 1000 Mbps Full Duplex



There is nothing at all in /etc/make.conf.

The kernel is GENERIC with SMP added. I removed USB, since I got an interrupt
storm and don't need it, and also removed FireWire. Diff against GENERIC:

$ diff -u GENERIC MELON
--- GENERIC     Tue Apr 12 15:57:01 2005
+++ MELON       Fri Jun  3 20:13:03 2005
@@ -20,7 +20,9 @@

 machine                amd64
 cpu            HAMMER
-ident          GENERIC
+ident          MELON
+
+makeoptions     DEBUG=-g

 # To statically compile in device wiring instead of /boot/device.hints
 #hints         "GENERIC.hints"         # Default places to look for devices.
@@ -64,10 +66,10 @@

 # Enabling NO_MIXED_MODE gives a performance improvement on some motherboards
 # but does not work with some boards (mostly nVidia chipset based).
-#options       NO_MIXED_MODE   # Don't penalize working chipsets
+options        NO_MIXED_MODE   # Don't penalize working chipsets

 # Linux 32-bit ABI support
-options        LINPROCFS               # Cannot be a module yet.
+#options       LINPROCFS               # Cannot be a module yet.

 # Bus support.  Do not remove isa, even if you have no isa slots
 device         acpi
@@ -234,29 +236,23 @@
 # Note that 'bpf' is required for DHCP.
 device         bpf             # Berkeley packet filter

-# USB support
-device         uhci            # UHCI PCI->USB interface
-device         ohci            # OHCI PCI->USB interface
-#device                ehci            # EHCI PCI->USB interface (USB 2.0)
-device         usb             # USB Bus (required)
-#device                udbp            # USB Double Bulk Pipe devices
-device         ugen            # Generic
-device         uhid            # "Human Interface Devices"
-device         ukbd            # Keyboard
-device         ulpt            # Printer
-device         umass           # Disks/Mass storage - Requires scbus and da
-device         ums             # Mouse
-device         urio            # Diamond Rio 500 MP3 player
-device         uscanner        # Scanners
-# USB Ethernet, requires mii
-device         aue             # ADMtek USB Ethernet
-device         axe             # ASIX Electronics USB Ethernet
-device         cdce            # Generic USB over Ethernet
-device         cue             # CATC USB Ethernet
-device         kue             # Kawasaki LSI USB Ethernet
-device         rue             # RealTek RTL8150 USB Ethernet
-
-# FireWire support
-device         firewire        # FireWire bus code
-device         sbp             # SCSI over FireWire (Requires scbus and da)
-device         fwe             # Ethernet over FireWire (non-standard!)
+# SMP
+options                SMP
+
+# SysV stuff
+# This provides support for System V shared memory.
+#
+options                SYSVSHM
+options                SYSVSEM
+options                SYSVMSG
+options                SHMMAXPGS=65536
+options                SEMMNI=3D40
+options                SEMMNS=3D240
+options                SEMUME=3D40
+options                SEMMNU=3D120
+
+# Debug stuff, temporary
+options                KDB
+options                KDB_TRACE
+options                KDB_UNATTENDED




