Date: Thu, 10 Apr 2003 13:08:52 +1000
From: Christopher Smith <csmith@its.uq.edu.au>
To: freebsd-stable@freebsd.org
Subject: Panics on 4.7 system
Message-ID: <C24730E0-6B01-11D7-A788-000502F96668@its.uq.edu.au>
After some dialog with Terry Lambert on -hackers, I've been advised to post this here.

I have a 4.7-RELEASE-p10 box that is suffering regular kernel panics. The machine is a Dell 2650 running primarily as a file/print server to a number of computer labs of about 400 machines (although it also functions as a rembo image server and squid proxy). It mainly stores applications, which are run off a samba share, and user home directories (again, accessed via samba).

It has a largish filesystem (~200G) on a Powervault 220 attached via a PERC3/DC controller (amr) that most of the data is stored on. The OS is on a pair of internal 18G drives attached to the internal PERC3/Di controller (aac). It is attached to the network with a Netgear GA620 fibre NIC (ti).

The panic is being triggered by the "find" run in /etc/periodic/daily/100.clean-disks. Disabling this script has, for the moment, circumvented the problem - although from what I can gather it is a kernel bug.

The machine is scheduled to be updated to 4.8 in two weeks. If anyone knows if this issue has been resolved already, please let me know. If it hasn't, or the status is unknown, I'd be quite happy to re-enable the daily script triggering the problem once the system has been upgraded and provide the necessary crash dumps, etc. to help solve it. Terry tells me it has been fixed in -current.

Here is the relevant system info. If I've forgotten anything, or there is anything more anyone needs to help fix the problem, please let me know.

leela# uname -a
FreeBSD leela.lab.bel.uq.edu.au 4.7-RELEASE-p10 FreeBSD 4.7-RELEASE-p10 #0: Mon Apr 7 10:34:08 EST 2003 root@leela.lab.bel.uq.edu.au:/usr/src/sys/compile/LEELA i386
leela#
leela# cat /var/run/dmesg.boot
Copyright (c) 1992-2002 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD 4.7-RELEASE-p10 #0: Mon Apr 7 10:34:08 EST 2003
root@leela.lab.bel.uq.edu.au:/usr/src/sys/compile/LEELA
Timecounter "i8254" frequency 1193182 Hz
CPU: Pentium 4 (2392.26-MHz 686-class CPU)
Origin = "GenuineIntel" Id = 0xf27 Stepping = 7
Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,<b28>,ACC,<b31>>
real memory = 2147418112 (2097088K bytes)
avail memory = 2088574976 (2039624K bytes)
Changing APIC ID for IO APIC #0 from 0 to 4 on chip
Changing APIC ID for IO APIC #1 from 0 to 5 on chip
Changing APIC ID for IO APIC #2 from 0 to 6 on chip
Programming 16 pins in IOAPIC #0
IOAPIC #0 intpin 2 -> irq 0
Programming 16 pins in IOAPIC #1
Programming 16 pins in IOAPIC #2
FreeBSD/SMP: Multiprocessor motherboard
cpu0 (BSP): apic id: 0, version: 0x00050014, at 0xfee00000
cpu1 (AP): apic id: 2, version: 0x00050014, at 0xfee00000
io0 (APIC): apic id: 4, version: 0x000f0011, at 0xfec00000
io1 (APIC): apic id: 5, version: 0x000f0011, at 0xfec01000
io2 (APIC): apic id: 6, version: 0x000f0011, at 0xfec02000
Preloaded elf kernel "kernel" at 0xc030d000.
Pentium Pro MTRR support enabled
md0: Malloc disk
Using $PIR table, 9 entries at 0xc00fc480
npx0: <math processor> on motherboard
npx0: INT 16 interface
pcib0: <Host to PCI bridge> on motherboard
IOAPIC #1 intpin 3 -> irq 2
IOAPIC #1 intpin 7 -> irq 7
IOAPIC #1 intpin 11 -> irq 10
pci0: <PCI bus> on pcib0
pci0: <unknown card> (vendor=0x1028, dev=0x000c) at 4.0 irq 2
pci0: <unknown card> (vendor=0x1028, dev=0x0008) at 4.1 irq 7
pci0: <unknown card> (vendor=0x1028, dev=0x000d) at 4.2 irq 10
pci0: <ATI Mach64-GR graphics accelerator> at 14.0
atapci0: <ServerWorks CSB5 ATA100 controller> port 0x8b0-0x8bf,0x8d8-0x8db,0x8d0-0x8d7,0x8c8-0x8cb,0x8c0-0x8c7 at device 15.1 on pci0
ata0: at 0x1f0 irq 14 on atapci0
ata1: at 0x170 irq 15 on atapci0
pci0: <OHCI USB controller> at 15.2 irq 5
isab0: <PCI to ISA bridge (vendor=1166 device=0225)> at device 15.3 on pci0
isa0: <ISA bus> on isab0
pcib1: <Host to PCI bridge> on motherboard
IOAPIC #1 intpin 0 -> irq 11
pci1: <PCI bus> on pcib1
ti0: <Netgear GA620 1000baseSX Gigabit Ethernet> mem 0xfcf00000-0xfcf03fff irq 11 at device 6.0 on pci1
ti0: Ethernet address: 00:02:e3:00:0d:c6
pcib2: <Host to PCI bridge> on motherboard
pci2: <PCI bus> on pcib2
pcib8: <PCI to PCI bridge (vendor=8086 device=b154)> at device 6.0 on pci2
IOAPIC #1 intpin 9 -> irq 13
pci3: <PCI bus> on pcib8
pcib9: <PCI to PCI bridge (vendor=8086 device=b154)> at device 0.0 on pci3
IOAPIC #1 intpin 8 -> irq 16
pci4: <PCI bus> on pcib9
amr0: <AMI MegaRAID> mem 0xf0000000-0xf7ffffff irq 16 at device 0.0 on pci4
amr0: <PERC 3/DC> Firmware 1.74, BIOS 3.27, 128MB RAM
pci3: <unknown card> (vendor=0x1077, dev=0x1216) at 1.0 irq 13
pcib3: <Host to PCI bridge> on motherboard
IOAPIC #1 intpin 12 -> irq 17
IOAPIC #1 intpin 13 -> irq 18
pci5: <PCI bus> on pcib3
bge0: <Broadcom BCM5701 Gigabit Ethernet> mem 0xeff10000-0xeff1ffff irq 17 at device 6.0 on pci5
bge0: Ethernet address: 00:06:5b:f3:09:7d
miibus0: <MII bus> on bge0
brgphy0: <BCM5701 10/100/1000baseTX PHY> on miibus0
brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX, 1000baseTX-FDX, auto
bge1: <Broadcom BCM5701 Gigabit Ethernet> mem 0xeff00000-0xeff0ffff irq 18 at device 8.0 on pci5
bge1: Ethernet address: 00:06:5b:f3:09:7e
miibus1: <MII bus> on bge1
brgphy1: <BCM5701 10/100/1000baseTX PHY> on miibus1
brgphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX, 1000baseTX-FDX, auto
pcib4: <Host to PCI bridge> on motherboard
IOAPIC #1 intpin 14 -> irq 19
pci6: <PCI bus> on pcib4
pcib10: <PCI to PCI bridge (vendor=8086 device=0309)> at device 8.0 on pci6
IOAPIC #1 intpin 15 -> irq 20
pci7: <PCI bus> on pcib10
pci7: <unknown card> (vendor=0x9005, dev=0x00c5) at 6.0 irq 19
pci7: <unknown card> (vendor=0x9005, dev=0x00c5) at 6.1 irq 20
aac0: <Dell PERC 3/Di> mem 0xe0000000-0xe7ffffff irq 19 at device 8.1 on pci6
aac0: i960RX 100MHz, 118MB cache memory, optional battery present
aac0: Kernel 2.7-1, Build 3170, S/N 9c38d3
pcib5: <Host to PCI bridge> on motherboard
pci8: <PCI bus> on pcib5
pcib6: <Host to PCI bridge> on motherboard
pci9: <PCI bus> on pcib6
pcib7: <Host to PCI bridge> on motherboard
pci10: <PCI bus> on pcib7
orm0: <Option ROMs> at iomem 0xc0000-0xc7fff,0xc8000-0xcbfff,0xec000-0xeffff on isa0
fdc0: <NEC 72065B or clone> at port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on isa0
fdc0: FIFO enabled, 8 bytes threshold
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
atkbdc0: <Keyboard controller (i8042)> at port 0x60,0x64 on isa0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
sio0: type 16550A
APIC_IO: Testing 8254 interrupt delivery
APIC_IO: Broken MP table detected: 8254 is not connected to IOAPIC #0 intpin 2
APIC_IO: routing 8254 via 8259 and IOAPIC #0 intpin 0
ata0-slave: ATAPI identify retries exceeded
SMP: AP CPU #1 Launched!
acd0: CDROM <TEAC CD-ROM CD-224E> at ata0-master PIO4
amrd0: <MegaRAID logical drive> on amr0
amrd0: 209634MB (429330432 sectors) RAID 5 (optimal)
aacd0: <RAID 1 (Mirror)> on aac0
aacd0: 17355MB (35544576 sectors)
Mounting root from ufs:/dev/aacd0s1a
WARNING: / was not properly dismounted
leela#
leela# gdb -k /kernel.debug /export/crash/vmcore.1
GNU gdb 4.18 (FreeBSD)
Copyright 1998 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-unknown-freebsd"...Deprecated bfd_read called at /usr/src/gnu/usr.bin/binutils/gdb/../../../../contrib/gdb/gdb/dbxread.c line 2627 in elfstab_build_psymtabs
Deprecated bfd_read called at /usr/src/gnu/usr.bin/binutils/gdb/../../../../contrib/gdb/gdb/dbxread.c line 933 in fill_symbuf

SMP 2 cpus
IdlePTD at phsyical address 0x0032c000
initial pcb at physical address 0x002a29e0
panicstr: page fault
panic messages:
---
Fatal trap 12: page fault while in kernel mode
mp_lock = 00000002; cpuid = 0; lapic.id = 00000000
fault virtual address = 0x18
fault code = supervisor write, page not present
instruction pointer = 0x8:0xc01e1725
stack pointer = 0x10:0xfd05bc50
frame pointer = 0x10:0xfd05bc54
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 16458 (find)
interrupt mask = none <- SMP: XXX
trap number = 12
panic: page fault
mp_lock = 00000002; cpuid = 0; lapic.id = 00000000
boot() called on cpu#0

syncing disks... 10
done
Uptime: 23h45m19s
amr0: flushing cache...done

dumping to dev #aacd/0x20001, offset 4194432
dump 2047 2046 2045 2044 2043 2042 2041 2040 2039 2038 2037 2036 2035 2034 2033 2032 2031 2030 2029 2028 2027 2026 [...]
39 38 37 36 35 34 33 32 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 succeeded
aac0: shutting down controller...
---
#0 dumpsys () at ../../kern/kern_shutdown.c:487
487 if (dumping++) {
(kgdb) where
#0 dumpsys () at ../../kern/kern_shutdown.c:487
#1 0xc0163cf0 in boot (howto=256) at ../../kern/kern_shutdown.c:316
#2 0xc0164171 in panic (fmt=0xc0251c79 "%s") at ../../kern/kern_shutdown.c:595
#3 0xc0214e46 in trap_fatal (frame=0xfd05bc10, eva=24) at ../../i386/i386/trap.c:974
#4 0xc0214a99 in trap_pfault (frame=0xfd05bc10, usermode=0, eva=24) at ../../i386/i386/trap.c:867
#5 0xc02145df in trap (frame={tf_fs = 24, tf_es = -1071775728, tf_ds = -1070989296, tf_edi = 1, tf_esi = 0, tf_ebp = -49955756, tf_isp = -49955780, tf_ebx = 2, tf_edx = 0, tf_ecx = 1, tf_eax = 2, tf_trapno = 12, tf_err = 2, tf_eip = -1071769819, tf_cs = 8, tf_eflags = 66118, tf_esp = 2, tf_ss = -49955720}) at ../../i386/i386/trap.c:466
#6 0xc01e1725 in _vm_object_allocate (type=2, size=1, object=0x0) at ../../vm/vm_object.c:158
#7 0xc01e18c4 in vm_object_allocate (type=2, size=1) at ../../vm/vm_object.c:241
#8 0xc01e753d in vnode_pager_alloc (handle=0xff7fce00, size=512, prot=0, offset=0) at ../../vm/vnode_pager.c:145
#9 0xc018ffc9 in vop_stdcreatevobject (ap=0xfd05bd64) at ../../kern/vfs_default.c:526
#10 0xc018fc35 in vop_defaultop (ap=0xfd05bd64) at ../../kern/vfs_default.c:150
#11 0xc01d7ef1 in ufs_vnoperate (ap=0xfd05bd64) at ../../ufs/ufs/ufs_vnops.c:2422
#12 0xc01943c2 in vfs_object_create (vp=0xff7fce00, p=0xfcee6d00, cred=0xc74ba800) at vnode_if.h:1383
#13 0xc0190af1 in namei (ndp=0xfd05bec4) at ../../kern/vfs_lookup.c:171
#14 0xc0199c85 in vn_open (ndp=0xfd05bec4, fmode=5, cmode=2180) at ../../kern/vfs_vnops.c:138
#15 0xc0195c98 in open (p=0xfcee6d00, uap=0xfd05bf80) at ../../kern/vfs_syscalls.c:1028
#16 0xc0215195 in syscall2 (frame={tf_fs = 134545455, tf_es = 134545455, tf_ds = -1078001617, tf_edi = 134597120, tf_esi = -1077937628, tf_ebp = -1077937532, tf_isp = -49954860, tf_ebx = 672099916, tf_edx = 134597184, tf_ecx = 134557696, tf_eax = 5, tf_trapno = 0, tf_err = 2, tf_eip = 672006988, tf_cs = 31, tf_eflags = 663, tf_esp = -1077937960, tf_ss = 47}) at ../../i386/i386/trap.c:1175
#17 0xc0201e5b in Xint0x80_syscall ()
#18 0x280a1f60 in ?? ()
#19 0x280a1b46 in ?? ()
#20 0x80496ca in ?? ()
#21 0x804b6c8 in ?? ()
#22 0x8049377 in ?? ()
(kgdb) list *_vm_object_allocate+158
0xc01e17b6 is in _vm_object_allocate (../../vm/vm_object.c:187).
182      * Try to generate a number that will spread objects out in the
183      * hash table. We 'wipe' new objects across the hash in 128 page
184      * increments plus 1 more to offset it a little more by the time
185      * it wraps around.
186      */
187     object->hash_rand = object_hash_rand - 129;
188
189     object->generation++;
190
191     TAILQ_INSERT_TAIL(&vm_object_list, object, object_list);
(kgdb) list *vm_object_allocate+241
0xc01e1991 is in vm_object_deallocate (../../vm/vm_object.c:325).
320
321     if (object->ref_count == 0) {
322         panic("vm_object_deallocate: object deallocated too many times: %d", object->type);
323     } else if (object->ref_count > 2) {
324         object->ref_count--;
325         return;
326     }
327
328     /*
329      * Here on ref_count of one or two, which are special cases for
(kgdb) list *vnode_pager_alloc+145
0xc01e751d is in vnode_pager_alloc (../../vm/vnode_pager.c:141).
136     }
137
138     if (vp->v_usecount == 0)
139         panic("vnode_pager_alloc: no vnode reference");
140
141     if (object == NULL) {
142         /*
143          * And an object of the appropriate size
144          */
145         object = vm_object_allocate(OBJT_VNODE, OFF_TO_IDX(round_page(size)));
(kgdb)

Cheers,

--
+- Christopher Smith, Systems Administrator ------------------------------+
| Server & Security Group, Information Technology Services |
| The University of Queensland, Brisbane, Australia, 4072 |
+- Ph +61 7 3365 4046 | email csmith@its.uq.edu.au | Fax +61 7 3365 4065 -+