Date: Thu, 1 Jun 2017 15:53:36 +0200
From: Raimo Niskanen <raimo+freebsd@erix.ericsson.se>
To: <freebsd-questions@freebsd.org>
Subject: Re: Advice on kernel panics
Message-ID: <20170601135336.GD2256@erix.ericsson.se>
In-Reply-To: <20170529092043.GA89682@erix.ericsson.se>
References: <20170529092043.GA89682@erix.ericsson.se>
Hello again.

I gave too few details in my original post; this concerns a Dell PowerEdge
R320 server with the motherboard disk controller and a ZFS-only install.
The dmesg output is at the end of this mail.

On Mon, May 29, 2017 at 11:20:43AM +0200, Raimo Niskanen wrote:
> Hello list.
>
> I have a server that panics about every 3 days and need some advice on how
> to handle that.
>
> It currently has 7 dumps in /var/crash/, head of the latest core.txt.4
> looks like this:
>
>
> =======
> sasquatch.otp.ericsson.se dumped core - see /var/crash/vmcore.4
>
> Mon May 29 03:15:32 CEST 2017
>
> FreeBSD sasquatch.otp.ericsson.se 10.3-RELEASE-p18 FreeBSD 10.3-RELEASE-p18
> #0: Tue Apr 11 10:31:00 UTC 2017
> root@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC amd64
>
> panic: page fault
>
> GNU gdb 6.1.1 [FreeBSD]
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain
> conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB. Type "show warranty" for details.
> This GDB was configured as "amd64-marcel-freebsd"...
>
> Unread portion of the kernel message buffer:
>
>
> Fatal trap 12: page fault while in kernel mode
> cpuid = 0; apic id = 00
> fault virtual address   = 0x0
> fault code              = supervisor write data, page not present
> instruction pointer     = 0x20:0xffffffff809fb017
> stack pointer           = 0x28:0xfffffe04673a18c0
> frame pointer           = 0x28:0xfffffe04673a1900
> code segment            = base 0x0, limit 0xfffff, type 0x1b
>                         = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags        = interrupt enabled, resume, IOPL = 0
> current process         = 18 (syncer)
> trap number             = 12
> panic: page fault
> cpuid = 0
> KDB: stack backtrace:
> #0 0xffffffff8098e7e0 at kdb_backtrace+0x60
> #1 0xffffffff809514b6 at vpanic+0x126
> #2 0xffffffff80951383 at panic+0x43
> #3 0xffffffff80d5646b at trap_fatal+0x36b
> #4 0xffffffff80d5676d at trap_pfault+0x2ed
> #5 0xffffffff80d55dea at trap+0x47a
> #6 0xffffffff80d3bdb2 at calltrap+0x8
> #7 0xffffffff809f9b23 at vfs_msync+0x203
> #8 0xffffffff809fb858 at sync_fsync+0x108
> #9 0xffffffff80e81ed7 at VOP_FSYNC_APV+0xa7
> #10 0xffffffff809fc27b at sched_sync+0x3ab
> #11 0xffffffff8091a93a at fork_exit+0x9a
> #12 0xffffffff80d3c2ee at fork_trampoline+0xe
> Uptime: 2d19h53m15s
> =======
>
>
> What sticks out later in core.txt.4 is the fstat section that contains a
> lot of errors, but I can not tell if that is just a secondary symptom...
>
> Looks like this:
> =======
> fstat
>
> fstat: can't read file 1 at 0x200007fffffffff
> fstat: can't read file 2 at 0x4000000001fffff
> fstat: can't read znode_phys at 0x1
> fstat: can't read znode_phys at 0x1
> fstat: can't read znode_phys at 0x1
> :
> USER     CMD          PID   FD MOUNT      INUM MODE         SZ|DV R/W
> root     sed        78401 root -             -       error    -
> root     sed        78401   wd -             -       error    -
> root     sed        78401 text -             -       error    -
> root     sed        78401   0* pipe fffff8001800f000 <-> fffff8001800f160      0 rw
> root     grep       78400 root -             -       error    -
> root     grep       78400   wd -             -       error    -
> root     grep       78400 text -             -       error    -
> :
> =======
>
> To me the other core.txt.? files does not look exactly the same. All have
> an fstat section with many errors, though.
>
> Does anyone have some advice on how to proceed?
> --

Copyright (c) 1992-2016 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 10.3-RELEASE-p18 #0: Tue Apr 11 10:31:00 UTC 2017
    root@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC amd64
FreeBSD clang version 3.4.1 (tags/RELEASE_34/dot1-final 208032) 20140512
CPU: Intel(R) Xeon(R) CPU E5-2407 v2 @ 2.40GHz (2400.06-MHz K8-class CPU)
  Origin="GenuineIntel" Id=0x306e4 Family=0x6 Model=0x3e Stepping=4
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0x7fbee3ff<SSE3,PCLMULQDQ,DTES64,MON,DS_CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,DCA,SSE4.1,SSE4.2,x2APIC,POPCNT,TSCDLT,AESNI,XSAVE,OSXSAVE,AVX,F16C,RDRAND>
  AMD Features=0x2c100800<SYSCALL,NX,Page1GB,RDTSCP,LM>
  AMD Features2=0x1<LAHF>
  Structured Extended Features=0x281<FSGSBASE,SMEP,ERMS>
  XSAVE Features=0x1<XSAVEOPT>
  VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID,VID,PostIntr
  TSC: P-state invariant, performance statistics
real memory  = 12884901888 (12288 MB)
avail memory = 12380942336 (11807 MB)
Event timer "LAPIC" quality 600
ACPI APIC Table: <DELL PE_SC3 >
FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
FreeBSD/SMP: 1 package(s) x 4 core(s)
 cpu0 (BSP): APIC ID: 0
 cpu1 (AP): APIC ID: 2
 cpu2 (AP): APIC ID: 4
 cpu3 (AP): APIC ID: 6
random: <Software, Yarrow> initialized
ioapic1: Changing APIC ID to 1
ioapic0 <Version 2.0> irqs 0-23 on motherboard
ioapic1 <Version 2.0> irqs 32-55 on motherboard
kbd1 at kbdmux0
acpi0: <DELL PE_SC3> on motherboard
acpi0: Power Button (fixed)
cpu0: <ACPI CPU> on acpi0
cpu1: <ACPI CPU> on acpi0
cpu2: <ACPI CPU> on acpi0
cpu3: <ACPI CPU> on acpi0
atrtc0: <AT realtime clock> port 0x70-0x7f irq 8 on acpi0
Event timer "RTC" frequency 32768 Hz quality 0
attimer0: <AT timer> port 0x40-0x5f irq 0 on acpi0
Timecounter "i8254" frequency 1193182 Hz quality 0
Event timer "i8254" frequency 1193182 Hz quality 100
hpet0: <High Precision Event Timer> iomem 0xfed00000-0xfed003ff on acpi0
Timecounter "HPET" frequency 14318180 Hz quality 950
Event timer "HPET" frequency 14318180 Hz quality 550
Event timer "HPET1" frequency 14318180 Hz quality 440
Event timer "HPET2" frequency 14318180 Hz quality 440
Event timer "HPET3" frequency 14318180 Hz quality 440
Event timer "HPET4" frequency 14318180 Hz quality 440
Timecounter "ACPI-fast" frequency 3579545 Hz quality 900
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
pcib1: <ACPI PCI-PCI bridge> irq 53 at device 1.0 on pci0
pci1: <ACPI PCI bus> on pcib1
pcib2: <ACPI PCI-PCI bridge> irq 53 at device 3.0 on pci0
pci8: <ACPI PCI bus> on pcib2
bge0: <Broadcom NetXtreme Gigabit Ethernet, ASIC rev. 0x5720000> mem 0xd90a0000-0xd90affff,0xd90b0000-0xd90bffff,0xd90c0000-0xd90cffff irq 48 at device 0.0 on pci8
bge0: APE FW version: NCSI v1.2.33.0
bge0: CHIP ID 0x05720000; ASIC REV 0x5720; CHIP REV 0x57200; PCI-E
miibus0: <MII bus> on bge0
brgphy0: <BCM5720C 1000BASE-T media interface> PHY 1 on miibus0
brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
bge0: Using defaults for TSO: 65518/35/2048
bge0: Ethernet address: 00:0a:f7:52:b1:1a
bge1: <Broadcom NetXtreme Gigabit Ethernet, ASIC rev. 0x5720000> mem 0xd90d0000-0xd90dffff,0xd90e0000-0xd90effff,0xd90f0000-0xd90fffff irq 52 at device 0.1 on pci8
bge1: APE FW version: NCSI v1.2.33.0
bge1: CHIP ID 0x05720000; ASIC REV 0x5720; CHIP REV 0x57200; PCI-E
miibus1: <MII bus> on bge1
brgphy1: <BCM5720C 1000BASE-T media interface> PHY 2 on miibus1
brgphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
bge1: Using defaults for TSO: 65518/35/2048
bge1: Ethernet address: 00:0a:f7:52:b1:1b
pcib3: <PCI-PCI bridge> irq 16 at device 17.0 on pci0
pci9: <PCI bus> on pcib3
pci0: <simple comms> at device 22.0 (no driver attached)
pci0: <simple comms> at device 22.1 (no driver attached)
ehci0: <Intel Patsburg USB 2.0 controller> mem 0xde8fd000-0xde8fd3ff irq 23 at device 26.0 on pci0
usbus0: EHCI version 1.0
usbus0 on ehci0
pcib4: <ACPI PCI-PCI bridge> at device 28.0 on pci0
pci10: <ACPI PCI bus> on pcib4
pcib5: <ACPI PCI-PCI bridge> irq 16 at device 28.4 on pci0
pci2: <ACPI PCI bus> on pcib5
bge2: <Broadcom NetXtreme Gigabit Ethernet, ASIC rev. 0x5720000> mem 0xd91a0000-0xd91affff,0xd91b0000-0xd91bffff,0xd91c0000-0xd91cffff irq 16 at device 0.0 on pci2
bge2: APE FW version: NCSI v1.2.33.0
bge2: CHIP ID 0x05720000; ASIC REV 0x5720; CHIP REV 0x57200; PCI-E
miibus2: <MII bus> on bge2
brgphy2: <BCM5720C 1000BASE-T media interface> PHY 1 on miibus2
brgphy2: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
bge2: Using defaults for TSO: 65518/35/2048
bge2: Ethernet address: c8:1f:66:bc:10:cd
bge3: <Broadcom NetXtreme Gigabit Ethernet, ASIC rev. 0x5720000> mem 0xd91d0000-0xd91dffff,0xd91e0000-0xd91effff,0xd91f0000-0xd91fffff irq 17 at device 0.1 on pci2
bge3: APE FW version: NCSI v1.2.33.0
bge3: CHIP ID 0x05720000; ASIC REV 0x5720; CHIP REV 0x57200; PCI-E
miibus3: <MII bus> on bge3
brgphy3: <BCM5720C 1000BASE-T media interface> PHY 2 on miibus3
brgphy3: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
bge3: Using defaults for TSO: 65518/35/2048
bge3: Ethernet address: c8:1f:66:bc:10:ce
pcib6: <ACPI PCI-PCI bridge> irq 19 at device 28.7 on pci0
pci3: <ACPI PCI bus> on pcib6
pcib7: <PCI-PCI bridge> at device 0.0 on pci3
pci4: <PCI bus> on pcib7
pcib8: <PCI-PCI bridge> at device 0.0 on pci4
pci5: <PCI bus> on pcib8
pcib9: <PCI-PCI bridge> at device 0.0 on pci5
pci6: <PCI bus> on pcib9
vgapci0: <VGA-compatible display> mem 0xd8000000-0xd8ffffff,0xddffc000-0xddffffff,0xdd000000-0xdd7fffff irq 19 at device 0.0 on pci6
vgapci0: Boot video device
pcib10: <PCI-PCI bridge> at device 1.0 on pci4
pci7: <PCI bus> on pcib10
ehci1: <Intel Patsburg USB 2.0 controller> mem 0xde8fe000-0xde8fe3ff irq 22 at device 29.0 on pci0
usbus1: EHCI version 1.0
usbus1 on ehci1
pcib11: <PCI-PCI bridge> at device 30.0 on pci0
pci11: <PCI bus> on pcib11
isab0: <PCI-ISA bridge> at device 31.0 on pci0
isa0: <ISA bus> on isab0
ahci0: <Intel Patsburg AHCI SATA controller> port 0xfce8-0xfcef,0xfcf8-0xfcfb,0xfcf0-0xfcf7,0xfcfc-0xfcff,0xfcc0-0xfcdf mem 0xde8ff000-0xde8ff7ff irq 20 at device 31.2 on pci0
ahci0: AHCI v1.30 with 6 3Gbps ports, Port Multiplier not supported
ahcich0: <AHCI channel> at channel 0 on ahci0
ahcich1: <AHCI channel> at channel 1 on ahci0
ahcich2: <AHCI channel> at channel 2 on ahci0
ahcich3: <AHCI channel> at channel 3 on ahci0
ahcich4: <AHCI channel> at channel 4 on ahci0
ahciem0: <AHCI enclosure management bridge> on ahci0
pcib12: <ACPI Host-PCI bridge> on acpi0
pci63: <ACPI PCI bus> on pcib12
pcib13: <ACPI Host-PCI bridge> on acpi0
pci127: <ACPI PCI bus> on pcib13
uart1: <16550 or compatible> port 0x2f8-0x2ff irq 3 on acpi0
uart0: <16550 or compatible> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
orm0: <ISA Option ROMs> at iomem 0xc0000-0xc7fff,0xec000-0xeffff on isa0
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
ppc0: cannot reserve I/O port range
est0: <Enhanced SpeedStep Frequency Control> on cpu0
est: CPU supports Enhanced Speedstep, but is not recognized.
est: cpu_vendor GenuineIntel, msr 1d4d00001800
device_attach: est0 attach returned 6
est1: <Enhanced SpeedStep Frequency Control> on cpu1
est: CPU supports Enhanced Speedstep, but is not recognized.
est: cpu_vendor GenuineIntel, msr 1d4d00001800
device_attach: est1 attach returned 6
est2: <Enhanced SpeedStep Frequency Control> on cpu2
est: CPU supports Enhanced Speedstep, but is not recognized.
est: cpu_vendor GenuineIntel, msr 1d4d00001800
device_attach: est2 attach returned 6
est3: <Enhanced SpeedStep Frequency Control> on cpu3
est: CPU supports Enhanced Speedstep, but is not recognized.
est: cpu_vendor GenuineIntel, msr 1d4d00001800
device_attach: est3 attach returned 6
ZFS filesystem version: 5
ZFS storage pool version: features support (5000)
Timecounters tick every 1.000 msec
random: unblocking device.
usbus0: 480Mbps High Speed USB v2.0
usbus1: 480Mbps High Speed USB v2.0
ugen0.1: <Intel> at usbus0
uhub0: <Intel EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus0
ugen1.1: <Intel> at usbus1
uhub1: <Intel EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus1
ses0 at ahciem0 bus 0 scbus5 target 0 lun 0
ses0: <AHCI SGPIO Enclosure 1.00 0001> SEMB S-E-S 2.00 device
ses0: SEMB SES Device
ada0 at ahcich0 bus 0 scbus0 target 0 lun 0
ada0: <WDC WD5003ABYX-18WERA0 01.01S04> ATA8-ACS SATA 2.x device
ada0: Serial Number WD-WMAYP8034312
ada0: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
ada0: Command Queueing enabled
ada0: 476940MB (976773168 512 byte sectors)
ada0: Previously was known as ad4
ada1 at ahcich1 bus 0 scbus1 target 0 lun 0
ada1: <WDC WD10EFRX-68PJCN0 82.00A82> ACS-2 ATA SATA 3.x device
ada1: Serial Number WD-WCC4JDU1EVHN
ada1: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
ada1: Command Queueing enabled
ada1: 953869MB (1953525168 512 byte sectors)
ada1: quirks=0x1<4K>
ada1: Previously was known as ad6
cd0 at ahcich4 bus 0 scbus4 target 0 lun 0
cd0: <TSSTcorp DVD-ROM SN-108FB D150> Removable CD-ROM SCSI device
cd0: Serial Number S1596YBF3001M9
cd0: 150.000MB/s transfers (SATA 1.x, UDMA5, ATAPI 12bytes, PIO 8192bytes)
cd0: Attempt to query device size failed: NOT READY, Medium not present - tray closed
SMP: AP CPU #2 Launched!
SMP: AP CPU #1 Launched!
SMP: AP CPU #3 Launched!
Timecounter "TSC-low" frequency 1200028244 Hz quality 1000
GEOM_MIRROR: Device mirror/swap launched (2/2).
Root mount waiting for: usbus1 usbus0
uhub1: 2 ports with 2 removable, self powered
uhub0: 2 ports with 2 removable, self powered
Root mount waiting for: usbus1 usbus0
ugen1.2: <vendor 0x8087> at usbus1
uhub2: <vendor 0x8087 product 0x0024, class 9/0, rev 2.00/0.00, addr 2> on usbus1
ugen0.2: <vendor 0x8087> at usbus0
uhub3: <vendor 0x8087 product 0x0024, class 9/0, rev 2.00/0.00, addr 2> on usbus0
Root mount waiting for: usbus1 usbus0
uhub3: 6 ports with 6 removable, self powered
uhub2: 8 ports with 8 removable, self powered
Root mount waiting for: usbus1 usbus0
ugen0.3: <no manufacturer> at usbus0
uhub4: <no manufacturer Gadget USB HUB, class 9/0, rev 2.00/0.00, addr 3> on usbus0
ugen1.3: <vendor 0x0557> at usbus1
uhub5: <vendor 0x0557 product 0x8021, class 9/0, rev 1.10/1.00, addr 3> on usbus1
uhub5: 4 ports with 4 removable, self powered
uhub4: 6 ports with 6 removable, self powered
Root mount waiting for: usbus1 usbus0
ugen0.4: <Avocent> at usbus0
ukbd0: <Keyboard> on usbus0
kbd0 at ukbd0
ugen1.4: <ATEN> at usbus1
ukbd1: <Kb> on usbus1
kbd2 at ukbd1
Trying to mount root from zfs:zroot/ROOT/default []...

-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB
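
Since the core.txt.N files in /var/crash/ do not all look the same, it may help to line up the panic string, the current process and the KDB backtrace from each one. The following is only a minimal sketch, assuming every file contains a "panic:" line, a "current process" line and backtrace frames formatted like the core.txt.4 excerpt quoted in this mail. The function names are hypothetical and not part of any FreeBSD tool.

```python
import re
import sys

# Matches KDB frames such as "#7 0xffffffff809f9b23 at vfs_msync+0x203".
FRAME_RE = re.compile(
    r"#\d+\s+0x[0-9a-fA-F]+\s+at\s+([A-Za-z_]\w*)\+0x[0-9a-fA-F]+"
)
# Matches "panic: page fault" at the start of a line.
PANIC_RE = re.compile(r"^panic:\s*(.+)$", re.MULTILINE)
# Matches "current process = 18 (syncer)" with any whitespace around "=".
PROC_RE = re.compile(r"current process\s*=\s*\d+\s*\(([^)]*)\)")


def summarize_core_txt(text):
    """Extract panic string, current process and KDB frame names."""
    panic = PANIC_RE.search(text)
    proc = PROC_RE.search(text)
    return {
        "panic": panic.group(1).strip() if panic else None,
        "process": proc.group(1) if proc else None,
        "frames": [m.group(1) for m in FRAME_RE.finditer(text)],
    }


def first_frame_below_trap(frames):
    """Return the frame listed right after calltrap, or None.

    Assumption: that frame is the code running when the trap was taken,
    which is a useful first key for grouping similar panics.
    """
    if "calltrap" in frames:
        idx = frames.index("calltrap")
        if idx + 1 < len(frames):
            return frames[idx + 1]
    return None


if __name__ == "__main__":
    # Usage: python3 summarize.py /var/crash/core.txt.*
    for path in sys.argv[1:]:
        with open(path, errors="replace") as fh:
            info = summarize_core_txt(fh.read())
        print(path, info["panic"], info["process"],
              first_frame_below_trap(info["frames"]), sep="\t")
```

Run over all seven dumps, this prints one line per file, which makes it easier to see whether the panics share the same process and call path or differ each time.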