Date: Mon, 29 May 2017 11:20:43 +0200
From: Raimo Niskanen <raimo+freebsd@erix.ericsson.se>
To: <freebsd-questions@freebsd.org>
Subject: Advice on kernel panics
Message-ID: <20170529092043.GA89682@erix.ericsson.se>
Hello list.

I have a server that panics about every 3 days and need some advice
on how to handle that.

It currently has 7 dumps in /var/crash/. The head of the latest one,
core.txt.4, looks like this:
=======
sasquatch.otp.ericsson.se dumped core - see /var/crash/vmcore.4

Mon May 29 03:15:32 CEST 2017

FreeBSD sasquatch.otp.ericsson.se 10.3-RELEASE-p18 FreeBSD 10.3-RELEASE-p18 #0: Tue Apr 11 10:31:00 UTC 2017 root@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC amd64

panic: page fault

GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...

Unread portion of the kernel message buffer:

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x0
fault code              = supervisor write data, page not present
instruction pointer     = 0x20:0xffffffff809fb017
stack pointer           = 0x28:0xfffffe04673a18c0
frame pointer           = 0x28:0xfffffe04673a1900
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 18 (syncer)
trap number             = 12
panic: page fault
cpuid = 0
KDB: stack backtrace:
#0 0xffffffff8098e7e0 at kdb_backtrace+0x60
#1 0xffffffff809514b6 at vpanic+0x126
#2 0xffffffff80951383 at panic+0x43
#3 0xffffffff80d5646b at trap_fatal+0x36b
#4 0xffffffff80d5676d at trap_pfault+0x2ed
#5 0xffffffff80d55dea at trap+0x47a
#6 0xffffffff80d3bdb2 at calltrap+0x8
#7 0xffffffff809f9b23 at vfs_msync+0x203
#8 0xffffffff809fb858 at sync_fsync+0x108
#9 0xffffffff80e81ed7 at VOP_FSYNC_APV+0xa7
#10 0xffffffff809fc27b at sched_sync+0x3ab
#11 0xffffffff8091a93a at fork_exit+0x9a
#12 0xffffffff80d3c2ee at fork_trampoline+0xe
Uptime: 2d19h53m15s
=======

What sticks out later in core.txt.4 is the fstat section, which contains
a lot of errors, but I cannot tell whether that is just a secondary
symptom. It looks like this:
=======
fstat

fstat: can't read file 1 at 0x200007fffffffff
fstat: can't read file 2 at 0x4000000001fffff
fstat: can't read znode_phys at 0x1
fstat: can't read znode_phys at 0x1
fstat: can't read znode_phys at 0x1
:
USER     CMD          PID   FD MOUNT      INUM MODE         SZ|DV R/W
root     sed        78401 root -             -       error    -
root     sed        78401   wd -             -       error    -
root     sed        78401 text -             -       error    -
root     sed        78401   0* pipe fffff8001800f000 <-> fffff8001800f160      0 rw
root     grep       78400 root -             -       error    -
root     grep       78400   wd -             -       error    -
root     grep       78400 text -             -       error    -
:
=======

To me, the other core.txt.? files do not look exactly the same.
All of them have an fstat section with many errors, though.

Does anyone have some advice on how to proceed?
-- 

/ Raimo Niskanen, Erlang/OTP, Ericsson AB