Date: Mon, 7 Jan 2013 19:11:48 +0100
From: Rasmus Skaarup <freebsd@gal.dk>
To: John Baldwin <jhb@freebsd.org>
Cc: freebsd-gnats-submit@freebsd.org, freebsd-amd64@freebsd.org
Subject: Re: amd64/175091: Crash: Fatal trap 12: page fault while in kernel mode
Message-ID: <AB7C0328-3764-480E-ACB0-FEAECAD9E200@gal.dk>
In-Reply-To: <201301070932.37097.jhb@freebsd.org>
References: <201301070805.r0785IeP031201@red.freebsd.org> <201301070932.37097.jhb@freebsd.org>

Thank you for the quick response.

I enabled the setting in rc.conf as you mentioned, and the machine has crashed twice since. The two dumps are uploaded here:

http://gal.dk/crash0.tar.gz
http://gal.dk/crash1.tar.gz

gdb output for the original error:

(gdb) l *vm_fault_hold+0x1b13
0xffffffff80b41133 is in vm_fault_hold (/usr/src/sys/vm/vm_fault.c:936).
931              * because pmap_enter() may sleep.  We don't put the page
932              * back on the active queue until later so that the pageout daemon
933              * won't find it (yet).
934              */
935             pmap_enter(fs.map->pmap, vaddr, fault_type, fs.m, prot, wired);
936             if ((fault_flags & VM_FAULT_CHANGE_WIRING) == 0 && wired == 0)
937                     vm_fault_prefault(fs.map->pmap, vaddr, fs.entry);
938             VM_OBJECT_LOCK(fs.object);
939             vm_page_lock(fs.m);
940
(gdb)

The two other crashes had different excuses. Here is the first:

Fatal trap 9: general protection fault while in kernel mode
cpuid = 3; apic id = 03
instruction pointer = 0x20:0xffffffff81612ace
stack pointer = 0x28:0xffffff816230a4c0
frame pointer = 0x28:0xffffff816230a4e0
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 18778 (imapd)
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 18778 (imapd)
trap number = 9
panic: general protection fault
cpuid = 1
KDB: stack backtrace:
#0 0xffffffff809208a6 at kdb_backtrace+0x66
#1 0xffffffff808ea8be at panic+0x1ce
#2 0xffffffff80bd8240 at trap_fatal+0x290
#3 0xffffffff80bd88d5 at trap+0x105
#4 0xffffffff80bc315f at calltrap+0x8
#5 0xffffffff8164915d at dnode_free_range+0x29d
#6 0xffffffff81639d5f at dmu_free_long_range_impl+0x13f
#7 0xffffffff81639f9c at dmu_free_long_range+0x4c
#8 0xffffffff816a7839 at zfs_rmnode+0x69
#9 0xffffffff816be9b6 at zfs_inactive+0x66
#10 0xffffffff816beb7a at zfs_freebsd_inactive+0x1a
#11 0xffffffff8097f61d at vinactive+0x8d
#12 0xffffffff80982de8 at vputx+0x2d8
#13 0xffffffff80986f4f at kern_unlinkat+0x1df
#14 0xffffffff80bd7ae6 at amd64_syscall+0x546
#15 0xffffffff80bc3447 at Xfast_syscall+0xf7
Uptime: 1h17m3s

(gdb) l *trap_fatal+0x290
0xffffffff80bd8240 is in trap_fatal (/usr/src/sys/amd64/amd64/trap.c:852).
847                     printf("Idle\n");
848             }
849
850     #ifdef KDB
851             if (debugger_on_panic || kdb_active)
852                     if (kdb_trap(type, 0, frame))
853                             return;
854     #endif
855             printf("trap number = %d\n", type);
856             if (type <= MAX_TRAP_MSG)
(gdb)

The other new crash:

Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 01
fault virtual address = 0x0
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff80bcf1fb
stack pointer = 0x28:0xffffff8161fef950
frame pointer = 0x28:0xffffff8161fef990
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 75787 (httpd)
trap number = 12
panic: page fault
cpuid = 1
KDB: stack backtrace:
#0 0xffffffff809208a6 at kdb_backtrace+0x66
#1 0xffffffff808ea8be at panic+0x1ce
#2 0xffffffff80bd8240 at trap_fatal+0x290
#3 0xffffffff80bd857d at trap_pfault+0x1ed
#4 0xffffffff80bd8b9e at trap+0x3ce
#5 0xffffffff80bc315f at calltrap+0x8
#6 0xffffffff80bcf290 at pmap_is_modified+0x40
#7 0xffffffff80b52f7e at vm_page_dontneed+0x17e
#8 0xffffffff80b4f0cd at vm_object_madvise+0x4dd
#9 0xffffffff80b49beb at vm_map_madvise+0x1bb
#10 0xffffffff80b4bff1 at sys_madvise+0x91
#11 0xffffffff80bd7ae6 at amd64_syscall+0x546
#12 0xffffffff80bc3447 at Xfast_syscall+0xf7
Uptime: 5h5m23s

(gdb) l *pmap_is_modified+0x40
0xffffffff80bcf290 is in pmap_is_modified (/usr/src/sys/amd64/amd64/pmap.c:4264).
4259            VM_OBJECT_LOCK_ASSERT(m->object, MA_OWNED);
4260            if ((m->oflags & VPO_BUSY) == 0 &&
4261                (m->aflags & PGA_WRITEABLE) == 0)
4262                    return (FALSE);
4263            rw_wlock(&pvh_global_lock);
4264            rv = pmap_is_modified_pvh(&m->md) ||
4265                ((m->flags & PG_FICTITIOUS) == 0 &&
4266                pmap_is_modified_pvh(pa_to_pvh(VM_PAGE_TO_PHYS(m))));
4267            rw_wunlock(&pvh_global_lock);
4268            return (rv);

(gdb) l *trap_pfault+0x1ed
0xffffffff80bd857d is in trap_pfault (/usr/src/sys/amd64/amd64/trap.c:773).
768                     if (td->td_intr_nesting_level == 0 &&
769                         PCPU_GET(curpcb)->pcb_onfault != NULL) {
770                             frame->tf_rip = (long)PCPU_GET(curpcb)->pcb_onfault;
771                             return (0);
772                     }
773                     trap_fatal(frame, eva);
774                     return (-1);
775             }
776
777             return((rv == KERN_PROTECTION_FAILURE) ? SIGBUS : SIGSEGV);
(gdb)

(Not sure I'm gdb'ing what you need, but let me know.)

I am beginning to suspect the hardware, but the strange thing is that the host (CentOS 6.3) and the other virtual machine work completely fine. And the other virtual machine has plenty of users on it.

Best regards,
Rasmus Skaarup

On 07/01/2013, at 15.32, John Baldwin <jhb@freebsd.org> wrote:

> On Monday, January 07, 2013 03:05:18 AM Rasmus Skaarup wrote:
>>> Number: 175091
>>> Category: amd64
>>> Synopsis: Crash: Fatal trap 12: page fault while in kernel mode
>>> Confidential: no
>>> Severity: non-critical
>>> Priority: low
>>> Responsible: freebsd-amd64
>>> State: open
>>> Quarter:
>>> Keywords:
>>> Date-Required:
>>> Class: sw-bug
>>> Submitter-Id: current-users
>>> Arrival-Date: Mon Jan 07 08:10:01 UTC 2013
>>> Closed-Date:
>>> Last-Modified:
>>> Originator: Rasmus Skaarup
>>> Release: 9.1-RELEASE
>>> Organization:
>>
>>> Environment:
>> FreeBSD thirdhost 9.1-RELEASE FreeBSD 9.1-RELEASE #0 r243825: Tue Dec 4
>> 09:23:10 UTC 2012
>> root@farrell.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC amd64
>>
>>> Description:
>> One of my virtualized FreeBSD machines has been panic'ing two times within
>> the last two weeks. After the first panic I ran freebsd-update and
>> upgraded to 9.1-RELEASE successfully. Today the machine panic'ed again.
>>
>> I have another virtualized FreeBSD machine running on the same host, and it
>> does not exhibit this behaviour.
>>
>> Here is the output from dmesg, after reboot:
>>
>> ****
>> Fatal trap 12: page fault while in kernel mode
>> cpuid = 2; apic id = 02
>> fault virtual address = 0x48
>> fault code = supervisor read data, page not present
>> instruction pointer = 0x20:0xffffffff80bd5139
>> stack pointer = 0x28:0xffffff81625536c0
>> frame pointer = 0x28:0xffffff8162553750
>> code segment = base 0x0, limit 0xfffff, type 0x1b
>> = DPL 0, pres 1, long 1, def32 0, gran 1
>> processor eflags = interrupt enabled, resume, IOPL = 0
>> current process = 62083 (httpd)
>> trap number = 12
>> panic: page fault
>> cpuid = 2
>> KDB: stack backtrace:
>> #0 0xffffffff809208a6 at kdb_backtrace+0x66
>> #1 0xffffffff808ea8be at panic+0x1ce
>> #2 0xffffffff80bd8240 at trap_fatal+0x290
>> #3 0xffffffff80bd857d at trap_pfault+0x1ed
>> #4 0xffffffff80bd8b9e at trap+0x3ce
>> #5 0xffffffff80bc315f at calltrap+0x8
>> #6 0xffffffff80b41133 at vm_fault_hold+0x1b13
>> #7 0xffffffff80b41cc3 at vm_fault+0x73
>> #8 0xffffffff80bd84b4 at trap_pfault+0x124
>> #9 0xffffffff80bd8c6c at trap+0x49c
>> #10 0xffffffff80bc315f at calltrap+0x8
>> Uptime: 13h6m22s
>> *********
>
> Can you enable crashdumps by setting 'dumpdev="AUTO"' in /etc/rc.conf?
>
> Also, can you run 'gdb /boot/kernel/kernel' and then at the prompt run
> 'l *vm_fault_hold+0x1b13' and reply with the output?
>
> --
> John Baldwin
