Date: Wed, 14 Oct 2015 18:10:34 +0200
From: Frank Razenberg <frank@zzattack.org>
To: Konstantin Belousov <kostikbel@gmail.com>, freebsd-stable@freebsd.org
Subject: Re: 10.2-STABLE amd64 panic: page fault while in kernel mode
Message-ID: <561E7E7A.1080600@zzattack.org>
In-Reply-To: <20151014144217.GV2257@kib.kiev.ua>
References: <561E5E2F.90404@zzattack.org> <20151014144217.GV2257@kib.kiev.ua>
Thanks for looking into this.

On 10/14/2015 4:42 PM, Konstantin Belousov wrote:
> On Wed, Oct 14, 2015 at 03:52:47PM +0200, Frank Razenberg wrote:
>> After upgrading from 9.2 to 10.1 I first started noticing panics. They
>> occurred roughly weekly and since this storage machine isn't frequently
>> used I didn't look into it much further. After updating for 10.2-STABLE
>> the panics have gone from weekly to daily.
>> The machine has 32GB of non-registered ECC DDR3-1066 RAM. There's also a
>> 10-disk raidz2 pool. I've ran memtest86+ for 72 hours straight with no
>> errors.
>>
>> Crash dumps all feature the following:
>>
>> Fatal trap 12: page fault while in kernel mode
>> cpuid = 2; apic id = 12
>> fault virtual address = 0x1d1c0bec0
>> fault code = supervisor read data, page not present
>> instruction pointer = 0x20:0xffffffff804fda65
>> stack pointer = 0x28:0xfffffe0698f21870
>> frame pointer = 0x28:0xfffffe0698f218d0
>> code segment = base 0x0, limit 0xfffff, type 0x1b
>> = DPL 0, pres 1, long 1, def32 0, gran 1
>> processor eflags = interrupt enabled, resume, IOPL = 0
>> current process = 6106 (pickup)
>> trap number = 12
>> panic: page fault
>> cpuid = 2
>>
>>
>> (kgdb) bt
>> #0 doadump (textdump=<value optimized out>) at pcpu.h:219
>> #1 0xffffffff8053ce32 in kern_reboot (howto=260) at
>> /usr/src/sys/kern/kern_shutdown.c:455
>> #2 0xffffffff8053d215 in vpanic (fmt=<value optimized out>, ap=<value
>> optimized out>) at /usr/src/sys/kern/kern_shutdown.c:762
>> #3 0xffffffff8053d0a3 in panic (fmt=0x0) at
>> /usr/src/sys/kern/kern_shutdown.c:691
>> #4 0xffffffff807755db in trap_fatal (frame=<value optimized out>,
>> eva=<value optimized out>) at /usr/src/sys/amd64/amd64/trap.c:851
>> #5 0xffffffff807758dd in trap_pfault (frame=0xfffffe0698dbc7c0,
>> usermode=<value optimized out>) at /usr/src/sys/amd64/amd64/trap.c:674
>> #6 0xffffffff80774f7a in trap (frame=0xfffffe0698dbc7c0) at
>> /usr/src/sys/amd64/amd64/trap.c:440
>> #7 0xffffffff8075b0f2 in calltrap () at
>> /usr/src/sys/amd64/amd64/exception.S:236
>> #8 0xffffffff804fda65 in kqueue_close (fp=0xfffff803e4967190,
>> td=0xfffff80014b094a0) at /usr/src/sys/kern/kern_event.c:1750
>> #9 0xffffffff804f25f9 in _fdrop (fp=0xfffff803e4967190,
>> td=0xfffff802b5d2a000) at file.h:343
>> #10 0xffffffff804f4e9e in closef (fp=<value optimized out>, td=<value
>> optimized out>) at /usr/src/sys/kern/kern_descrip.c:2338
>> #11 0xffffffff804f4ab9 in fdescfree (td=0xfffff80014b094a0) at
>> /usr/src/sys/kern/kern_descrip.c:2106
>> #12 0xffffffff805013a9 in exit1 (td=0xfffff80014b094a0, rv=<value
>> optimized out>) at /usr/src/sys/kern/kern_exit.c:369
>> #13 0xffffffff80500e3e in sys_sys_exit (td=0xfffffe000782e060,
>> uap=<value optimized out>) at /usr/src/sys/kern/kern_exit.c:179
>> #14 0xffffffff80775efd in amd64_syscall (td=0xfffff80014b094a0,
>> traced=0) at subr_syscall.c:134
>> #15 0xffffffff8075b3db in Xfast_syscall () at
>> /usr/src/sys/amd64/amd64/exception.S:396
>> #16 0x000000080120335a in ?? ()
>>
>> Most of the dumps list 'pickup' as current process. All of them have
>> 'kqueue_close' in the backtrace.
>> I'm not sure what the next step in diagnosing the issue is. Any pointers
>> would be greatly appreciated.
> What is exact revision of the checkout you run, where the panic above
> occurs ?

Not entirely sure. Can I still find out if I've updated my source tree
since? It's not in uname -a, but matching the dates it should be around
~289032. Want me to update to HEAD and do the steps below on that instead?

>
> Please load the kernel.debug + vmcore into kgdb, go to frame 8, and do
> p *kq
> p *kn
> p i
> p kq->kq_knlist[i].slh_first
> p *(kq->kq_knlist[i].slh_first)

#8 0xffffffff804fda65 in kqueue_close (fp=0xfffff801dd94b1e0,
td=0xfffff80015bbc000) at /usr/src/sys/kern/kern_event.c:1750
1750 kn->kn_fop->f_detach(kn);
(kgdb) p *kq
$1 = {kq_lock = {lock_object = {lo_name = 0xffffffff80829725 "kqueue",
lo_flags = 21168128, lo_data = 0, lo_witness = 0x0}, mtx_lock = 4},
kq_refcnt = 1, kq_list = { tqe_next = 0xfffff8015f29fc00,
tqe_prev = 0xfffff8000c749860}, kq_head = {tqh_first = 0x0,
tqh_last = 0xfffff801dd33a038}, kq_count = 0, kq_sel = {si_tdlist =
{tqh_first = 0x0, tqh_last = 0x0}, si_note = {kl_list = {slh_first = 0x0},
kl_lock = 0xffffffff804fc560 <knlist_mtx_lock>,
kl_unlock = 0xffffffff804fc5a0 <knlist_mtx_unlock>,
kl_assert_locked = 0xffffffff804fc5e0 <knlist_mtx_assert_locked>,
kl_assert_unlocked = 0xffffffff804fc5f0 <knlist_mtx_assert_unlocked>,
kl_lockarg = 0xfffff801dd33a000}, si_mtx = 0x0}, kq_sigio = 0x0,
kq_fdp = 0xfffff8000c749800, kq_state = 16, kq_knlistsize = 256,
kq_knlist = 0xfffff8000c7a8800, kq_knhashmask = 0, kq_knhash = 0x0,
kq_task = { ta_link = {stqe_next = 0x0}, ta_pending = 0, ta_priority = 0,
ta_func = 0xffffffff804faeb0 <kqueue_task>,
ta_context = 0xfffff801dd33a000}}
(kgdb) p *kn
No symbol "kn" in current context.
(kgdb) p i
No symbol "i" in current context.