Date: Thu, 22 Sep 2016 16:01:14 +0300 From: Konstantin Belousov <kostikbel@gmail.com> To: Ben Woods <woodsb02@gmail.com> Cc: FreeBSD Current <freebsd-current@freebsd.org> Subject: Re: Kernel crashes from turnstile_broadcast (/usr/src/sys/kern/subr_turnstile.c:837) Message-ID: <20160922130114.GT38409@kib.kiev.ua> In-Reply-To: <CAOc73CDZnPy1JJ8DBJyANsYqPkV35__ojfE%2BFbfbT3hTZw7MgA@mail.gmail.com> References: <CAOc73CDZnPy1JJ8DBJyANsYqPkV35__ojfE%2BFbfbT3hTZw7MgA@mail.gmail.com>
next in thread | previous in thread | raw e-mail | index | archive | help
On Thu, Sep 22, 2016 at 08:48:29PM +0800, Ben Woods wrote: > Hi everyone, > > I am currently experiencing semi-regular kernel crashes on my FreeBSD > 12-current machine. I am new to kernel debugging, and hoping someone can > have a look at the debugging output below to point me in the direction of > what the problem might be. > > My machine is a FreeNAS-mini from iXsystems which I have formatted and > installed stock FreeBSD onto. My kernel is the default generic-nodebug with > the VIMAGE options added. > > $ uname -a > FreeBSD freenas.woods.am 12.0-CURRENT FreeBSD 12.0-CURRENT #0 r305311M: Sat > Sep 3 12:29:01 AWST 2016 > woodsb02@freenas.woods.am:/usr/obj/usr/src/sys/GENERIC-NODEBUG-VIMAGE > amd64 > > $ cat /usr/src/sys/amd64/conf/GENERIC-NODEBUG-VIMAGE > # SPARTICUS -- WITNESS and INVARIANTS free kernel configuration file > # for FreeBSD/amd64 > > include GENERIC-NODEBUG > > ident GENERIC-NODEBUG-VIMAGE > > #nooptions SCTP # Stream Control Transmission Protocol > options VIMAGE # VNET/Vimage support > > > Output from kernel crash dump debug with kgdb below: > > /usr/obj/usr/src/sys/GENERIC-NODEBUG-VIMAGE)# kgdb kernel.debug > /var/crash/vmcore.last > GNU gdb 6.1.1 [FreeBSD] > Copyright 2004 Free Software Foundation, Inc. > GDB is free software, covered by the GNU General Public License, and you are > welcome to change it and/or distribute copies of it under certain > conditions. > Type "show copying" to see the conditions. > There is absolutely no warranty for GDB. Type "show warranty" for details. > This GDB was configured as "amd64-marcel-freebsd"... > > Unread portion of the kernel message buffer: > kernel trap 12 with interrupts disabled > > > Fatal trap 12: page fault while in kernel mode > cpuid = 2; apic id = 04 > fault virtual address = 0x30 > fault code = supervisor read data, page not present > instruction pointer = 0x20:0xffffffff80b4d91c > stack pointer = 0x28:0xfffffe046813a440 > frame pointer = 0x28:0xfffffe046813a470 > code segment = base 0x0, limit 0xfffff, type 0x1b > = DPL 0, pres 1, long 1, def32 0, gran 1 > processor eflags = resume, IOPL = 0 > current process = 33487 (sh) > Uptime: 15m16s > Dumping 1664 out of 16338 > MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91% > > Reading symbols from /boot/kernel.GENERIC-NODEBUG-VIMAGE/zfs.ko...Reading > symbols from > /usr/lib/debug//boot/kernel.GENERIC-NODEBUG-VIMAGE/zfs.ko.debug...done. > done. > Loaded symbols for /boot/kernel.GENERIC-NODEBUG-VIMAGE/zfs.ko > Reading symbols from > /boot/kernel.GENERIC-NODEBUG-VIMAGE/opensolaris.ko...Reading symbols from > /usr/lib/debug//boot/kernel.GENERIC-NODEBUG-VIMAGE/opensolaris.ko.debug... > done. > done. > Loaded symbols for /boot/kernel.GENERIC-NODEBUG-VIMAGE/opensolaris.ko > Reading symbols from > /boot/kernel.GENERIC-NODEBUG-VIMAGE/geom_eli.ko...Reading symbols from > /usr/lib/debug//boot/kernel.GENERIC-NODEBUG-VIMAGE/geom_eli.ko.debug...done. > done. > Loaded symbols for /boot/kernel.GENERIC-NODEBUG-VIMAGE/geom_eli.ko > Reading symbols from > /boot/kernel.GENERIC-NODEBUG-VIMAGE/accf_http.ko...Reading symbols from > /usr/lib/debug//boot/kernel.GENERIC-NODEBUG-VIMAGE/accf_http.ko.debug...done. > done. > Loaded symbols for /boot/kernel.GENERIC-NODEBUG-VIMAGE/accf_http.ko > Reading symbols from > /boot/kernel.GENERIC-NODEBUG-VIMAGE/coretemp.ko...Reading symbols from > /usr/lib/debug//boot/kernel.GENERIC-NODEBUG-VIMAGE/coretemp.ko.debug...done. > done. > Loaded symbols for /boot/kernel.GENERIC-NODEBUG-VIMAGE/coretemp.ko > Reading symbols from /boot/kernel.GENERIC-NODEBUG-VIMAGE/aesni.ko...Reading > symbols from > /usr/lib/debug//boot/kernel.GENERIC-NODEBUG-VIMAGE/aesni.ko.debug...done. > done. > Loaded symbols for /boot/kernel.GENERIC-NODEBUG-VIMAGE/aesni.ko > Reading symbols from > /boot/kernel.GENERIC-NODEBUG-VIMAGE/if_bridge.ko...Reading symbols from > /usr/lib/debug//boot/kernel.GENERIC-NODEBUG-VIMAGE/if_bridge.ko.debug...done. > done. > Loaded symbols for /boot/kernel.GENERIC-NODEBUG-VIMAGE/if_bridge.ko > Reading symbols from > /boot/kernel.GENERIC-NODEBUG-VIMAGE/bridgestp.ko...Reading symbols from > /usr/lib/debug//boot/kernel.GENERIC-NODEBUG-VIMAGE/bridgestp.ko.debug...done. > done. > Loaded symbols for /boot/kernel.GENERIC-NODEBUG-VIMAGE/bridgestp.ko > Reading symbols from /boot/kernel.GENERIC-NODEBUG-VIMAGE/ums.ko...Reading > symbols from > /usr/lib/debug//boot/kernel.GENERIC-NODEBUG-VIMAGE/ums.ko.debug...done. > done. > Loaded symbols for /boot/kernel.GENERIC-NODEBUG-VIMAGE/ums.ko > Reading symbols from > /boot/kernel.GENERIC-NODEBUG-VIMAGE/netgraph.ko...Reading symbols from > /usr/lib/debug//boot/kernel.GENERIC-NODEBUG-VIMAGE/netgraph.ko.debug...done. > done. > Loaded symbols for /boot/kernel.GENERIC-NODEBUG-VIMAGE/netgraph.ko > Reading symbols from > /boot/kernel.GENERIC-NODEBUG-VIMAGE/ng_netflow.ko...Reading symbols from > /usr/lib/debug//boot/kernel.GENERIC-NODEBUG-VIMAGE/ng_netflow.ko.debug...done. > done. > Loaded symbols for /boot/kernel.GENERIC-NODEBUG-VIMAGE/ng_netflow.ko > Reading symbols from > /boot/kernel.GENERIC-NODEBUG-VIMAGE/ng_ksocket.ko...Reading symbols from > /usr/lib/debug//boot/kernel.GENERIC-NODEBUG-VIMAGE/ng_ksocket.ko.debug...done. > done. > Loaded symbols for /boot/kernel.GENERIC-NODEBUG-VIMAGE/ng_ksocket.ko > Reading symbols from > /boot/kernel.GENERIC-NODEBUG-VIMAGE/ng_ether.ko...Reading symbols from > /usr/lib/debug//boot/kernel.GENERIC-NODEBUG-VIMAGE/ng_ether.ko.debug...done. > done. > Loaded symbols for /boot/kernel.GENERIC-NODEBUG-VIMAGE/ng_ether.ko > Reading symbols from > /boot/kernel.GENERIC-NODEBUG-VIMAGE/ng_socket.ko...Reading symbols from > /usr/lib/debug//boot/kernel.GENERIC-NODEBUG-VIMAGE/ng_socket.ko.debug...done. > done. > Loaded symbols for /boot/kernel.GENERIC-NODEBUG-VIMAGE/ng_socket.ko > Reading symbols from /boot/kernel.GENERIC-NODEBUG-VIMAGE/linux.ko...Reading > symbols from > /usr/lib/debug//boot/kernel.GENERIC-NODEBUG-VIMAGE/linux.ko.debug...done. > done. > Loaded symbols for /boot/kernel.GENERIC-NODEBUG-VIMAGE/linux.ko > Reading symbols from > /boot/kernel.GENERIC-NODEBUG-VIMAGE/linux_common.ko...Reading symbols from > /usr/lib/debug//boot/kernel.GENERIC-NODEBUG-VIMAGE/linux_common.ko.debug...done. > done. > Loaded symbols for /boot/kernel.GENERIC-NODEBUG-VIMAGE/linux_common.ko > Reading symbols from > /boot/kernel.GENERIC-NODEBUG-VIMAGE/linux64.ko...Reading symbols from > /usr/lib/debug//boot/kernel.GENERIC-NODEBUG-VIMAGE/linux64.ko.debug...done. > done. > Loaded symbols for /boot/kernel.GENERIC-NODEBUG-VIMAGE/linux64.ko > Reading symbols from > /boot/kernel.GENERIC-NODEBUG-VIMAGE/fdescfs.ko...Reading symbols from > /usr/lib/debug//boot/kernel.GENERIC-NODEBUG-VIMAGE/fdescfs.ko.debug...done. > done. > Loaded symbols for /boot/kernel.GENERIC-NODEBUG-VIMAGE/fdescfs.ko > Reading symbols from > /boot/kernel.GENERIC-NODEBUG-VIMAGE/if_epair.ko...Reading symbols from > /usr/lib/debug//boot/kernel.GENERIC-NODEBUG-VIMAGE/if_epair.ko.debug...done. > done. > Loaded symbols for /boot/kernel.GENERIC-NODEBUG-VIMAGE/if_epair.ko > Reading symbols from > /boot/kernel.GENERIC-NODEBUG-VIMAGE/nullfs.ko...Reading symbols from > /usr/lib/debug//boot/kernel.GENERIC-NODEBUG-VIMAGE/nullfs.ko.debug...done. > done. > Loaded symbols for /boot/kernel.GENERIC-NODEBUG-VIMAGE/nullfs.ko > Reading symbols from /boot/kernel.GENERIC-NODEBUG-VIMAGE/tmpfs.ko...Reading > symbols from > /usr/lib/debug//boot/kernel.GENERIC-NODEBUG-VIMAGE/tmpfs.ko.debug...done. > done. > Loaded symbols for /boot/kernel.GENERIC-NODEBUG-VIMAGE/tmpfs.ko > #0 doadump (textdump=1) at pcpu.h:221 > 221 __asm("movq %%gs:%1,%0" : "=r" (td) > (kgdb) list *0xffffffff80b4d91c > 0xffffffff80b4d91c is in turnstile_broadcast > (/usr/src/sys/kern/subr_turnstile.c:837). > 832 > 833 /* > 834 * Transfer the blocked list to the pending list. > 835 */ > 836 mtx_lock_spin(&td_contested_lock); > 837 TAILQ_CONCAT(&ts->ts_pending, &ts->ts_blocked[queue], > td_lockq); > 838 mtx_unlock_spin(&td_contested_lock); > 839 > 840 /* > 841 * Give a turnstile to each thread. The last thread gets > Current language: auto; currently minimal > (kgdb) backtrace > #0 doadump (textdump=1) at pcpu.h:221 > #1 0xffffffff80aea40e in kern_reboot (howto=260) at > /usr/src/sys/kern/kern_shutdown.c:366 > #2 0xffffffff80aea9db in vpanic (fmt=<value optimized out>, ap=<value > optimized out>) at /usr/src/sys/kern/kern_shutdown.c:759 > #3 0xffffffff80aea813 in panic (fmt=0x0) at > /usr/src/sys/kern/kern_shutdown.c:690 > #4 0xffffffff8039e197 in db_panic (addr=<value optimized out>, > have_addr=false, count=0, modif=0x0) at /usr/src/sys/ddb/db_command.c:486 > #5 0xffffffff8039d689 in db_command (cmd_table=<value optimized out>) at > /usr/src/sys/ddb/db_command.c:453 > #6 0xffffffff8039d3e4 in db_command_loop () at > /usr/src/sys/ddb/db_command.c:506 > #7 0xffffffff803a053b in db_trap (type=<value optimized out>, code=<value > optimized out>) at /usr/src/sys/ddb/db_main.c:251 > #8 0xffffffff80b36b33 in kdb_trap (type=<value optimized out>, code=<value > optimized out>, tf=<value optimized out>) at > /usr/src/sys/kern/subr_kdb.c:654 > #9 0xffffffff80fdd441 in trap_fatal (frame=0xfffffe046813a390, eva=48) at > /usr/src/sys/amd64/amd64/trap.c:836 > #10 0xffffffff80fdd673 in trap_pfault (frame=0xfffffe046813a390, > usermode=0) at /usr/src/sys/amd64/amd64/trap.c:691 > #11 0xffffffff80fdcbfc in trap (frame=0xfffffe046813a390) at > /usr/src/sys/amd64/amd64/trap.c:442 > #12 0xffffffff80fbf491 in calltrap () at > /usr/src/sys/amd64/amd64/exception.S:236 > #13 0xffffffff80b4d91c in turnstile_broadcast (ts=0x0, queue=1) at > /usr/src/sys/kern/subr_turnstile.c:837 > #14 0xffffffff80ae5e1f in __rw_wunlock_hard (c=0xfffff803f886d960, > tid=<value optimized out>, file=<value optimized out>, line=<value > optimized out>) > at /usr/src/sys/kern/kern_rwlock.c:1027 > #15 0xffffffff80e525dc in vm_map_delete (map=<value optimized out>, > start=<value optimized out>, end=<value optimized out>) at > /usr/src/sys/vm/vm_map.c:2960 > #16 0xffffffff80e54477 in vm_map_remove (map=0xfffff8035540f000, > start=140737488355328, end=1) at /usr/src/sys/vm/vm_map.c:3077 > #17 0xffffffff80a9863f in exec_new_vmspace (imgp=0xfffffe046813a860, > sv=0xffffffff81a596e8) at /usr/src/sys/kern/kern_exec.c:1096 > #18 0xffffffff80a6ced8 in exec_elf64_imgact (imgp=<value optimized out>) at > /usr/src/sys/kern/imgact_elf.c:896 > #19 0xffffffff80a9670d in kern_execve (td=<value optimized out>, > args=<value optimized out>, mac_p=0x0) at /usr/src/sys/kern/kern_exec.c:603 > #20 0xffffffff80a95b9c in sys_execve (td=0xfffff8032893aa00, > uap=0xfffffe046813ab80) at /usr/src/sys/kern/kern_exec.c:219 > #21 0xffffffff80fddde8 in amd64_syscall (td=<value optimized out>, > traced=0) at subr_syscall.c:135 > #22 0xffffffff80fbf77b in Xfast_syscall () at > /usr/src/sys/amd64/amd64/exception.S:396 > #23 0x0000000800b468ea in ?? () > Previous frame inner to this frame (corrupt stack?) > (kgdb) up 11 > #11 0xffffffff80fdcbfc in trap (frame=0xfffffe046813a390) at > /usr/src/sys/amd64/amd64/trap.c:442 > 442 (void) trap_pfault(frame, FALSE); > (kgdb) list > 437 > 438 KASSERT(cold || td->td_ucred != NULL, > 439 ("kernel trap doesn't have ucred")); > 440 switch (type) { > 441 case T_PAGEFLT: /* page fault */ > 442 (void) trap_pfault(frame, FALSE); > 443 goto out; > 444 > 445 case T_DNA: > 446 if (PCB_USER_FPU(td->td_pcb)) > (kgdb) print td > $1 = (struct thread *) 0xfffff8032893aa00 > (kgdb) print td->td_ucred > $2 = (struct ucred *) 0xfffff8004005ec00 > (kgdb) print type > $3 = 12 > (kgdb) up > #12 0xffffffff80fbf491 in calltrap () at > /usr/src/sys/amd64/amd64/exception.S:236 > 236 call trap_check > Current language: auto; currently asm > (kgdb) up > #13 0xffffffff80b4d91c in turnstile_broadcast (ts=0x0, queue=1) at > /usr/src/sys/kern/subr_turnstile.c:837 > 837 TAILQ_CONCAT(&ts->ts_pending, &ts->ts_blocked[queue], > td_lockq); ts == NULL usually means that the unlock was performed on unlocked rwlock. Add witness and all other usual stuff to your kernel config, this should give more useful information earlier. Still, I do not see how an object can become unlocked at the line 2690. > Current language: auto; currently minimal > (kgdb) up > #14 0xffffffff80ae5e1f in __rw_wunlock_hard (c=0xfffff803f886d960, > tid=<value optimized out>, file=<value optimized out>, line=<value > optimized out>) > at /usr/src/sys/kern/kern_rwlock.c:1027 > 1027 turnstile_broadcast(ts, queue); > (kgdb) up > #15 0xffffffff80e525dc in vm_map_delete (map=<value optimized out>, > start=<value optimized out>, end=<value optimized out>) at > /usr/src/sys/vm/vm_map.c:2960 > 2960 VM_OBJECT_WUNLOCK(object); > (kgdb) up > #16 0xffffffff80e54477 in vm_map_remove (map=0xfffff8035540f000, > start=140737488355328, end=1) at /usr/src/sys/vm/vm_map.c:3077 > 3077 result = vm_map_delete(map, start, end); > (kgdb) up > #17 0xffffffff80a9863f in exec_new_vmspace (imgp=0xfffffe046813a860, > sv=0xffffffff81a596e8) at /usr/src/sys/kern/kern_exec.c:1096 > 1096 vm_map_remove(map, vm_map_min(map), > vm_map_max(map)); > (kgdb) up > #18 0xffffffff80a6ced8 in exec_elf64_imgact (imgp=<value optimized out>) at > /usr/src/sys/kern/imgact_elf.c:896 > 896 error = exec_new_vmspace(imgp, sv); > (kgdb) up > #19 0xffffffff80a9670d in kern_execve (td=<value optimized out>, > args=<value optimized out>, mac_p=0x0) at /usr/src/sys/kern/kern_exec.c:603 > 603 error = (*execsw[i]->ex_imgact)(imgp); > (kgdb) up > #20 0xffffffff80a95b9c in sys_execve (td=0xfffff8032893aa00, > uap=0xfffffe046813ab80) at /usr/src/sys/kern/kern_exec.c:219 > 219 error = kern_execve(td, &args, NULL); > (kgdb) up > #21 0xffffffff80fddde8 in amd64_syscall (td=<value optimized out>, > traced=0) at subr_syscall.c:135 > 135 error = (sa->callp->sy_call)(td, sa->args); > (kgdb) up > #22 0xffffffff80fbf77b in Xfast_syscall () at > /usr/src/sys/amd64/amd64/exception.S:396 > 396 call amd64_syscall > Current language: auto; currently asm > (kgdb) up > #23 0x0000000800b468ea in ?? () > (kgdb) up > Initial frame selected; you cannot go up. > (kgdb) quit > > > > Regards, > Ben > > -- > From: Benjamin Woods > woodsb02@gmail.com > _______________________________________________ > freebsd-current@freebsd.org mailing list > https://lists.freebsd.org/mailman/listinfo/freebsd-current > To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org"
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20160922130114.GT38409>