Date: Wed, 10 Aug 2016 12:58:29 -0400 From: Ryan Stone <rysto32@gmail.com> To: Hooman Fazaeli <hoomanfazaeli@gmail.com> Cc: Konstantin Belousov <kostikbel@gmail.com>, FreeBSD Hackers <freebsd-hackers@freebsd.org> Subject: Re: 9.3-RELEASE panic: spin lock held too long Message-ID: <CAFMmRNw3hFWy0dqwvnQn4wdYdWvU=-73N4gYffvj2HGrvefk7Q@mail.gmail.com> In-Reply-To: <57AB462A.2080608@gmail.com> References: <57AB349B.2010805@gmail.com> <20160810141948.GP83214@kib.kiev.ua> <57AB462A.2080608@gmail.com>
next in thread | previous in thread | raw e-mail | index | archive | help
On Wed, Aug 10, 2016 at 11:20 AM, Hooman Fazaeli <hoomanfazaeli@gmail.com> wrote: > > kgdb /boot/kernel/kernel /var/crash/vmcore.14 > ... > ... > (kgdb) bt > #0 doadump (textdump=1) at pcpu.h:250 > #1 0xc0ade835 in kern_reboot (howto=260) at ../../../kern/kern_shutdown.c: > 454 > #2 0xc0adeb32 in panic (fmt=<value optimized out>) at > ../../../kern/kern_shutdown.c:642 > #3 0xc0ac9cff in _mtx_lock_spin_failed (m=0x0) at > ../../../kern/kern_mutex.c:515 > #4 0xc0ac9e75 in _mtx_lock_spin (m=0xc140a4c0, tid=3384060112, opts=0, > file=0x0, line=0) at ../../../kern/kern_mutex.c:557 > #5 0xc0b096c5 in sched_add (td=0xc9b00bc0, flags=0) at > ../../../kern/sched_ule.c:1153 > #6 0xc0b09890 in sched_wakeup (td=0xc9b00bc0) at > ../../../kern/sched_ule.c:1991 > #7 0xc0ae8968 in setrunnable (td=0xc9b00bc0) at > ../../../kern/kern_synch.c:537 > #8 0xc0b2227e in sleepq_resume_thread (sq=0xc869fd40, td=0xc9b00bc0, > pri=104) at ../../../kern/subr_sleepqueue.c:763 > #9 0xc0b22fd3 in sleepq_broadcast (wchan=0xc95741e4, flags=1, pri=104, > queue=0) at ../../../kern/subr_sleepqueue.c:865 > #10 0xc0a8c4cd in cv_broadcastpri (cvp=0xc95741e4, pri=104) at > ../../../kern/kern_condvar.c:448 > #11 0xc0b2a406 in doselwakeup (sip=0xc963faac, pri=104) at > ../../../kern/sys_generic.c:1683 > #12 0xc0b2a4be in selwakeuppri (sip=0xc963faac, pri=104) at > ../../../kern/sys_generic.c:1651 > #13 0xc0a9fa59 in knote_enqueue (kn=<value optimized out>) at > ../../../kern/kern_event.c:1786 > #14 0xc0aa073f in kqueue_register (kq=0xc963fa80, kev=0xf0e07b20, > td=0xc9b4a8d0, waitok=1) at ../../../kern/kern_event.c:1154 > #15 0xc0aa09f3 in kern_kevent (td=0xc9b4a8d0, fd=152, nchanges=2, > nevents=0, k_ops=0xf0e07c20, timeout=0x0) at ../../../kern/kern_event.c:850 > #16 0xc0aa16ce in sys_kevent (td=0xc9b4a8d0, uap=0xf0e07ccc) at > ../../../kern/kern_event.c:771 > #17 0xc0fcc8c3 in syscall (frame=0xf0e07d08) at subr_syscall.c:135 > #18 0xc0fb60f1 in Xint0x80_syscall () at ../../../i386/i386/exception.s > :270 > #19 0x00000033 in ?? () > Previous frame inner to this frame (corrupt stack?) > > (kgdb) up 4 > #4 0xc0ac9e75 in _mtx_lock_spin (m=0xc140a4c0, tid=3384060112, opts=0, > file=0x0, line=0) at ../../../kern/kern_mutex.c:557 > 557 ../../../kern/kern_mutex.c: No such file or directory. > in ../../../kern/kern_mutex.c > > (kgdb) p *m > $1 = {lock_object = {lo_name = 0xc140ab08 "sched lock 0", lo_flags = > 720896, lo_data = 0, lo_witness = 0x0}, mtx_lock = 3355943664} > > ------------ > > As you see, the mtx_lock is 3355943664 (0xc807a2f0), the same TID reported > in panic string. > > (kgdb) info threads > ... > 34 Thread 100045 (PID=12: intr/irq267: igb0:que 0) sched_switch > (td=0xc807a2f0, newtd=0xc7da18d0, flags=265) at > ../../../kern/sched_ule.c:1904 > ... This sounds somewhat familiar. Is it always 'sched lock 0' that is ultimately leaked? Could you try applying this patch and seeing whether the new KASSERT triggers? https://people.freebsd.org/~rstone/patches/sched_balance_kassert.diff
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CAFMmRNw3hFWy0dqwvnQn4wdYdWvU=-73N4gYffvj2HGrvefk7Q>