Skip site navigation (1)Skip section navigation (2)
Date:      Wed, 10 Aug 2016 18:35:15 +0430
From:      Hooman Fazaeli <hoomanfazaeli@gmail.com>
To:        FreeBSD Hackers <freebsd-hackers@freebsd.org>
Subject:   9.3-RELEASE panic: spin lock held too long
Message-ID:  <57AB349B.2010805@gmail.com>

next in thread | raw e-mail | index | archive | help

Hi

on a 9.3-REL i386 box we have occasional "spin lock held too long" panics.

System info:
-------------
- Intel(R) Core(TM) i5-4440 CPU @ 3.10GHz CPU (4 cores, no hyper theading)
- 4G non-ECC RAM
- asterisk-1.8.30.0 from ports
- dahdi-kmod26-2.6.1.r10738 from ports
- powerd disabled.
- Workload: ISDN & SIP call processing.
------------

The panics are either on 'sched lock' or 'turnstile lock' spin locks.

PANIC 1
=======
As below trace shows:

1- input arrives on a UDP socket
2- doselwakeup is called.
3- That wakeup call ends up in sched_add.
4- sched_add grabs 'sched lock 0' spin lock, and aparenlty, holds it for a too long time.
5- The pancing thread does the same calls as owner thread but panics because
    it can't grab the the same spin lock.

 > kgdb /boot/kernel/kernel /var/crash/vmcore.14
...
kernel trap 12 with interrupts disabled
spin lock 0xc140a4c0 (sched lock 0) held by 0xc807a2f0 (tid 100045) too long
panic: spin lock held too long
cpuid = 3
KDB: stack backtrace:
#0 0xc0b17eaf at kdb_backtrace+0x4f
#1 0xc0adeaef at panic+0x16f
#2 0xc0ac9cff at _mtx_lock_spin_failed+0x3f
#3 0xc0ac9e75 at _mtx_lock_spin+0x165
#4 0xc0b096c5 at sched_add+0xf5
#5 0xc0b09890 at sched_wakeup+0x70
#6 0xc0ae8968 at setrunnable+0x88
#7 0xc0b2227e at sleepq_resume_thread+0x12e
#8 0xc0b22fd3 at sleepq_broadcast+0xd3
#9 0xc0a8c4cd at cv_broadcastpri+0x4d
#10 0xc0b2a406 at doselwakeup+0xe6
#11 0xc0b2a4be at selwakeuppri+0xe
#12 0xc0a9fa59 at knote_enqueue+0x59
#13 0xc0aa073f at kqueue_register+0x84f
#14 0xc0aa09f3 at kern_kevent+0xf3
#15 0xc0aa16ce at sys_kevent+0x19e
#16 0xc0fcc8c3 at syscall+0x443
#17 0xc0fb60f1 at Xint0x80_syscall+0x21
Uptime: 7m44s

 > bt
#0  doadump (textdump=1) at pcpu.h:250
#1  0xc0ade835 in kern_reboot (howto=260) at ../../../kern/kern_shutdown.c:454
#2  0xc0adeb32 in panic (fmt=<value optimized out>) at ../../../kern/kern_shutdown.c:642
#3  0xc0ac9cff in _mtx_lock_spin_failed (m=0x0) at ../../../kern/kern_mutex.c:515
#4  0xc0ac9e75 in _mtx_lock_spin (m=0xc140a4c0, tid=3384060112, opts=0, file=0x0, line=0) at ../../../kern/kern_mutex.c:557
#5  0xc0b096c5 in sched_add (td=0xc9b00bc0, flags=0) at ../../../kern/sched_ule.c:1153
#6  0xc0b09890 in sched_wakeup (td=0xc9b00bc0) at ../../../kern/sched_ule.c:1991
#7  0xc0ae8968 in setrunnable (td=0xc9b00bc0) at ../../../kern/kern_synch.c:537
#8  0xc0b2227e in sleepq_resume_thread (sq=0xc869fd40, td=0xc9b00bc0, pri=104) at ../../../kern/subr_sleepqueue.c:763
#9  0xc0b22fd3 in sleepq_broadcast (wchan=0xc95741e4, flags=1, pri=104, queue=0) at ../../../kern/subr_sleepqueue.c:865
#10 0xc0a8c4cd in cv_broadcastpri (cvp=0xc95741e4, pri=104) at ../../../kern/kern_condvar.c:448
#11 0xc0b2a406 in doselwakeup (sip=0xc963faac, pri=104) at ../../../kern/sys_generic.c:1683
#12 0xc0b2a4be in selwakeuppri (sip=0xc963faac, pri=104) at ../../../kern/sys_generic.c:1651
#13 0xc0a9fa59 in knote_enqueue (kn=<value optimized out>) at ../../../kern/kern_event.c:1786
#14 0xc0aa073f in kqueue_register (kq=0xc963fa80, kev=0xf0e07b20, td=0xc9b4a8d0, waitok=1) at ../../../kern/kern_event.c:1154
#15 0xc0aa09f3 in kern_kevent (td=0xc9b4a8d0, fd=152, nchanges=2, nevents=0, k_ops=0xf0e07c20, timeout=0x0) at ../../../kern/kern_event.c:850
#16 0xc0aa16ce in sys_kevent (td=0xc9b4a8d0, uap=0xf0e07ccc) at ../../../kern/kern_event.c:771
#17 0xc0fcc8c3 in syscall (frame=0xf0e07d08) at subr_syscall.c:135
#18 0xc0fb60f1 in Xint0x80_syscall () at ../../../i386/i386/exception.s:270
#19 0x00000033 in ?? ()
Previous frame inner to this frame (corrupt stack?)

(kgdb) tid 100045
[Switching to thread 34 (Thread 100045)]#0  cpustop_handler () at ../../../i386/i386/mp_machdep.c:1496
1496    ../../../i386/i386/mp_machdep.c: No such file or directory.
         in ../../../i386/i386/mp_machdep.c

(kgdb) bt
#0  cpustop_handler () at ../../../i386/i386/mp_machdep.c:1496
#1  0xc0fc1805 in ipi_nmi_handler () at ../../../i386/i386/mp_machdep.c:1478
#2  0xc0fccf38 in trap (frame=0xe1be9620) at ../../../i386/i386/trap.c:227
#3  0xc0fb605c in calltrap () at ../../../i386/i386/exception.s:170
#4  0xc0fddf45 in DELAY (n=1) at ../../../x86/isa/clock.c:283
#5  0xc0ac9e6c in _mtx_lock_spin (m=0xc140b1c0, tid=3355943664, opts=0, file=0x0, line=0) at ../../../kern/kern_mutex.c:555
#6  0xc0b096c5 in sched_add (td=0xc7de0000, flags=4) at ../../../kern/sched_ule.c:1153
#7  0xc0aae008 in intr_event_schedule_thread (ie=<value optimized out>) at ../../../kern/kern_intr.c:921
#8  0xc0aaf8fd in swi_sched (cookie=0xc7d95300, flags=<value optimized out>) at ../../../kern/kern_intr.c:1174
#9  0xc0af5b53 in callout_tick () at ../../../kern/kern_timeout.c:361
#10 0xc0a8c3e1 in hardclock_cnt (cnt=1, usermode=0) at ../../../kern/kern_clock.c:554
#11 0xc0fd403f in handleevents (now=0xe1be9870, fake=0) at ../../../kern/kern_clocksource.c:215
#12 0xc0fd5d3f in timercb (et=0xc148c860, arg=0x0) at ../../../kern/kern_clocksource.c:390
#13 0xc0fe731f in lapic_handle_timer (frame=0xe1be98b4) at ../../../x86/x86/local_apic.c:818
#14 0xc0fb6600 in Xtimerint () at apic_vector.s:108
#15 0xc0aca0df in _mtx_lock_sleep (m=0xc95741d0, tid=3355943664, opts=<value optimized out>, file=0x0, line=0) at ../../../kern/kern_mutex.c:400
#16 0xc0b2a3e9 in doselwakeup (sip=0xc9a491f4, pri=104) at ../../../kern/sys_generic.c:1681
#17 0xc0b2a4be in selwakeuppri (sip=0xc9a491f4, pri=104) at ../../../kern/sys_generic.c:1651
#18 0xc0b4886f in sowakeup (so=0xc9a491a0, sb=0xc9a491f4) at ../../../kern/uipc_sockbuf.c:182
#19 0xc0c9680d in udp_append (inp=0xc87bb0fc, ip=0xc8ead80e, n=0xc8ebf500, off=28, udp_in=0xe1be9a7c) at ../../../netinet/udp_usrreq.c:330
#20 0xc0c98ad6 in udp_input (m=0xc8ebf500, off=20) at ../../../netinet/udp_usrreq.c:616
#21 0xc0c0d5e7 in ip_input (m=0xc8ebf500) at ../../../netinet/ip_input.c:760
#22 0xc0bae68f in netisr_dispatch_src (proto=1, source=0, m=0xc8ebf500) at ../../../net/netisr.c:1013
#23 0xc0bae930 in netisr_dispatch (proto=1, m=0xc8ebf500) at ../../../net/netisr.c:1104
#24 0xc0ba50c1 in ether_demux (ifp=0xc807f000, m=0xc8ebf500) at ../../../net/if_ethersubr.c:943
#25 0xc0ba551f in ether_nh_input (m=0xc8ebf500) at ../../../net/if_ethersubr.c:762
#26 0xc0bae68f in netisr_dispatch_src (proto=9, source=0, m=0xc8ebf500) at ../../../net/netisr.c:1013
#27 0xc0bae930 in netisr_dispatch (proto=9, m=0xc8ebf500) at ../../../net/netisr.c:1104
#28 0xc0ba4c09 in ether_input (ifp=0xc807f000, m=0xc8ebf500) at ../../../net/if_ethersubr.c:803
#29 0xc06930c2 in igb_rxeof (que=0xc8078180, count=99, done=0x0) at ../../../dev/e1000/if_igb.c:4735
#30 0xc0693328 in igb_msix_que (arg=0xc8078180) at ../../../dev/e1000/if_igb.c:1601
#31 0xc0aae18b in intr_event_execute_handlers (p=0xc7d9e5b0, ie=0xc8077a00) at ../../../kern/kern_intr.c:1272
#32 0xc0aaf990 in ithread_loop (arg=0xc80a43e0) at ../../../kern/kern_intr.c:1285
#33 0xc0aaa96f in fork_exit (callout=0xc0aaf910 <ithread_loop>, arg=0xc80a43e0, frame=0xe1be9d08) at ../../../kern/kern_fork.c:996
#34 0xc0fb6104 in fork_trampoline () at ../../../i386/i386/exception.s:279

PANIC 2
=======
 > kgdb /boot/kernel/kernel /var/crash/vmcore.15
...
...
Unread portion of the kernel message buffer:
spin lock 0xc9976800 (turnstile lock) held by 0xc7da12f0 (tid 100005) too long
panic: spin lock held too long
cpuid = 2
KDB: stack backtrace:
#0 0xc0b17eaf at kdb_backtrace+0x4f
#1 0xc0adeaef at panic+0x16f
#2 0xc0ac9cff at _mtx_lock_spin_failed+0x3f
#3 0xc0ac9e75 at _mtx_lock_spin+0x165
#4 0xc0b26f87 at turnstile_lookup+0x87
#5 0xc0ac9c77 at _mtx_unlock_sleep+0x47
#6 0xc0b2a443 at doselwakeup+0x123
#7 0xc0b2a4be at selwakeuppri+0xe
#8 0xc0a9fa59 at knote_enqueue+0x59
#9 0xc0aa073f at kqueue_register+0x84f
#10 0xc0aa09f3 at kern_kevent+0xf3
#11 0xc0aa16ce at sys_kevent+0x19e
#12 0xc0fcc8c3 at syscall+0x443
#13 0xc0fb60f1 at Xint0x80_syscall+0x21
Uptime: 2h6m20s
Physical memory: 3486 MB
Dumping 162 MB: 147 131 115 99 83 67 51 35 19 3
...
...
(kgdb) bt
#0  doadump (textdump=1) at pcpu.h:250
#1  0xc0ade835 in kern_reboot (howto=260) at ../../../kern/kern_shutdown.c:454
#2  0xc0adeb32 in panic (fmt=<value optimized out>) at ../../../kern/kern_shutdown.c:642
#3  0xc0ac9cff in _mtx_lock_spin_failed (m=0x0) at ../../../kern/kern_mutex.c:515
#4  0xc0ac9e75 in _mtx_lock_spin (m=0xc9976800, tid=3383440864, opts=0, file=0x0, line=0) at ../../../kern/kern_mutex.c:557
#5  0xc0b26f87 in turnstile_lookup (lock=0xc9639550) at ../../../kern/subr_turnstile.c:600
#6  0xc0ac9c77 in _mtx_unlock_sleep (m=0xc9639550, opts=0, file=0x0, line=0) at ../../../kern/kern_mutex.c:712
#7  0xc0b2a443 in doselwakeup (sip=0xc9ef702c, pri=104) at ../../../kern/sys_generic.c:1684
#8  0xc0b2a4be in selwakeuppri (sip=0xc9ef702c, pri=104) at ../../../kern/sys_generic.c:1651
#9  0xc0a9fa59 in knote_enqueue (kn=<value optimized out>) at ../../../kern/kern_event.c:1786
#10 0xc0aa073f in kqueue_register (kq=0xc9ef7000, kev=0xf0fc8b20, td=0xc9ab35e0, waitok=1) at ../../../kern/kern_event.c:1154
#11 0xc0aa09f3 in kern_kevent (td=0xc9ab35e0, fd=284, nchanges=2, nevents=0, k_ops=0xf0fc8c20, timeout=0x0) at ../../../kern/kern_event.c:850
#12 0xc0aa16ce in sys_kevent (td=0xc9ab35e0, uap=0xf0fc8ccc) at ../../../kern/kern_event.c:771
#13 0xc0fcc8c3 in syscall (frame=0xf0fc8d08) at subr_syscall.c:135
#14 0xc0fb60f1 in Xint0x80_syscall () at ../../../i386/i386/exception.s:270
#15 0x00000033 in ?? ()

(kgdb) tid 100005
[Switching to thread 18 (Thread 100005)]#0  sched_switch (td=0xc7da12f0, newtd=0xc9ab35e0, flags=259) at ../../../kern/sched_ule.c:1904
1904    ../../../kern/sched_ule.c: No such file or directory.
         in ../../../kern/sched_ule.c

(kgdb) bt
#0  sched_switch (td=0xc7da12f0, newtd=0xc9ab35e0, flags=259) at ../../../kern/sched_ule.c:1904
#1  0xc0ae8b43 in mi_switch (flags=259, newtd=0x0) at ../../../kern/kern_synch.c:485
#2  0xc0b28214 in turnstile_wait (ts=0xc9976800, owner=0xc9ab35e0, queue=<value optimized out>) at ../../../kern/subr_turnstile.c:753
#3  0xc0aca177 in _mtx_lock_sleep (m=0xc9639550, tid=3352957680, opts=<value optimized out>, file=0x0, line=0) at ../../../kern/kern_mutex.c:472
#4  0xc0b2a3e9 in doselwakeup (sip=0xc94cd0e0, pri=-1) at ../../../kern/sys_generic.c:1681
#5  0xc0b2a4d0 in selwakeup (sip=0xc94cd0e0) at ../../../kern/sys_generic.c:1642
#6  0xc9151789 in dahdi_wake_up_interruptible (sel=0xc94cd0e0)
     at /usr/ports/misc/dahdi-kmod26/work/dahdi-freebsd-2.6.1-r10738/bsd-kmod/dahdi/../../drivers/dahdi/dahdi-base.c:227
#7  0xc915946f in __dahdi_transmit_chunk (chan=0xc94cd000, buf=0xc85efba8 "ÕÕÕUUUÕÕ", '˙' <repeats 80 times>)
     at /usr/ports/misc/dahdi-kmod26/work/dahdi-freebsd-2.6.1-r10738/bsd-kmod/dahdi/../../drivers/dahdi/dahdi-base.c:7970
#8  0xc915fea7 in _dahdi_transmit (span=0xc858d03c)
     at /usr/ports/misc/dahdi-kmod26/work/dahdi-freebsd-2.6.1-r10738/bsd-kmod/dahdi/../../drivers/dahdi/dahdi-base.c:9527
#9  0xc941d0e4 in __transmit_span (ts=0xc858d000)
     at /usr/ports/misc/dahdi-kmod26/work/dahdi-freebsd-2.6.1-r10738/bsd-kmod/wct4xxp/../../drivers/dahdi/wct4xxp/base.c:3139
#10 0xc941d084 in t4_prep_gen2 (wc=0xc8612400)
     at /usr/ports/misc/dahdi-kmod26/work/dahdi-freebsd-2.6.1-r10738/bsd-kmod/wct4xxp/../../drivers/dahdi/wct4xxp/base.c:3170
#11 0xc941fd41 in _t4_interrupt_gen2 (dev_id=0xc8612400)
     at /usr/ports/misc/dahdi-kmod26/work/dahdi-freebsd-2.6.1-r10738/bsd-kmod/wct4xxp/../../drivers/dahdi/wct4xxp/base.c:4233
#12 0xc0aaef07 in intr_event_handle (ie=0xc7d9c800, frame=0xc7bc4bec) at ../../../kern/kern_intr.c:1446
#13 0xc0fe42d9 in intr_execute_handlers (isrc=0xc7d8de84, frame=0xc7bc4bec) at ../../../x86/x86/intr_machdep.c:266
#14 0xc0fe7376 in lapic_handle_intr (vector=50, frame=0xc7bc4bec) at ../../../x86/x86/local_apic.c:780
#15 0xc0fb6455 in Xapic_isr1 () at apic_vector.s:89
#16 0xc0fbc16b in spinlock_exit () at cpufunc.h:373
#17 0xc0fd540c in cpu_activeclock () at ../../../kern/kern_clocksource.c:825
#18 0xc0fbee72 in cpu_idle (busy=0) at ../../../i386/i386/machdep.c:1386
#19 0xc0b08c06 in sched_idletd (dummy=0x0) at ../../../kern/sched_ule.c:2609
#20 0xc0aaa96f in fork_exit (callout=0xc0b088a0 <sched_idletd>, arg=0x0, frame=0xc7bc4d08) at ../../../kern/kern_fork.c:996
#21 0xc0fb6104 in fork_trampoline () at ../../../i386/i386/exception.s:279

------------

I have read all the past relevant posts related to my problem.
However, the suggested workarounds don't work for or are irrelavant to
my problem.

pls. let me know if more information is needed.
Thanks in advance.



-- 
Best regards
Hooman Fazaeli




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?57AB349B.2010805>