Skip site navigation (1)Skip section navigation (2)
Date:      Sat, 05 Feb 2022 17:30:14 +0000
From:      bugzilla-noreply@freebsd.org
To:        virtualization@FreeBSD.org
Subject:   [Bug 261198] bhyve host panics with: spin lock 0xffffffff81eac800 (callout) helpanic: spin lock held too long
Message-ID:  <bug-261198-27103-Xp7EeRjKgg@https.bugs.freebsd.org/bugzilla/>
In-Reply-To: <bug-261198-27103@https.bugs.freebsd.org/bugzilla/>
References:  <bug-261198-27103@https.bugs.freebsd.org/bugzilla/>

next in thread | previous in thread | raw e-mail | index | archive | help
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=3D261198

Mark Johnston <markj@FreeBSD.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|New                         |Open
                 CC|                            |markj@FreeBSD.org

--- Comment #5 from Mark Johnston <markj@FreeBSD.org> ---
I hit what appears to be the same thing.  The system was running a bunch of
CPU-intensive FreeBSD guests and a buildworld on the host, somewhat
oversubscribed.

One of the per-CPU callout spinlocks was held for too long:

spin lock 0xffffffff81ead8c0 (callout) held by 0xfffffe00806e21e0 (tid 1002=
03)
too long
panic: spin lock held too long=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=
=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=
=20=20=20=20=20=20=20=20=20=20
cpuid =3D 17=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=
=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=
=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20
time =3D 1644078265=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=
=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=
=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20
KDB: stack backtrace:=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=
=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=
=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe01c0318=
4e0=20=20
vpanic() at vpanic+0x17f/frame 0xfffffe01c0318530=20=20=20=20=20=20=20=20=
=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20
panic() at panic+0x43/frame 0xfffffe01c0318590=20=20=20=20=20=20=20=20=20=
=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20
_mtx_lock_indefinite_check() at _mtx_lock_indefinite_check+0x74/frame
0xfffffe01c03185b0=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=
=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=
=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20
_mtx_lock_spin_cookie() at _mtx_lock_spin_cookie+0xd5/frame 0xfffffe01c0318=
620=20=20
__mtx_lock_spin_flags() at __mtx_lock_spin_flags+0xd5/frame 0xfffffe01c0318=
660=20=20
_callout_stop_safe() at _callout_stop_safe+0x145/frame 0xfffffe01c03186d0=
=20=20=20=20=20=20=20
sleepq_remove_thread() at sleepq_remove_thread+0x1ad/frame 0xfffffe01c03186=
f0=20=20=20
sleepq_resume_thread() at sleepq_resume_thread+0xaa/frame 0xfffffe01c031873=
0=20=20=20=20
wakeup_one() at wakeup_one+0x1f/frame 0xfffffe01c0318750=20=20=20=20=20=20=
=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20
vcpu_notify_event() at vcpu_notify_event+0x4e/frame 0xfffffe01c0318780

If I look at the mutex state, though, the lock is unheld:

(kgdb) frame 5
#5  0xffffffff80be46d4 in _mtx_lock_indefinite_check
(m=3Dm@entry=3D0xffffffff81ead8c0 <cc_cpu+8000>,
ldap=3Dldap@entry=3D0xfffffe01c03185e0)
    at /usr/home/markj/src/freebsd/sys/kern/kern_mutex.c:1256
1256                    panic("spin lock held too long");
(kgdb) p td
$1 =3D (struct thread *) 0xfffffe00806e21e0
(kgdb) p m
$2 =3D (struct mtx *) 0xffffffff81ead8c0 <cc_cpu+8000>
(kgdb) p *m
$3 =3D {lock_object =3D {lo_name =3D 0xffffffff8130481b "callout", lo_flags=
 =3D 196608,
lo_data =3D 0, lo_witness =3D 0x0}, mtx_lock =3D 0}
(kgdb) p td->td_tid
$4 =3D 100203
(kgdb) tid 100203
(kgdb) bt
#0  cpustop_handler () at /usr/home/markj/src/freebsd/sys/x86/x86/mp_x86.c:=
1490
#1  0xffffffff810ab739 in ipi_nmi_handler () at
/usr/home/markj/src/freebsd/sys/x86/x86/mp_x86.c:1447
#2  0xffffffff810e4dd6 in trap (frame=3D0xfffffe006baf5f30) at
/usr/home/markj/src/freebsd/sys/amd64/amd64/trap.c:240
#3  <signal handler called>
#4  rdtsc () at /usr/home/markj/src/freebsd/sys/amd64/include/cpufunc.h:355
#5  delay_tsc (n=3D1) at /usr/home/markj/src/freebsd/sys/x86/x86/delay.c:68
#6  cpu_lock_delay () at /usr/home/markj/src/freebsd/sys/x86/x86/delay.c:140
#7  0xffffffff80be46d9 in _mtx_lock_indefinite_check (m=3D0x22000ee4,
m@entry=3D0xfffffe006baf70c0, ldap=3D0x9927a00000000,
ldap@entry=3D0xffffffff82d07e70)
    at /usr/home/markj/src/freebsd/sys/kern/kern_mutex.c:1243
#8  0xffffffff80be490a in thread_lock_flags_ (td=3D0xfffffe00806e21e0, opts=
=3D0,
file=3D<optimized out>, line=3D<optimized out>)
    at /usr/home/markj/src/freebsd/sys/kern/kern_mutex.c:929
#9  0xffffffff80c27226 in softclock_thread (arg=3Darg@entry=3D0xffffffff81e=
ad8c0
<cc_cpu+8000>) at /usr/home/markj/src/freebsd/sys/kern/kern_timeout.c:840
#10 0xffffffff80bbfc40 in fork_exit (callout=3D0xffffffff80c271e0
<softclock_thread>, arg=3D0xffffffff81ead8c0 <cc_cpu+8000>,
frame=3D0xffffffff82d07f40)
    at /usr/home/markj/src/freebsd/sys/kern/kern_fork.c:1102
#11 <signal handler called>
#12 0x35f3ad8ceff1b8b6 in ?? ()

So this softclock thread is spinning for its own thread lock, which is:

(kgdb) p/x *m@entry
$5 =3D {lock_object =3D {lo_name =3D 0xfffffe006baf7d40, lo_flags =3D 0x300=
00, lo_data
=3D 0x0, lo_witness =3D 0x0}, mtx_lock =3D 0xfffffe0237c51c80}

and the owning thread is the thread that panicked:

(kgdb) p ((struct thread *)$5.mtx_lock)->td_tid
$6 =3D 161889

Hmm.  I somewhat doubt this is bhyve specific, bhyve just exercises the cal=
lout
subsystem heavily.  This is probably a result of commit
74cf7cae4d2238ae6d1c949b2bbd077e1ab33634 somehow.

--=20
You are receiving this mail because:
You are the assignee for the bug.=



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?bug-261198-27103-Xp7EeRjKgg>