Date: Mon, 1 Jul 2002 10:09:03 +0800
From: "David Xu" <davidx@viasoft.com.cn>
To: "Matthew Dillon" <dillon@apollo.backplane.com>
Cc: "Julian Elischer" <julian@elischer.org>, <sgk@troutmask.apl.washington.edu>, <wa1ter@hotmail.com>, <freebsd-current@FreeBSD.ORG>
Subject: Re: KSE / interrupt panic (patch)
Message-ID: <002801c220a4$4637c2a0$ef01a8c0@davidwnt>
Now let me describe where the race is:
Thread A:                                           | Thread B:
cv_timedwait()                                      | softclock()
                                                    | cv_timedwait_end()
                                                    |
mtx_lock_spin(&sched_lock);                         | mtx_lock_spin(&sched_lock); /* suppose blocked!!! */
...                                                 |
callout_stop(&td->td_slpcallout)                    |
td->td_flags |= TDF_TIMEOUT;                        |
td->td_proc->p_stats->p_ru.ru_nivcsw++;             |
// problem, does not set thread state to TDS_SLP!!! |
mi_switch();                                        |
// run at once again                                |
...                                                 |
mtx_unlock_spin(&sched_lock);                       | mtx_lock_spin() returned !!!
thread is still running                             | if (td->td_flags & TDF_TIMEOUT) {
...                                                 |     td->td_flags &= ~TDF_TIMEOUT;
some place call mi_switch() and now on runqueue     |     setrunqueue(td); // crash
                                                    | }
----------------------------------------------------------
Here is the patch:
--- /sys/kern/kern_condvar.c.old Mon Jul 1 09:06:01 2002
+++ /sys/kern/kern_condvar.c Mon Jul 1 09:32:50 2002
@@ -396,6 +396,7 @@
 		 * between msleep and endtsleep.
 		 */
 		td->td_flags |= TDF_TIMEOUT;
+		td->td_state = TDS_SLP;
 		td->td_proc->p_stats->p_ru.ru_nivcsw++;
 		mi_switch();
 	}
@@ -472,6 +473,7 @@
 		 * between msleep and endtsleep.
 		 */
 		td->td_flags |= TDF_TIMEOUT;
+		td->td_state = TDS_SLP;
 		td->td_proc->p_stats->p_ru.ru_nivcsw++;
 		mi_switch();
 	}
-------------------------------------------------------------------
The bug arises because cv_timedwait() detects that the timeout callout is
already running but does not correctly wait for the callout to complete,
hence the panic.
BTW, the same bug also seems to exist in msleep() and endtsleep(); please
fix it!
-David Xu
--- David Xu <bsddiy@yahoo.com> wrote:
> The setrunqueue() call can simply be removed from cv_timedwait_end(),
> because there is a race between softclock() and callout_stop(). When
> cv_timedwait_end() loses the race, it means that the thread is already
> running (woken up by another thread), so calling setrunqueue() on it
> will of course panic.
> In cv_timedwait_end(), the statement "if (td->td_flags & TDF_TIMEOUT) {...}"
> is there to check for this race condition.
>
> -David Xu
>
> ----- Original Message -----
> From: "Matthew Dillon" <dillon@apollo.backplane.com>
> To: "Julian Elischer" <julian@elischer.org>
> Cc: "Steve Kargl" <sgk@troutmask.apl.washington.edu>; "walt" <wa1ter@hotmail.com>; <freebsd-current@FreeBSD.ORG>
> Sent: Monday, July 01, 2002 4:43 AM
> Subject: KSE / interrupt panic
>
>
> > Got another one. Different panic, same place.
> >
> > panic: setrunqueue: bad thread state
> > cpuid = 0; lapic.id = 01000000
> > Debugger("panic")
> > Stopped at Debugger+0x46: xchgl %ebx,in_Debugger.0
> > db> trace
> > Debugger(c02ec2ba) at Debugger+0x46
> > panic(c02ec8a9,c6461d80,c6461d80,c6461d80,c01afa30) at panic+0xd6
> > setrunqueue(c6461d80) at setrunqueue+0x1dd
> > cv_timedwait_end(c6461d80) at cv_timedwait_end+0x36
> > softclock(0) at softclock+0x159
> > ithread_loop(c229c700,df3eed48,c22aec00,c01b9c6c,0) at ithread_loop+0x12c
> > fork_exit(c01b9c6c,c229c700,df3eed48) at fork_exit+0xa8
> > fork_trampoline() at fork_trampoline+0x37
> > db> gdb
> > ...
> >
> > #0 Debugger (msg=0xc02ec2ba "panic")
> >     at /FreeBSD/FreeBSD-current/src/sys/i386/i386/db_interface.c:324
> > #1 0xc01c878a in panic (fmt=0xc02ec8a9 "setrunqueue: bad thread state")
> >     at /FreeBSD/FreeBSD-current/src/sys/kern/kern_shutdown.c:482
> > #2 0xc01cc6cd in setrunqueue (td=0xc6461d80)
> >     at /FreeBSD/FreeBSD-current/src/sys/kern/kern_switch.c:396
> > #3 0xc01afa66 in cv_timedwait_end (arg=0xc6461d80)
> >     at /FreeBSD/FreeBSD-current/src/sys/kern/kern_condvar.c:608
> > #4 0xc01d22c9 in softclock (dummy=0x0)
> >     at /FreeBSD/FreeBSD-current/src/sys/kern/kern_timeout.c:187
> > #5 0xc01b9d98 in ithread_loop (arg=0xc229c700)
> >     at /FreeBSD/FreeBSD-current/src/sys/kern/kern_intr.c:535
> > #6 0xc01b923c in fork_exit (callout=0xc01b9c6c <ithread_loop>,
> >     arg=0xc229c700, frame=0xdf3eed48)
> >     at /FreeBSD/FreeBSD-current/src/sys/kern/kern_fork.c:863
> >
> > I'm not sure why the panic was 'bad thread state' when gdb seems to
> > show it being stuck on 'unexpected ke present'. Maybe it was an
> > optimization and gdb is confused. The panic is definitely
> > 'bad thread state'.
> >
> > (gdb) print td->td_state
> > $2 = TDS_RUNQ
> >
> > setrunqueue() is being called on a thread which is already on the run
> > queue.
> >
> > -Matt
> >
> >
> > To Unsubscribe: send mail to majordomo@FreeBSD.org
> > with "unsubscribe freebsd-current" in the body of the message
