Date: Thu, 24 Feb 2005 13:33:42 -0500
From: John Baldwin <jhb@FreeBSD.org>
To: Kris Kennaway <kris@obsecurity.org>
Cc: julian@FreeBSD.org
Subject: Re: panic: Assertion td->td_sleepqueue != NULL failed at /usr/src/sys/kern/subr_sleepqueue.c:258
Message-ID: <200502241333.42942.jhb@FreeBSD.org>
In-Reply-To: <20050224013447.GA51370@xor.obsecurity.org>
References: <20050223235405.GB19137@xor.obsecurity.org> <20050223235515.GA19260@xor.obsecurity.org> <20050224013447.GA51370@xor.obsecurity.org>

On Wednesday 23 February 2005 08:34 pm, Kris Kennaway wrote:
> On Wed, Feb 23, 2005 at 03:55:15PM -0800, Kris Kennaway wrote:
> > On Wed, Feb 23, 2005 at 03:54:05PM -0800, Kris Kennaway wrote:
> > > I got this on a 12-processor e4500 running RELENG_5:
> > >
> > > panic: Assertion td->td_sleepqueue != NULL failed at
> > > /usr/src/sys/kern/subr_sleepqueue.c:258 cpuid = 0
> > > KDB: enter: panic
> > > [thread pid 1 tid 100003 ]
> > > Stopped at kdb_enter+0x38: ta %xcc, 1
> > > db> wh
> > > Tracing pid 1 tid 100003 td 0xfffff801385067b0
> > > panic() at panic+0x19c
> > > sleepq_add() at sleepq_add+0x168
> > > cv_wait() at cv_wait+0x174
> > > _sx_xlock() at _sx_xlock+0x64
> > > kern_wait() at kern_wait+0x3c
> > > wait4() at wait4+0x18
> > > syscall() at syscall+0x220
> > > -- syscall (7, FreeBSD ELF64, wait4) %o7=0x10a7b0 --
> >
> > 1 fffff80138505ab8 0 0 1 0004200 [SLPQ proctree
> > 0xc03de0c8][CPU 0] init
> >
> > > About the only nonstandard thing I did was set
> > > kern.sched.ipiwakeup.onecpu=1 which was suggested for working around
> > > other deadlocks. I don't recall if preemption is enabled for this
> > > machine (I didn't set it up). Is there any other online debugging I
> > > can do?
>
> No preemption. I did get a core, and I'll add KTR_PROC per discussion
> of this same panic when Peter Holm reported it in December.
>
> Kris

I've seen this locally once, and Peter and others have seen it as well.
It always happens with proctree, which is probably the most heavily
contended sx(9) lock in the system. The symptoms are that a thread was
asleep on the proctree sleep queue. It was made runnable by someone other
than the sleep queue code (so somehow TDI_SLEEPING was cleared somewhere
else besides subr_sleepqueue.c), and so when it resumes, it leaves its
sleep queue object behind in the sleepqueue chains table. It also still
has td_wchan and td_wmesg set.
(sleepq_remove_thread() clears those two when it takes a thread off of a
sleep queue and gives it a sleep queue object.)

-- 
John Baldwin <jhb@FreeBSD.org>  <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve"  =  http://www.FreeBSD.org
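
The symptoms described in the message can be sketched as a small userland
model in C. This is an illustration only, not kernel code: the model_*
functions, struct chain, and the sleeping flag (standing in for
TDI_SLEEPING) are hypothetical simplifications, and the locking and
per-channel lookup of the real sleep queue code are left out. The sketch
only shows why a wakeup that bypasses the sleep queue code leaves
td_sleepqueue NULL, so that the next attempt to sleep fails the check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Placeholder for the kernel's sleep queue object (contents irrelevant here). */
struct sleepqueue {
	int placeholder;
};

/* Only the thread fields mentioned in the message are modelled. */
struct thread {
	struct sleepqueue *td_sleepqueue; /* spare object held while awake */
	const void *td_wchan;             /* wait channel, set while asleep */
	const char *td_wmesg;             /* wait message, set while asleep */
	bool sleeping;                    /* assumption: stands in for TDI_SLEEPING */
};

/* Hypothetical stand-in for one slot of the sleepqueue chains table. */
#define MAX_DONATED 8
struct chain {
	struct sleepqueue *donated[MAX_DONATED];
	int ndonated;
};

/*
 * Going to sleep: the thread hands its queue object to the chain.
 * Returns false where the kernel assertion td->td_sleepqueue != NULL
 * would fire.
 */
bool
model_sleepq_add(struct chain *c, struct thread *td, const void *wchan,
    const char *wmesg)
{
	if (td->td_sleepqueue == NULL)
		return (false);
	assert(c->ndonated < MAX_DONATED);
	c->donated[c->ndonated++] = td->td_sleepqueue;
	td->td_sleepqueue = NULL;
	td->td_wchan = wchan;
	td->td_wmesg = wmesg;
	td->sleeping = true;
	return (true);
}

/*
 * Proper wakeup path: the thread is taken off the queue, gets a queue
 * object back, and has td_wchan and td_wmesg cleared.
 */
void
model_sleepq_remove_thread(struct chain *c, struct thread *td)
{
	assert(c->ndonated > 0);
	td->td_sleepqueue = c->donated[--c->ndonated];
	td->td_wchan = NULL;
	td->td_wmesg = NULL;
	td->sleeping = false;
}

/*
 * Hypothetical buggy path: something other than the sleep queue code
 * clears the sleeping state and touches nothing else.
 */
void
model_rogue_wakeup(struct thread *td)
{
	td->sleeping = false;
}
```

In this model, a thread that sleeps and is woken through
model_sleepq_remove_thread() can sleep again. A thread woken through
model_rogue_wakeup() keeps td_wchan and td_wmesg set, leaves its queue
object in the chain, and is rejected by the next model_sleepq_add(), which
corresponds to the state seen in the panic.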