Date: Thu, 6 Jan 2005 16:17:49 -0500
From: John Baldwin <jhb@FreeBSD.org>
To: freebsd-current@FreeBSD.org
Subject: Re: Assertion td->td_sleepqueue != NULL failed at kern/subr_sleepqueue.c:270
Message-ID: <200501061617.49967.jhb@FreeBSD.org>
In-Reply-To: <20050105122636.GA31684@peter.osted.lan>
References: <20050105122636.GA31684@peter.osted.lan>

On Wednesday 05 January 2005 07:26 am, Peter Holm wrote:
> With GENERIC HEAD from Dec 31 09:28 UTC + bmilekic@'s uma_core
> patch + alc's patch I got the following strange assert:
>
> panic(c0827c46,c082dd18,c082dc8d,10e,c08f4660) at panic+0x190
> sleepq_add(c08eec90,c08ee6e8,c082a9bf,1,c08ee6e8,0,c0827ca9,7d)
> at sleepq_add+0x156
> cv_wait(c08eec90,c08ee6e8,c151de30,0,ffffffff) at cv_wait+0x100
> _sx_xlock(c08eec60,c0828867,247,0,c151ddc8) at _sx_xlock+0x59
> kern_wait(c151e450,ffffffff,cbc67c90,0,0) at kern_wait+0x4b
> wait4(c151e450,cbc67d14,4,3f8,282) at wait4+0x29
> syscall(2f,2f,bfbf002f,2,0) at syscall+0x128
> Xint0x80_syscall() at Xint0x80_syscall+0x1f
> --- syscall (7, FreeBSD ELF32, wait4), eip = 0x805170b, esp =
> 0xbfbfedbc, ebp = 0xbfbfedd8 ---
>
> Looks like td->td_sleepqueue is NULL!
>
> Details at http://www.holm.cc/stress/log/cons100.html

This is a truly odd panic.  The basic theory of operation with sleep
queues is that every thread that is not already queued on a sleep queue
carries a sleep queue structure around, which it donates to a wait
channel when it blocks on that channel.  Once the thread is resumed, it
reclaims a sleep queue from the wait channel.  This resuming bit happens
in sleepq_remove_thread() in subr_sleepqueue.c.  As you can see, in
addition to assigning a sleep queue to the thread being removed from a
queue, it also clears td_wchan and td_wmesg.

The thread in question has both fields set (as if it were asleep on
"proctree", which is what it is trying to go back to sleep on now).
However, it is not on a sleep queue (td_slpq.tqe_next is NULL).  So it
seems that a thread was removed from the sleep queue and resumed (made
runnable) but sleepq_remove_thread() wasn't called.

Do you have any local patches that might affect this, btw?  I notice you
get a lot of trap 9's in your dmesg, which is somewhat unsettling.

-- 
John Baldwin <jhb@FreeBSD.org>  <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve"  =  http://www.FreeBSD.org
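
For readers following along, the donate/reclaim scheme described in this
message can be sketched as a small userland toy model.  This is only an
illustration under stated assumptions, not the code in
subr_sleepqueue.c: every toy_* name, the td_on_queue flag, and the
sq_blocked counter are made up for the sketch, and all locking is
omitted.  Only the field names td_sleepqueue, td_wchan and td_wmesg come
from the message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy structures; the real kernel layouts differ. */
struct toy_sleepqueue {
	struct toy_sleepqueue *sq_free;	/* spares donated by later sleepers */
	int sq_blocked;			/* threads blocked on this queue */
};

struct toy_thread {
	struct toy_sleepqueue *td_sleepqueue;	/* carried while not asleep */
	const void *td_wchan;			/* wait channel, NULL if awake */
	const char *td_wmesg;			/* wait message, NULL if awake */
	int td_on_queue;			/* assumed stand-in for td_slpq linkage */
};

struct toy_chan {
	struct toy_sleepqueue *ch_sq;	/* active queue, NULL if nobody sleeps */
};

/* Block on a channel: the thread donates the queue it carries. */
static void
toy_sleepq_add(struct toy_chan *ch, struct toy_thread *td, const char *wmesg)
{
	struct toy_sleepqueue *sq;

	/* Toy analogue of the assertion that fired in the report. */
	assert(td->td_sleepqueue != NULL);
	sq = td->td_sleepqueue;
	td->td_sleepqueue = NULL;
	if (ch->ch_sq == NULL) {
		/* First sleeper: its queue becomes the active queue. */
		sq->sq_free = NULL;
		sq->sq_blocked = 0;
		ch->ch_sq = sq;
	} else {
		/* Later sleepers: their queue goes on the spare list. */
		sq->sq_free = ch->ch_sq->sq_free;
		ch->ch_sq->sq_free = sq;
	}
	ch->ch_sq->sq_blocked++;
	td->td_wchan = ch;
	td->td_wmesg = wmesg;
	td->td_on_queue = 1;
}

/* Resume a thread: reclaim a queue and clear td_wchan and td_wmesg. */
static void
toy_sleepq_remove_thread(struct toy_chan *ch, struct toy_thread *td)
{
	struct toy_sleepqueue *active = ch->ch_sq;

	assert(active != NULL && td->td_on_queue);
	active->sq_blocked--;
	if (active->sq_free != NULL) {
		/* Other sleepers remain: take one of the spares. */
		td->td_sleepqueue = active->sq_free;
		active->sq_free = td->td_sleepqueue->sq_free;
		td->td_sleepqueue->sq_free = NULL;
	} else {
		/* Last sleeper: take the active queue itself. */
		assert(active->sq_blocked == 0);
		td->td_sleepqueue = active;
		ch->ch_sq = NULL;
	}
	td->td_on_queue = 0;
	td->td_wchan = NULL;
	td->td_wmesg = NULL;
}

/* Returns 1 if the thread is in one of the two legal states. */
static int
toy_thread_state_ok(const struct toy_thread *td)
{
	if (td->td_on_queue)
		return (td->td_sleepqueue == NULL && td->td_wchan != NULL);
	return (td->td_sleepqueue != NULL && td->td_wchan == NULL &&
	    td->td_wmesg == NULL);
}
```

In this model the state seen in the panic (td_wchan and td_wmesg still
set, no queue linkage, td_sleepqueue NULL) is exactly what
toy_thread_state_ok() rejects, and it can only arise if a thread is
taken off the queue without going through the remove path.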