Date: Thu, 6 Jan 2005 22:45:54 +0100
From: Peter Holm <peter@holm.cc>
To: John Baldwin <jhb@freebsd.org>
Cc: freebsd-current@freebsd.org
Subject: Re: Assertion td->td_sleepqueue != NULL failed at kern/subr_sleepqueue.c:270
Message-ID: <20050106214554.GA45533@peter.osted.lan>
In-Reply-To: <200501061617.49967.jhb@FreeBSD.org>
References: <20050105122636.GA31684@peter.osted.lan> <200501061617.49967.jhb@FreeBSD.org>

On Thu, Jan 06, 2005 at 04:17:49PM -0500, John Baldwin wrote:
> On Wednesday 05 January 2005 07:26 am, Peter Holm wrote:
> > With GENERIC HEAD from Dec 31 09:28 UTC + bmilekic@'s uma_core
> > patch + alc's patch I got the following strange assert:
> >
> > panic(c0827c46,c082dd18,c082dc8d,10e,c08f4660) at panic+0x190
> > sleepq_add(c08eec90,c08ee6e8,c082a9bf,1,c08ee6e8,0,c0827ca9,7d) at sleepq_add+0x156
> > cv_wait(c08eec90,c08ee6e8,c151de30,0,ffffffff) at cv_wait+0x100
> > _sx_xlock(c08eec60,c0828867,247,0,c151ddc8) at _sx_xlock+0x59
> > kern_wait(c151e450,ffffffff,cbc67c90,0,0) at kern_wait+0x4b
> > wait4(c151e450,cbc67d14,4,3f8,282) at wait4+0x29
> > syscall(2f,2f,bfbf002f,2,0) at syscall+0x128
> > Xint0x80_syscall() at Xint0x80_syscall+0x1f
> > --- syscall (7, FreeBSD ELF32, wait4), eip = 0x805170b, esp = 0xbfbfedbc, ebp = 0xbfbfedd8 ---
> >
> > Looks like td->td_sleepqueue is NULL!
> >
> > Details at http://www.holm.cc/stress/log/cons100.html
>
> This is a truly odd panic. The basic theory of operation with sleep queues is
> that every thread that is not already queued on a sleep queue carries a sleep
> queue structure around that it donates to a wait channel when it blocks on
> it. Once it is resumed, it reclaims a sleep queue from the wait channel.
> This resuming bit happens in sleepq_remove_thread() in subr_sleepqueue.c. As
> you can see, in addition to assigning a sleep queue to the thread being
> removed from a queue, it also clears td_wchan and td_wmesg. The thread in
> question has both fields set (as if it were asleep on "proctree", which is
> what it is trying to go back to sleep on now). However, it is not on a sleep
> queue (td_slpq.tqe_next is NULL). So it seems that a thread was
> removed from the sleep queue and resumed (made runnable) but
> sleepq_remove_thread() wasn't called. Do you have any local patches that
> might affect this btw? I notice you get a lot of trap 9's in your dmesg,
> which is somewhat unsettling.

These are the modifications:
http://www.holm.cc/stress/log/mods.html
Trap 9s are not uncommon with the test suite.

>
> --
> John Baldwin <jhb@FreeBSD.org>  <><  http://www.FreeBSD.org/~jhb/
> "Power Users Use the Power to Serve"  =  http://www.FreeBSD.org

--
Peter Holm
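
For readers following the thread, the donate/reclaim scheme John describes in the quoted text can be sketched in user-space C. This is a toy model under stated assumptions, not the code in kern/subr_sleepqueue.c: `struct waitchan`, `sq_free`, `sq_blocked` and the `toy_` functions are hypothetical names made up for illustration, and all locking is left out. Only `td_sleepqueue`, `td_wchan` and `td_wmesg` are field names taken from the message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct sleepqueue {
    int sq_blocked;              /* threads currently blocked (active queue only) */
    struct sleepqueue *sq_free;  /* spare queues donated by later sleepers */
};

struct waitchan {
    struct sleepqueue *wc_queue; /* active queue, NULL when nobody sleeps here */
};

struct thread {
    struct sleepqueue *td_sleepqueue; /* spare queue carried while awake */
    const void *td_wchan;             /* channel slept on, NULL while awake */
    const char *td_wmesg;             /* wait message, NULL while awake */
};

/*
 * Block td on wc: the thread donates the queue it carries.
 * Returns false where the real kernel would hit the
 * "td->td_sleepqueue != NULL" assertion from the subject line.
 */
bool toy_sleepq_add(struct waitchan *wc, struct thread *td, const char *wmesg)
{
    struct sleepqueue *sq = td->td_sleepqueue;

    if (sq == NULL)
        return false;
    td->td_sleepqueue = NULL;
    sq->sq_blocked = 0;
    if (wc->wc_queue == NULL) {
        /* First sleeper: its queue becomes the channel's active queue. */
        sq->sq_free = NULL;
        wc->wc_queue = sq;
    } else {
        /* Later sleepers: their queue is parked on the free list. */
        sq->sq_free = wc->wc_queue->sq_free;
        wc->wc_queue->sq_free = sq;
    }
    wc->wc_queue->sq_blocked++;
    td->td_wchan = wc;
    td->td_wmesg = wmesg;
    return true;
}

/* Resume td: reclaim some queue from the channel and clear the wait fields. */
void toy_sleepq_remove_thread(struct waitchan *wc, struct thread *td)
{
    struct sleepqueue *head = wc->wc_queue;

    assert(head != NULL && head->sq_blocked > 0);
    head->sq_blocked--;
    if (head->sq_free != NULL) {
        /* Take a spare queue; it need not be the one this thread donated. */
        td->td_sleepqueue = head->sq_free;
        head->sq_free = head->sq_free->sq_free;
    } else {
        /* Last sleeper takes the active queue itself. */
        td->td_sleepqueue = head;
        wc->wc_queue = NULL;
    }
    td->td_sleepqueue->sq_free = NULL;
    td->td_wchan = NULL;
    td->td_wmesg = NULL;
}
```

In this model, a thread that is made runnable without going through `toy_sleepq_remove_thread()` keeps `td_wchan` and `td_wmesg` set while `td_sleepqueue` stays NULL, so its next `toy_sleepq_add()` fails. That matches the state John infers from the dump, though the sketch says nothing about how the real kernel got there.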