Date: Tue, 23 Nov 2004 16:35:40 -0500
From: John Baldwin <jhb@FreeBSD.org>
To: Peter Holm <peter@holm.cc>
Cc: phk@FreeBSD.org
Subject: Re: panic: sleeping thread owns a non-sleepable lock
Message-ID: <200411231635.40567.jhb@FreeBSD.org>
In-Reply-To: <20041123204635.GA42682@peter.osted.lan>
References: <20041122143804.GA36649@peter.osted.lan> <200411231136.49362.jhb@FreeBSD.org> <20041123204635.GA42682@peter.osted.lan>

On Tuesday 23 November 2004 03:46 pm, Peter Holm wrote:
> On Tue, Nov 23, 2004 at 11:36:49AM -0500, John Baldwin wrote:
> > On Monday 22 November 2004 08:13 pm, Peter Holm wrote:
> > > On Mon, Nov 22, 2004 at 04:57:36PM -0500, John Baldwin wrote:
> > > > On Monday 22 November 2004 09:38 am, Peter Holm wrote:
> > > > > During stress test with GENERIC HEAD from Nov 20 08:40 UTC I got:
> > > > >
> > > > > Sleeping on "fdesc" with the following non-sleepable locks held:
> > > > > exclusive sleep mutex fdesc r = 0 (0xc08d15a0) locked @
> > > > > kern/kern_descrip.c:2425 and then
> > > > > panic: sleeping thread (pid 92279) owns a non-sleepable lock
> > > > >
> > > > > http://www.holm.cc/stress/log/cons89.html
> > > >
> > > > Yes, the panic is a result of the earlier warning. Poul-Henning
> > > > touched this code last, so it is probably something for him to look
> > > > at. I'm unsure how msleep() is getting called, however. The
> > > > turnstile panic is not important, can you find the thread that went
> > > > to sleep (should be pid 92279) and get stack trace for that?
> > >
> > > The ddb trace is in the log, just before call doadump. Let me know if
> > > you need any gdb output.
> >
> > Ok, can you use gdb to get the source/file of 'sysctl_kern_file+0x1ae'?
>
> I've updated to HEAD from Nov 23 08:05 UTC , but was lucky to get the same
> panic again :-) http://www.holm.cc/stress/log/cons90.html
>
> (kgdb) l *sysctl_kern_file+0x1ae
> 0xc05f3526 is in sysctl_kern_file (../../../kern/kern_descrip.c:2427).
> 2422            mtx_lock(&fdesc_mtx);
> 2423            if ((fdp = p->p_fd) == NULL) {
> 2424                    mtx_unlock(&fdesc_mtx);
> 2425                    continue;
> 2426            }
> 2427            FILEDESC_LOCK(fdp);
> 2428            for (n = 0; n < fdp->fd_nfiles; ++n) {
> 2429                    if ((fp = fdp->fd_ofiles[n]) == NULL)
> 2430                            continue;
> 2431                    xf.xf_fd = n;

Oh, this is because of phk's home rolled msleep locks. Hmm, the basic problem here is that somehow he needs to drop the fdesc_mtx lock after locking the internal mutex but before doing the sleep.
Also, he will need to add a reference count (in case the fdp goes away while he is waiting for the xlock), and bump it before going to sleep and drop it after doing the SYSCTL_OUT(). Kind of like:

	lock(&fdesc_mtx);
	fdp = p->p_fd;
	FILEDESC_LOCK_SMALL(fdp);
	unlock(&fdesc_mtx);
	filedesc_hold(fdp);
	FILEDESC_LOCK_BIG(fdp);
	...
	SYSCTL_OUT();
	filedesc_free(fdp);

On the other hand, since the SYSCTL_OUT() can't block here, he probably just needs to use the FILEDESC_LOCK_FAST() variants that just lock the mutex instead of using the full-blown sleep lock.

-- 
John Baldwin <jhb@FreeBSD.org>  <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve"  =  http://www.FreeBSD.org
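
The lock ordering in the pseudocode above can be sketched in userspace C with POSIX threads. This is only an illustration under stated assumptions, not the kernel code: `struct fdtable`, `struct owner`, `table_list_mtx`, `fdtable_new()`, `fdtable_release()`, `owner_detach()` and `owner_count_files()` are all hypothetical names, `table_list_mtx` plays the role of `fdesc_mtx`, the `small` mutex plays the role of the filedesc's internal mutex, and the `big` mutex stands in for the lock that may sleep.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for a file descriptor table. */
struct fdtable {
	pthread_mutex_t small;	/* short-term mutex; protects refs */
	pthread_mutex_t big;	/* stands in for the lock that may sleep */
	int refs;		/* reference count, protected by small */
	int nfiles;		/* payload, read under big */
};

/* Hypothetical stand-in for a process; fd corresponds to p->p_fd. */
struct owner {
	struct fdtable *fd;
};

/* Stand-in for fdesc_mtx: protects every owner's fd pointer. */
static pthread_mutex_t table_list_mtx = PTHREAD_MUTEX_INITIALIZER;

/* Allocate a table holding one reference on behalf of its owner. */
struct fdtable *fdtable_new(int nfiles)
{
	struct fdtable *t = malloc(sizeof(*t));

	if (t == NULL)
		return NULL;
	pthread_mutex_init(&t->small, NULL);
	pthread_mutex_init(&t->big, NULL);
	t->refs = 1;
	t->nfiles = nfiles;
	return t;
}

/* Drop one reference; the last reference frees the table. */
void fdtable_release(struct fdtable *t)
{
	int last;

	pthread_mutex_lock(&t->small);
	assert(t->refs > 0);
	last = (--t->refs == 0);
	pthread_mutex_unlock(&t->small);
	if (last) {
		pthread_mutex_destroy(&t->big);
		pthread_mutex_destroy(&t->small);
		free(t);
	}
}

/* The owner gives up its table, as when a process exits. */
void owner_detach(struct owner *o)
{
	struct fdtable *t;

	pthread_mutex_lock(&table_list_mtx);
	t = o->fd;
	o->fd = NULL;
	pthread_mutex_unlock(&table_list_mtx);
	if (t != NULL)
		fdtable_release(t);
}

/* Return the owner's nfiles, or -1 if it has no table. */
int owner_count_files(struct owner *o)
{
	struct fdtable *t;
	int n;

	pthread_mutex_lock(&table_list_mtx);
	t = o->fd;
	if (t == NULL) {
		pthread_mutex_unlock(&table_list_mtx);
		return -1;
	}
	pthread_mutex_lock(&t->small);
	/* Drop the global mutex before anything that may block. */
	pthread_mutex_unlock(&table_list_mtx);
	t->refs++;		/* hold: t cannot be freed while we wait */
	pthread_mutex_unlock(&t->small);

	pthread_mutex_lock(&t->big);	/* may block; no other lock is held */
	n = t->nfiles;
	pthread_mutex_unlock(&t->big);

	fdtable_release(t);	/* drop the hold taken above */
	return n;
}
```

The point of the ordering is that the global mutex is released before the potentially blocking acquisition, and the extra reference keeps the table alive even if `owner_detach()` runs while the reader is waiting.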