From: John Baldwin
To: Peter Holm
Cc: current@FreeBSD.org, phk@FreeBSD.org
Date: Tue, 23 Nov 2004 16:35:40 -0500
Subject: Re: panic: sleeping thread owns a non-sleepable lock
Message-Id: <200411231635.40567.jhb@FreeBSD.org>
In-Reply-To: <20041123204635.GA42682@peter.osted.lan>
References: <20041122143804.GA36649@peter.osted.lan> <200411231136.49362.jhb@FreeBSD.org> <20041123204635.GA42682@peter.osted.lan>
List-Id: Discussions about the use of FreeBSD-current

On Tuesday 23 November 2004 03:46 pm, Peter Holm wrote:
> On Tue, Nov 23, 2004 at 11:36:49AM -0500, John Baldwin wrote:
> > On Monday 22 November 2004 08:13 pm, Peter Holm wrote:
> > > On Mon, Nov 22, 2004 at 04:57:36PM -0500, John Baldwin wrote:
> > > > On Monday 22 November 2004 09:38 am, Peter Holm wrote:
> > > > > During stress test with GENERIC HEAD from Nov 20 08:40 UTC I got:
> > > > >
> > > > > Sleeping on "fdesc" with the following non-sleepable locks held:
> > > > > exclusive sleep mutex fdesc r = 0 (0xc08d15a0) locked @
> > > > > kern/kern_descrip.c:2425 and then
> > > > > panic: sleeping thread (pid 92279) owns a non-sleepable lock
> > > > >
> > > > > http://www.holm.cc/stress/log/cons89.html
> > > >
> > > > Yes, the panic is a result of the earlier warning.  Poul-Henning
> > > > touched this code last, so it is probably something for him to look
> > > > at.  I'm unsure how msleep() is getting called, however.  The
> > > > turnstile panic is not important, can you find the thread that went
> > > > to sleep (should be pid 92279) and get stack trace for that?
> > >
> > > The ddb trace is in the log, just before call doadump. Let me know if
> > > you need any gdb output.
> >
> > Ok, can you use gdb to get the source/file of 'sysctl_kern_file+0x1ae'?
>
> I've updated to HEAD from Nov 23 08:05 UTC , but was lucky to get the same
> panic again :-) http://www.holm.cc/stress/log/cons90.html
>
> (kgdb) l *sysctl_kern_file+0x1ae
> 0xc05f3526 is in sysctl_kern_file (../../../kern/kern_descrip.c:2427).
> 2422                    mtx_lock(&fdesc_mtx);
> 2423                    if ((fdp = p->p_fd) == NULL) {
> 2424                            mtx_unlock(&fdesc_mtx);
> 2425                            continue;
> 2426                    }
> 2427                    FILEDESC_LOCK(fdp);
> 2428                    for (n = 0; n < fdp->fd_nfiles; ++n) {
> 2429                            if ((fp = fdp->fd_ofiles[n]) == NULL)
> 2430                                    continue;
> 2431                            xf.xf_fd = n;

Oh, this is because of phk's home rolled msleep locks.  Hmm, the basic
problem here is that somehow he needs to drop the fdesc_mtx lock after
locking the internal mutex but before doing the sleep.
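
To make that ordering concrete, here is a rough userland sketch using
POSIX threads rather than the kernel's mtx/msleep primitives.  It is only
an analogy and not the actual kern_descrip.c code: struct sleep_lock,
table_mtx, sleep_lock_acquire_handoff() and sleep_lock_release() are
hypothetical names invented for illustration, with table_mtx standing in
for fdesc_mtx and the mutex inside struct sleep_lock standing in for the
internal mutex of the home rolled lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a hand-rolled sleep lock (not FreeBSD code). */
struct sleep_lock {
	pthread_mutex_t	mtx;	/* internal mutex protecting 'held' */
	pthread_cond_t	cv;	/* waiters sleep here */
	bool		held;	/* true while someone owns the big lock */
};

/* Hypothetical outer mutex; plays the role of fdesc_mtx in this sketch. */
static pthread_mutex_t table_mtx = PTHREAD_MUTEX_INITIALIZER;

/*
 * Acquire 'sl'.  The caller enters with table_mtx held; it is dropped
 * after the internal mutex is taken and before any possible sleep.
 */
static void
sleep_lock_acquire_handoff(struct sleep_lock *sl)
{
	pthread_mutex_lock(&sl->mtx);		/* 1. lock the internal mutex */
	pthread_mutex_unlock(&table_mtx);	/* 2. drop the outer mutex */
	while (sl->held)			/* 3. only now may we sleep */
		pthread_cond_wait(&sl->cv, &sl->mtx);
	sl->held = true;
	pthread_mutex_unlock(&sl->mtx);
}

/* Release 'sl' and wake one waiter, if any. */
static void
sleep_lock_release(struct sleep_lock *sl)
{
	pthread_mutex_lock(&sl->mtx);
	assert(sl->held);
	sl->held = false;
	pthread_cond_signal(&sl->cv);
	pthread_mutex_unlock(&sl->mtx);
}
```

The point of the sketch is only the order of steps 1 to 3: the sleeping
wait happens with nothing but the lock's own internal mutex held, which
the wait itself releases.  It does nothing to keep the locked object
alive across the sleep.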
Also, he will need to add a reference count (in case the fdp goes away
while he is waiting for the xlock), and bump it before going to sleep and
drop it after doing the SYSCTL_OUT().  Kind of like:

	lock(&fdesc_mtx);
	fdp = p->p_fd;
	FILEDESC_LOCK_SMALL(fdp);
	unlock(&fdesc_mtx);
	filedesc_hold(fdp);
	FILEDESC_LOCK_BIG(fdp);
	...
	SYSCTL_OUT();
	filedesc_free(fdp);

On the other hand, since the SYSCTL_OUT() can't block here, he probably
just needs to use the FILEDESC_LOCK_FAST() variants that just lock the
mutex instead of using the full-blown sleep lock.

-- 
John Baldwin <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve"  =  http://www.FreeBSD.org