Date: Tue, 15 Aug 2006 19:03:42 +0100 (BST)
From: Robert Watson
To: Peter Holm
Cc: current@FreeBSD.org
Subject: Re: panic: mutex nfsd_mtx not owned at nfs_srvsock.c:148
Message-ID: <20060815190221.H45647@fledge.watson.org>
In-Reply-To: <20060815174940.GA18388@peter.osted.lan>
References: <20060815154836.GA10128@peter.osted.lan> <20060815165955.J45647@fledge.watson.org> <20060815174940.GA18388@peter.osted.lan>

On Tue, 15 Aug 2006, Peter Holm wrote:

> On Tue, Aug 15, 2006 at 05:01:20PM +0100, Robert Watson wrote:
>>
>> On Tue, 15 Aug 2006, Peter Holm wrote:
>>
>>> While stress testing GENERIC HEAD from Aug 12 12:55 UTC I got this
>>> panic:
>>>
>>> panic: mutex nfsd_mtx not owned at
>>> ../../../nfsserver/nfs_srvsock.c:148
>>> cpuid = 2
>>> KDB: enter: panic
>>> [thread pid 761 tid 100096 ]
>>> Stopped at kdb_enter+0x2b: nop
>>> db> where
>>> Tracing pid 761 tid 100096 td 0xc4041a20
>>> kdb_enter(c091cda8) at kdb_enter+0x2b
>>> panic(c091c0b7,c09210c9,c093241d,94,0,...) at panic+0x14b
>>> _mtx_assert(c0a64ec0,1,c093241d,94,c07ec53c,...) at _mtx_assert+0x66
>>> nfs_rephead(0,c52a0600,48,e662e964,e662e968,...) at nfs_rephead+0x25
>>> nfsrv_symlink(c52a0600,c4071e00,c4041a20,e662ec40) at
>>> nfsrv_symlink+0x3b7
>>> nfssvc_nfsd(c4041a20) at nfssvc_nfsd+0x409
>>> nfssvc(c4041a20,e662ed04) at nfssvc+0x18c
>>> syscall(3b,3b,3b,1,0,...) at syscall+0x256
>>>
>>> More details @ http://people.freebsd.org/~pho/stress/log/cons204.html
>>
>> Could you use gdb to generate frame debugging information for the frame
>> above nfs_rephead() (nfsrv_symlink()) also, please? I'm a bit puzzled as
>> to how things got into this state, as under normal circumstances,
>> nfsm_reply() is the source of the nfs_rephead() call, and the NFS mutex is
>> acquired the line before the call to nfsm_reply().
>
> cons204.html has been updated with info from frame 12.

Ah, all makes sense now. I didn't realize that nfsm_srvpathsiz() was a route
to nfsm_reply(). I'll investigate how best to fix this this evening. Similar
bugs may exist elsewhere in the NFS server, and will presumably turn up during
mbuf starvation.

Thanks!

Robert N M Watson
Computer Laboratory
University of Cambridge
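
For readers following the thread, here is a rough userland sketch of the bug pattern described above: a callee asserts that a lock is owned, the normal path takes the lock on the line before the reply, and a helper's error path reaches the reply without it. This is not the FreeBSD NFS server code. Every toy_* name is hypothetical, the lock is a plain flag rather than a real mutex, the assertion counts violations instead of panicking, and the "fixed" variant is only one plausible shape for a fix, not necessarily the change that was committed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Toy stand-in for a kernel mutex: it only records whether it is held. */
struct toy_mtx {
    bool held;
};

static struct toy_mtx toy_nfsd_mtx = { false };

static void toy_lock(struct toy_mtx *m)   { assert(!m->held); m->held = true; }
static void toy_unlock(struct toy_mtx *m) { assert(m->held);  m->held = false; }

/* Counts ownership violations instead of panicking, so the sketch runs on. */
static int toy_violations = 0;

static void toy_assert_owned(const struct toy_mtx *m, const char *where)
{
    if (!m->held) {
        toy_violations++;
        fprintf(stderr, "mutex not owned at %s\n", where);
    }
}

/* Plays the role of nfs_rephead(): requires the lock to be held. */
static void toy_rephead(void)
{
    toy_assert_owned(&toy_nfsd_mtx, "toy_rephead");
}

/* Plays the role of nfsm_reply(): expects its caller to hold the lock. */
static void toy_reply(void)
{
    toy_rephead();
}

/* Normal path: the lock is acquired on the line before the reply. */
static void toy_handler_ok(void)
{
    toy_lock(&toy_nfsd_mtx);
    toy_reply();
    toy_unlock(&toy_nfsd_mtx);
}

/* Hidden route (the role nfsm_srvpathsiz() plays in the thread): on
 * failure it replies directly, without taking the lock first. */
static int toy_check_buggy(bool failed)
{
    if (failed) {
        toy_reply();            /* lock not held here: assertion trips */
        return -1;
    }
    return 0;
}

/* One possible repair (an assumption): take the lock on the error path. */
static int toy_check_fixed(bool failed)
{
    if (failed) {
        toy_lock(&toy_nfsd_mtx);
        toy_reply();
        toy_unlock(&toy_nfsd_mtx);
        return -1;
    }
    return 0;
}
```

Calling toy_check_buggy(true) is the only path here that bumps toy_violations, which mirrors why the panic only shows up when the rarely taken error path runs.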