From: Christian Kratzer
Date: Mon, 12 Oct 2015 15:55:53 +0200 (CEST)
Reply-To: Christian Kratzer
To: Rick Macklem
cc: freebsd-stable@freebsd.org, John Baldwin
Subject: Re: smbfs crashes since approx. 10.1-RELEASE
In-Reply-To: <2135054744.32546564.1444653156980.JavaMail.zimbra@uoguelph.ca>
References: <2148690.gx9M0ZzrG1@ralph.baldwin.cx> <3563189.eDHDcCgW5L@ralph.baldwin.cx> <358885214.31305796.1444518367048.JavaMail.zimbra@uoguelph.ca> <2135054744.32546564.1444653156980.JavaMail.zimbra@uoguelph.ca>

Hi Rick,

On Mon, 12 Oct 2015, Rick Macklem wrote:
> Christian Kratzer wrote:
>> Hi Rick,
>>
>> there was also a second more recent crash in /var/crash
>>
>> Mon Oct 12 03:01:16 CEST 2015
>>
>> FreeBSD noc3.cksoft.de 10.2-STABLE FreeBSD 10.2-STABLE #2 r288980M: Sun
>> Oct 11 08:37:40 CEST 2015
>> ck@noc3.cksoft.de:/usr/obj/usr/src/sys/NOC amd64
>>
>> panic: Assertion mtx_unowned(m) failed at
>> /usr/src/sys/kern/kern_mutex.c:955
>>
> Oops, I screwed up. I should have looked at this panic assertion when you reported
> it before. Ok, so if I understand the assertion correctly, it means that another
> thread has the mutex locked. If this is correct, I'll have to take another look at
> the code and figure out how to wait for these other threads to finish with the mutexes.
>
> I do think the patch fixes the race I saw, but there must be other races in the code.
>
> I'll take another look, but if anyone else is conversant with netsmb, feel free to
> jump in, because it is all new to me.
>
> Unfortunately, I won't have any way to do testing for the next month or so, so any
> patches I do come up with will be "try this untested..".

That's no problem.
Just keep the patches coming when you have time, and tell me when to reset
back to stable, current or whatever so we don't lose track of the status.

Since it looks like the race happens on unmount, I could try putting a
"sleep 60" into the script that does the "mount && rsync && umount" magic,
just before the umount. That would give anything that is slow to go away a
chance to release the mutexes before the umount.

Not a real fix of course, but it might help to verify what's going on.

Greetings
Christian

-- 
Christian Kratzer                   CK Software GmbH
Email:   ck@cksoft.de               Wildberger Weg 24/2
Phone:   +49 7032 893 997 - 0       D-71126 Gaeufelden
Fax:     +49 7032 893 997 - 9       HRB 245288, Amtsgericht Stuttgart
Mobile:  +49 171 1947 843           Geschaeftsfuehrer: Christian Kratzer
Web:     http://www.cksoft.de/
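
P.S. For reference, the delay-before-umount idea from this mail could be sketched roughly as follows. This is only an illustration, not the actual script: the function name run_backup, the mount point, the rsync paths and the "-a" option are all assumptions, and "mount <mount_point>" presumes a matching fstab entry. The run and sleep parameters exist only so the sequence can be exercised without touching a real mount.

```python
import subprocess
import time


def run_backup(mount_point, source, dest, delay=60,
               run=subprocess.check_call, sleep=time.sleep):
    """Mount, rsync, pause, then unmount (hypothetical helper).

    Each step uses check_call by default, which raises on a non-zero
    exit status, so a failing step stops the chain much like "&&".
    """
    run(["mount", mount_point])
    run(["rsync", "-a", source, dest])
    # Workaround only: give anything that is slow to go away a chance
    # to release its mutexes before the unmount is attempted.
    sleep(delay)
    run(["umount", mount_point])
```

If the panic still shows up with the delay in place, that would suggest the race is not just a matter of slow threads at unmount time.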