Date: Sun, 13 May 2012 00:49:38 +0200
From: Mateusz Guzik
To: Peter Holm
Message-ID: <20120512224938.GA1322@dft-labs.eu>
References: <4FA6F324.4080107@FreeBSD.org> <4FA82269.6080406@FreeBSD.org>
 <20120507201153.GA19942@dft-labs.eu> <20120508194514.GA10688@x2.osted.lan>
 <20120510102118.GA26472@dft-labs.eu> <20120510103900.GA77554@x2.osted.lan>
In-Reply-To: <20120510103900.GA77554@x2.osted.lan>
Cc: Doug Barton, Sergey Kandaurov, freebsd-current, mckusick@freebsd.org
Subject: Re: panic, seems related to r234386

On Thu, May 10, 2012 at 12:39:00PM +0200, Peter Holm wrote:
> On Thu, May 10, 2012 at 12:21:18PM +0200, Mateusz Guzik wrote:
> > On Tue, May 08, 2012 at 09:45:14PM +0200, Peter Holm wrote:
> > > On Mon, May 07, 2012 at 10:11:53PM +0200, Mateusz Guzik wrote:
> > > > On Mon, May 07, 2012 at 12:28:41PM -0700, Doug Barton wrote:
> > > > > On 05/06/2012 15:19, Sergey Kandaurov wrote:
> > > > > > On 7 May 2012 01:54, Doug Barton wrote:
> > > > > >> I got this with today's current, previous (working) kernel is r232719.
> > > > > >>
> > > > > >> panic: _mtx_lock_sleep: recursed on non-recursive mutex struct mount mtx
> > > > > >> @ /frontier/svn/head/sys/kern/vfs_subr.c:4595
> > > > >
> > > > > ...
> > > > >
> > > > > > Please try this patch.
> > > > > >
> > > > > > Index: fs/ext2fs/ext2_vfsops.c
> > > > > > ===================================================================
> > > > > > --- fs/ext2fs/ext2_vfsops.c (revision 235108)
> > > > > > +++ fs/ext2fs/ext2_vfsops.c (working copy)
> > > > > > @@ -830,7 +830,6 @@
> > > > > >  	/*
> > > > > >  	 * Write back each (modified) inode.
> > > > > >  	 */
> > > > > > -	MNT_ILOCK(mp);
> > > > > >  loop:
> > > > > >  	MNT_VNODE_FOREACH_ALL(vp, mp, mvp) {
> > > > > >  		if (vp->v_type == VNON) {
> > > > > >
> > > > >
> > > > > Didn't help, sorry. I put 234385 through some pretty heavy load
> > > > > yesterday, and everything was fine. As soon as I move up to 234386, the
> > > > > panic triggered again. So I cleaned everything up, applied your patch,
> > > > > built a kernel from scratch, and rebooted. It was Ok for a few seconds
> > > > > after boot, then panic'ed again, I think in a different place, but I'm
> > > > > not sure because subsequent attempts to fsck the file systems caused new
> > > > > panics which overwrote the old ones before they could be saved.
> > > > >
> > > >
> > > > Another MNT_ILOCK was hiding few lines below, try this patch:
> > > >
> > > > http://student.agh.edu.pl/~mjguzik/patches/ext2fs-ilock.patch
> > > >
> > > > I've tested this a bit and I believe this fixes your problem.
> > > >
> > >
> > > Gave this a spin and found what looks like a deadlock:
> > >
> > > http://people.freebsd.org/~pho/stress/log/ext2fs.txt
> > >
> > > Not a new problem, it would seem. Same issue with 8.3-PRERELEASE r232656M.
> > >
> >
> > pid 2680 (fts) holds lock for vnode cb4be414 and tries to lock cc0ac15c
> > pid 2581 (openat) holds lock for vnode cc0ac15c and tries to lock cb4be414
> >
> > openat calls rmdir foo/bar and ext2_rmdir unlocks and tries to lock
> > again foo's vnode.
> >
> > This is fairly easly reproducible with concurrently running mkdir and fts
> > testcase programs that are provided by stress2.
> >
> > I'll try to come up with a patch by the end of the week.
> >
>

An easier way to reproduce: run mkdir from stress2 and
"while true; do find /mnt > /dev/null; done" in another terminal.

Assuming a foo/bar directory tree, the deadlock happens when bar is being
removed while .. is simultaneously looked up in bar.

Proposed trivial patch:
http://student.agh.edu.pl/~mjguzik/patches/ext2fs_rmdir-deadlock.patch

If the lock on foo's vnode cannot be acquired immediately, the patch
unlocks bar's vnode and then locks both vnodes in order.

After patching this I ran into another issue: wrong vnode type panics from
cache_enter_time, called from ext2_lookup. (It takes some time to
reproduce; the testcase is the same as before.)

It looks like ext2_lookup is actually an adapted version of ufs_lookup and
lacks some bugfixes present in the current ufs_lookup. I believe those
bugfixes address this bug.

Here is my attempt to fix the problem (based on ufs_lookup changes):
http://student.agh.edu.pl/~mjguzik/patches/ext2fs_lookup-relookup.patch

-- 
Mateusz Guzik
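
For readers who do not want to open the rmdir patch, the locking idea
described above (try the other lock without sleeping; on failure drop the
lock already held and take both in order) can be sketched in userland C
with pthreads. This is only an illustration under stated assumptions, not
the patch itself: struct node and lock_parent_holding_child() are
hypothetical names, plain mutexes stand in for vnode locks, and
"in order" is assumed to mean parent (foo) before child (bar).

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for a vnode; only its lock matters here. */
struct node {
	pthread_mutex_t lock;
};

/*
 * Entered with child->lock held and parent->lock not held.
 * Returns with both locks held.
 *
 * Returns 0 if the parent lock was taken without sleeping, or 1 if the
 * child lock had to be dropped and retaken. In the second case another
 * thread may have changed the child in between, so the caller should
 * revalidate whatever it checked earlier.
 */
int
lock_parent_holding_child(struct node *parent, struct node *child)
{
	assert(parent != child);

	/* Fast path: nobody holds the parent, so no ordering problem. */
	if (pthread_mutex_trylock(&parent->lock) == 0)
		return (0);

	/*
	 * Sleeping on the parent while holding the child could deadlock
	 * against a thread that holds the parent and wants the child.
	 * Back off and take both in the assumed parent-then-child order.
	 */
	pthread_mutex_unlock(&child->lock);
	pthread_mutex_lock(&parent->lock);
	pthread_mutex_lock(&child->lock);
	return (1);
}
```

A caller holding bar's lock would call
lock_parent_holding_child(&foo, &bar) and, on a return value of 1,
re-check that bar is still the directory it intends to remove before
continuing. Whether the actual patch performs such a re-check is not
stated in this message.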