From: Garrett Cooper
To: Kostik Belousov
Cc: Kirk McKusick, FreeBSD Current
Date: Wed, 4 May 2011 09:41:56 -0700
In-Reply-To: <20110504090718.GN48734@deviant.kiev.zoral.com.ua>
References: <201105040559.p445xEJ5024585@chez.mckusick.com> <20110504090718.GN48734@deviant.kiev.zoral.com.ua>
Subject:
Re: Nasty non-recursive lockmgr panic on softdep only enabled UFS partition when filesystem full

2011/5/4 Kostik Belousov:
> On Tue, May 03, 2011 at 11:58:49PM -0700, Garrett Cooper wrote:
>> On Tue, May 3, 2011 at 11:42 PM, Garrett Cooper wrote:
>> > On Tue, May 3, 2011 at 10:59 PM, Kirk McKusick wrote:
>> >>> Date: Tue, 3 May 2011 22:40:26 -0700
>> >>> Subject: Nasty non-recursive lockmgr panic on softdep only enabled UFS
>> >>>  partition when filesystem full
>> >>> From: Garrett Cooper
>> >>> To: Jeff Roberson,
>> >>>         Marshall Kirk McKusick
>> >>> Cc: FreeBSD Current
>> >>>
>> >>> Hi Jeff and Dr. McKusick,
>> >>>     Ran into this panic when /usr ran out of space doing a make
>> >>> universe on amd64/r221219 (it took ~15 minutes for the panic to occur
>> >>> after the filesystem ran out of space -- wasn't quite sure what it was
>> >>> doing at the time):
>> >>>
>> >>> ...
>> >>>
>> >>>     Let me know what other commands you would like for me to run in kgdb.
>> >>> Thanks,
>> >>> -Garrett
>> >>
>> >> You did not indicate whether you are running an 8.X system or a 9-current
>> >> system. It would be helpful to know that.
>> >
>> > I've actually been running CURRENT for a few years now, but you're right --
>> > I didn't mention that part.
>> >
>> >> Jeff thinks that there may be a potential race in the locking code for
>> >> softdep_request_cleanup.
>> >> If so, this patch for 9-current should fix it:
>> >>
>> >> Index: ffs_softdep.c
>> >> ===================================================================
>> >> --- ffs_softdep.c       (revision 221385)
>> >> +++ ffs_softdep.c       (working copy)
>> >> @@ -11380,7 +11380,8 @@
>> >>                                 continue;
>> >>                         }
>> >>                         MNT_IUNLOCK(mp);
>> >> -                       if (vget(lvp, LK_EXCLUSIVE | LK_INTERLOCK, curthread)) {
>> >> +                       if (vget(lvp, LK_EXCLUSIVE | LK_NOWAIT | LK_INTERLOCK,
>> >> +                           curthread)) {
>> >>                                 MNT_ILOCK(mp);
>> >>                                 continue;
>> >>                         }
>> >>
>> >> If you are running an 8.X system, hopefully you will be able to apply it.
>> >
>> >    I've applied it, rebuilt and installed the kernel, and am trying to
>> > repro the case again. Will let you know how things go!
>>
>>     Happened again with the change. It's really easy to repro:
>>
>> 1. Get a filesystem with UFS+SU
>> 2. Execute something that does a large number of small writes to a partition.
>> 3. 'dd if=/dev/zero of=FOO bs=10m' on the same partition
>>
>>     The kernel will panic with the issue I discussed above.
>> Thanks!
>
> Jeff's change is required to avoid LORs, but it is not sufficient to
> prevent recursion. We must skip the vnode supplied as a parameter to
> softdep_request_cleanup().
> Theoretically, other vnodes might also be
> locked by curthread, thus I think the change below is needed. Try this.
>
> diff --git a/sys/ufs/ffs/ffs_softdep.c b/sys/ufs/ffs/ffs_softdep.c
> index a6d4441..25fa5d6 100644
> --- a/sys/ufs/ffs/ffs_softdep.c
> +++ b/sys/ufs/ffs/ffs_softdep.c
> @@ -11380,7 +11380,9 @@ retry:
>                                 continue;
>                         }
>                         MNT_IUNLOCK(mp);
> -                       if (vget(lvp, LK_EXCLUSIVE | LK_INTERLOCK, curthread)) {
> +                       if (VOP_ISLOCKED(lvp) ||
> +                           vget(lvp, LK_EXCLUSIVE | LK_INTERLOCK | LK_NOWAIT,
> +                           curthread)) {
>                                 MNT_ILOCK(mp);
>                                 continue;
>                         }

Ok. I'll let the make universe I have going run to completion, and
once I get back home later on, I'll take a look at repro'ing this
again with the above patch applied.
Thanks!
-Garrett
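
For anyone following along, the "skip anything already locked instead of
waiting on it" idea in the patch above can be sketched in user space with
POSIX mutexes. This is only a rough analogy under stated assumptions, not
the kernel code: struct node, node_init() and cleanup_scan() are made-up
names, and an error-checking pthread mutex stands in for a non-recursive
vnode lock. One difference to keep in mind: pthread_mutex_trylock() returns
EBUSY even when the caller itself holds the lock, whereas in this thread
LK_NOWAIT alone was reported not to be enough, hence the extra
VOP_ISLOCKED() check in the kernel patch.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a vnode: nothing but a non-recursive lock. */
struct node {
    pthread_mutex_t lock;
};

/*
 * Initialise the lock as PTHREAD_MUTEX_ERRORCHECK so that relocking by
 * the owning thread fails with EDEADLK instead of hanging.
 */
static int node_init(struct node *n)
{
    pthread_mutexattr_t attr;
    int error;

    error = pthread_mutexattr_init(&attr);
    if (error != 0)
        return error;
    error = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    if (error == 0)
        error = pthread_mutex_init(&n->lock, &attr);
    pthread_mutexattr_destroy(&attr);
    return error;
}

/*
 * Walk every node and skip any whose lock cannot be taken without
 * waiting, including locks the calling thread already holds.
 * Returns the number of nodes that were locked and processed.
 */
static size_t cleanup_scan(struct node *nodes, size_t count)
{
    size_t processed = 0;

    for (size_t i = 0; i < count; i++) {
        if (pthread_mutex_trylock(&nodes[i].lock) != 0)
            continue;               /* busy: skip, never block */
        /* ... per-node cleanup work would go here ... */
        processed++;
        int error = pthread_mutex_unlock(&nodes[i].lock);
        assert(error == 0);
        (void)error;
    }
    return processed;
}
```

If the caller already holds one of three nodes, cleanup_scan() processes
the other two and leaves the held one alone, while a blocking
pthread_mutex_lock() on the held node would fail with EDEADLK.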