From: Garrett Cooper
Date: Thu, 5 May 2011 10:23:47 -0700
To: Kostik Belousov
Cc: Kirk McKusick, FreeBSD Current
Subject: Re: Nasty non-recursive lockmgr panic on softdep only enabled UFS partition when filesystem full
Message-Id: <9E4C162F-B4EA-4378-A010-3E8D0D23EA93@gmail.com>
In-Reply-To: <20110504090718.GN48734@deviant.kiev.zoral.com.ua>
References: <201105040559.p445xEJ5024585@chez.mckusick.com> <20110504090718.GN48734@deviant.kiev.zoral.com.ua>
List-Id: Discussions about the use of FreeBSD-current

On May 4, 2011, at 2:07 AM, Kostik Belousov wrote:

> On Tue, May 03, 2011 at 11:58:49PM -0700, Garrett Cooper wrote:
>> On Tue, May 3, 2011 at 11:42 PM, Garrett Cooper wrote:
>>> On Tue, May 3, 2011 at 10:59 PM, Kirk McKusick wrote:
>>>>> Date: Tue, 3 May 2011 22:40:26 -0700
>>>>> Subject: Nasty non-recursive lockmgr panic on softdep only enabled UFS
>>>>>  partition when filesystem full
>>>>> From: Garrett Cooper
>>>>> To: Jeff Roberson, Marshall Kirk McKusick
>>>>> Cc: FreeBSD Current
>>>>>
>>>>> Hi Jeff and Dr. McKusick,
>>>>>     Ran into this panic when /usr ran out of space doing a make
>>>>> universe on amd64/r221219 (it took ~15 minutes for the panic to occur
>>>>> after the filesystem ran out of space -- wasn't quite sure what it was
>>>>> doing at the time):
>>>>>
>>>>> ...
>>>>>
>>>>>     Let me know what other commands you would like for me to run in kgdb.
>>>>> Thanks,
>>>>> -Garrett
>>>>
>>>> You did not indicate whether you are running an 8.X system or a 9-current
>>>> system. It would be helpful to know that.
>>>
>>> I've actually been running CURRENT for a few years now, but you're right --
>>> I didn't mention that part.
>>>
>>>> Jeff thinks that there may be a potential race in the locking code for
>>>> softdep_request_cleanup.
>>>> If so, this patch for 9-current should fix it:
>>>>
>>>> Index: ffs_softdep.c
>>>> ===================================================================
>>>> --- ffs_softdep.c	(revision 221385)
>>>> +++ ffs_softdep.c	(working copy)
>>>> @@ -11380,7 +11380,8 @@
>>>>  			continue;
>>>>  		}
>>>>  		MNT_IUNLOCK(mp);
>>>> -		if (vget(lvp, LK_EXCLUSIVE | LK_INTERLOCK, curthread)) {
>>>> +		if (vget(lvp, LK_EXCLUSIVE | LK_NOWAIT | LK_INTERLOCK,
>>>> +		    curthread)) {
>>>>  			MNT_ILOCK(mp);
>>>>  			continue;
>>>>  		}
>>>>
>>>> If you are running an 8.X system, hopefully you will be able to apply it.
>>>
>>>    I've applied it, rebuilt and installed the kernel, and am trying to
>>> repro the case again. Will let you know how things go!
>>
>>    Happened again with the change. It's really easy to repro:
>>
>> 1. Get a filesystem with UFS+SU
>> 2. Execute something that does a large number of small writes to a partition.
>> 3. 'dd if=/dev/zero of=FOO bs=10m' on the same partition
>>
>>    The kernel will panic with the issue I discussed above.
>> Thanks!
>
> Jeff's change is required to avoid LORs, but it is not sufficient to
> prevent recursion. We must skip the vnode supplied as a parameter to
> softdep_request_cleanup(). Theoretically, other vnodes might be also
> locked by curthread, thus I think the change below is needed. Try this.
>
> diff --git a/sys/ufs/ffs/ffs_softdep.c b/sys/ufs/ffs/ffs_softdep.c
> index a6d4441..25fa5d6 100644
> --- a/sys/ufs/ffs/ffs_softdep.c
> +++ b/sys/ufs/ffs/ffs_softdep.c
> @@ -11380,7 +11380,9 @@ retry:
>  			continue;
>  		}
>  		MNT_IUNLOCK(mp);
> -		if (vget(lvp, LK_EXCLUSIVE | LK_INTERLOCK, curthread)) {
> +		if (VOP_ISLOCKED(lvp) ||
> +		    vget(lvp, LK_EXCLUSIVE | LK_INTERLOCK | LK_NOWAIT,
> +		    curthread)) {
>  			MNT_ILOCK(mp);
>  			continue;
>  		}

	Ran into the same panic after I applied the patch above with the
repro steps I described before. One thing that I noticed is that the
issue isn't as easy to reproduce unless you add the dd in parallel with
the make operation.
Thanks,
-Garrett
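
For readers following the thread, here is a minimal user-space sketch of the idea behind the second patch: check whether a lock is already held before attempting a non-blocking acquire, so the scan never tries to re-take a lock its own thread owns. Everything in it (struct toy_lock, toy_try_lock, toy_is_locked, toy_scan, the integer thread ids) is hypothetical and is not the lockmgr(9) or vnode API. It assumes a model in which a non-blocking acquire of a self-owned lock still counts as recursion, which is how the thread describes LK_NOWAIT alone as insufficient.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NO_OWNER	(-1)

/* Toy stand-in for a non-recursive exclusive lock. Not lockmgr(9). */
struct toy_lock {
	int owner;	/* id of the holding thread, or TOY_NO_OWNER */
};

enum toy_result { TOY_ACQUIRED, TOY_BUSY, TOY_RECURSION };

/*
 * Non-blocking acquire. A lock held by another thread yields TOY_BUSY.
 * A lock already held by the caller yields TOY_RECURSION; in this model
 * (an assumption) that is the condition that would end in a panic.
 */
static enum toy_result
toy_try_lock(struct toy_lock *lk, int self)
{
	if (lk->owner == self)
		return (TOY_RECURSION);
	if (lk->owner != TOY_NO_OWNER)
		return (TOY_BUSY);
	lk->owner = self;
	return (TOY_ACQUIRED);
}

/* True if any thread, including the caller, holds the lock. */
static bool
toy_is_locked(const struct toy_lock *lk)
{
	return (lk->owner != TOY_NO_OWNER);
}

static void
toy_unlock(struct toy_lock *lk, int self)
{
	assert(lk->owner == self);
	lk->owner = TOY_NO_OWNER;
}

/*
 * Walk the locks the way a cleanup scan might. With check_first set, any
 * lock that is already held (by anyone) is skipped before the try-lock.
 * Returns how many locks were taken and released; *recursions counts
 * attempts to re-acquire a lock the caller already holds.
 */
static size_t
toy_scan(struct toy_lock *locks, size_t n, int self, bool check_first,
    size_t *recursions)
{
	size_t i, processed = 0;

	*recursions = 0;
	for (i = 0; i < n; i++) {
		if (check_first && toy_is_locked(&locks[i]))
			continue;
		switch (toy_try_lock(&locks[i], self)) {
		case TOY_ACQUIRED:
			processed++;
			toy_unlock(&locks[i], self);
			break;
		case TOY_RECURSION:
			(*recursions)++;
			break;
		case TOY_BUSY:
			break;
		}
	}
	return (processed);
}
```

Calling toy_scan() on an array where one entry is owned by the caller reports one recursion with check_first false and none with check_first true. Note that the report above says the real panic persisted even with the VOP_ISLOCKED() check applied, so this sketch only illustrates the intent of that check, not a confirmed fix.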