Skip site navigation (1)Skip section navigation (2)
Date:      Wed, 4 May 2011 09:41:56 -0700
From:      Garrett Cooper <yanegomi@gmail.com>
To:        Kostik Belousov <kostikbel@gmail.com>
Cc:        Kirk McKusick <mckusick@mckusick.com>, FreeBSD Current <freebsd-current@freebsd.org>
Subject:   Re: Nasty non-recursive lockmgr panic on softdep only enabled UFS partition when filesystem full
Message-ID:  <BANLkTimOA6Dd8eOh_SxvDDbC4t23pJ_OpQ@mail.gmail.com>
In-Reply-To: <20110504090718.GN48734@deviant.kiev.zoral.com.ua>
References:  <BANLkTik4=O_1PWB2GzGzY=m51dG-Kbhe%2BQ@mail.gmail.com> <201105040559.p445xEJ5024585@chez.mckusick.com> <BANLkTikAQ6Jz4Jbjxh51iA-cjCYmdx1mSg@mail.gmail.com> <BANLkTik8F_SvEzW-vPW9=dZUEJuYOy9WcQ@mail.gmail.com> <20110504090718.GN48734@deviant.kiev.zoral.com.ua>

next in thread | previous in thread | raw e-mail | index | archive | help
2011/5/4 Kostik Belousov <kostikbel@gmail.com>:
> On Tue, May 03, 2011 at 11:58:49PM -0700, Garrett Cooper wrote:
>> On Tue, May 3, 2011 at 11:42 PM, Garrett Cooper <yanegomi@gmail.com> wro=
te:
>> > On Tue, May 3, 2011 at 10:59 PM, Kirk McKusick <mckusick@mckusick.com>=
 wrote:
>> >>> Date: Tue, 3 May 2011 22:40:26 -0700
>> >>> Subject: Nasty non-recursive lockmgr panic on softdep only enabled U=
FS
>> >>> =A0partition when filesystem full
>> >>> From: Garrett Cooper <yanegomi@gmail.com>
>> >>> To: Jeff Roberson <jeff@freebsd.org>,
>> >>> =A0 =A0 =A0 =A0 Marshall Kirk McKusick <mckusick@mckusick.com>
>> >>> Cc: FreeBSD Current <freebsd-current@freebsd.org>
>> >>>
>> >>> Hi Jeff and Dr. McKusick,
>> >>> =A0 =A0 Ran into this panic when /usr ran out of space doing a make
>> >>> universe on amd64/r221219 (it took ~15 minutes for the panic to occu=
r
>> >>> after the filesystem ran out of space -- wasn't quite sure what it w=
as
>> >>> doing at the time):
>> >>>
>> >>> ...
>> >>>
>> >>> =A0 =A0 Let me know what other commands you would like for me to run=
 in kgdb.
>> >>> Thanks,
>> >>> -Garrett
>> >>
>> >> You did not indicate whether you are running an 8.X system or a 9-cur=
rent
>> >> system. It would be helpful to know that.
>> >
>> > I've actually been running CURRENT for a few years now, but you're rig=
ht --
>> > I didn't mention that part.
>> >
>> >> Jeff thinks that there may be a potential race in the locking code fo=
r
>> >> softdep_request_cleanup. If so, this patch for 9-current should fix i=
t:
>> >>
>> >> Index: ffs_softdep.c
>> >> =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
>> >> --- ffs_softdep.c =A0 =A0 =A0 (revision 221385)
>> >> +++ ffs_softdep.c =A0 =A0 =A0 (working copy)
>> >> @@ -11380,7 +11380,8 @@
>> >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0contin=
ue;
>> >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0}
>> >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0MNT_IUNLOCK(mp);
>> >> - =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 if (vget(lvp, LK_EXCLUS=
IVE | LK_INTERLOCK, curthread)) {
>> >> + =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 if (vget(lvp, LK_EXCLUS=
IVE | LK_NOWAIT | LK_INTERLOCK,
>> >> + =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 curthread)) {
>> >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0MNT_IL=
OCK(mp);
>> >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0contin=
ue;
>> >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0}
>> >>
>> >> If you are running an 8.X system, hopefully you will be able to apply=
 it.
>> >
>> > =A0 =A0I've applied it, rebuilt and installed the kernel, and trying t=
o
>> > repro the case again. Will let you know how things go!
>>
>> =A0 =A0 Happened again with the change. It's really easy to repro:
>>
>> 1. Get a filesystem with UFS+SU
>> 2. Execute something that does a large number of small writes to a parti=
tion.
>> 3. 'dd if=3D/dev/zero of=3DFOO bs=3D10m' on the same partition
>>
>> =A0 =A0 The kernel will panic with the issue I discussed above.
>> Thanks!
>
> Jeff' change is required to avoid LORs, but it is not sufficient to
> prevent recursion. We must skip the vnode supplied as a parameter to
> softdep_request_cleanup(). Theoretically, other vnodes might be also
> locked by curthread, thus I think the change below is needed. Try this.
>
> diff --git a/sys/ufs/ffs/ffs_softdep.c b/sys/ufs/ffs/ffs_softdep.c
> index a6d4441..25fa5d6 100644
> --- a/sys/ufs/ffs/ffs_softdep.c
> +++ b/sys/ufs/ffs/ffs_softdep.c
> @@ -11380,7 +11380,9 @@ retry:
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0continue;
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0}
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0MNT_IUNLOCK(mp);
> - =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 if (vget(lvp, LK_EXCLUSIVE =
| LK_INTERLOCK, curthread)) {
> + =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 if (VOP_ISLOCKED(lvp) ||
> + =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 vget(lvp, LK_EXCLUS=
IVE | LK_INTERLOCK | LK_NOWAIT,
> + =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 curthread)) {
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0MNT_ILOCK(=
mp);
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0continue;
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0}

    Ok. I'll let the make universe I have going run to completion, and
once I get back home later on, I'll take a look at repro'ing this
again with the above patch applied.
Thanks!
-Garrett



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?BANLkTimOA6Dd8eOh_SxvDDbC4t23pJ_OpQ>