Date:      Wed, 27 Jul 2011 15:08:50 +0300
From:      Kostik Belousov <kostikbel@gmail.com>
To:        Rick Macklem <rmacklem@uoguelph.ca>
Cc:        rmacklem@freebsd.org, Herve Boulouis <amon@aelita.org>, freebsd-stable@freebsd.org
Subject:   Re: Sleeping thread owns a nonsleepable lock panic (& lor)
Message-ID:  <20110727120850.GT17489@deviant.kiev.zoral.com.ua>
In-Reply-To: <132828699.1045046.1311721943354.JavaMail.root@erie.cs.uoguelph.ca>
References:  <20110726093258.GF17489@deviant.kiev.zoral.com.ua> <132828699.1045046.1311721943354.JavaMail.root@erie.cs.uoguelph.ca>

--Ro3o5bxpeh65nwUH
Content-Type: text/plain; charset=koi8-r
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Tue, Jul 26, 2011 at 07:12:23PM -0400, Rick Macklem wrote:
> Kostik Belousov wrote:
> > On Tue, Jul 26, 2011 at 01:17:52PM +0200, Herve Boulouis wrote:
> > > On 26/07/2011 12:06, Kostik Belousov wrote:
> > > > On Tue, Jul 26, 2011 at 11:49:13AM +0200, Herve Boulouis wrote:
> > > > > On 25/07/2011 11:59, Kostik Belousov wrote:
> > > > >
> > > > > OK, the patched server crashed strangely this morning: all httpd
> > > > > processes were stuck in nfs or vmopar
> > > > > and were unkillable. Below is the full ps.
> > > >
> > > > Please see the
> > > > http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug-deadlocks.html
> > > > for information required to debug the deadlocks.
> > >
> > > The box was not strictly deadlocked, since I was able to interact with
> > > it, but I suppose you want me to
> > > break into the debugger when the symptoms appear again and report the
> > > output of all the commands listed in the handbook
> > > deadlock section?
> >
> > Exactly.
> >
> > I think everything was hung that accessed an nfs mount point.
> > From the usermode, procstat -kk could catch some interesting
> > information,
> > but it is redundant if ddb output is captured.
>
> Would it be worth considering reverting r223054?
> (Note that I don't understand the VM side, so this may be completely
>  wrong:-)
>
> The sleeps on vmopar could be happening because a dirty page is busy
> and r223054 changes the VM_PAGER_xx value set a couple of ways.
> 1 - When it returns VM_PAGER_ERROR instead of VM_PAGER_AGAIN, the
>     return value of "runlen" from vm_pageout_flush() changes.
> 2 - I'm not sure, but I think the pre-r223054 code marked a partially
>     written page as VM_PAGER_OK instead of VM_PAGER_AGAIN?
>     (I'm wondering about this one, since the problem seems to happen
>      when the file's size has been truncated.)
>
> Herve Boulouis, if you want to see what r223054 changes, just go to
>   http://svn.freebsd.org/viewvc/stable/8/sys/nfsclient
>   and then click on nfs_bio.c.
>   (The changes are small and could easily be reverted with a manual
>    edit.)
>
> Since r223054 went into stable/8 on Jun 13, it seems a possible
> explanation? rick

I doubt it. The ps output makes it fairly plausible that the reporter
hit a LOR (lock order reversal) between the vnode lock and the page busy
flag. The correct order is vnode lock -> busy bit. vmopar is a wait for
the page to leave the busy state.
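
To make the ordering rule concrete, here is a minimal user-space sketch
in C. It is not the kernel code and not how the busy flag is implemented;
the ranks, struct, and function names are all hypothetical. It only shows
the idea that taking the vnode lock after the page is already busy is a
reversal of the vnode lock -> busy bit order.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical ranks: a lower rank must be acquired before a higher one. */
enum lock_rank { RANK_VNODE = 1, RANK_PAGE_BUSY = 2 };

#define MAX_HELD 8

/* Per-thread record of what is currently held (illustration only). */
struct held_locks {
	enum lock_rank stack[MAX_HELD];
	int depth;
};

/*
 * Record an acquisition. Returns 0 when the order is respected and -1
 * when the caller already holds something of equal or higher rank
 * (an order reversal), or when the table is full.
 */
static int
order_acquire(struct held_locks *h, enum lock_rank r)
{
	assert(h != NULL);
	if (h->depth >= MAX_HELD)
		return (-1);
	for (int i = 0; i < h->depth; i++) {
		if (h->stack[i] >= r)
			return (-1);	/* e.g. page busy held, vnode lock wanted */
	}
	h->stack[h->depth++] = r;
	return (0);
}

/* Forget the most recent acquisition of rank r, if there is one. */
static void
order_release(struct held_locks *h, enum lock_rank r)
{
	assert(h != NULL);
	for (int i = h->depth - 1; i >= 0; i--) {
		if (h->stack[i] == r) {
			for (int j = i; j < h->depth - 1; j++)
				h->stack[j] = h->stack[j + 1];
			h->depth--;
			return;
		}
	}
}
```

With this sketch, order_acquire(RANK_VNODE) followed by
order_acquire(RANK_PAGE_BUSY) succeeds, while the opposite sequence makes
the second call return -1. Two threads taking the same pair in opposite
orders is the pattern that can leave each waiting on the other.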

The mentioned revision does not change the lock order.

Anyway, this is only speculation until the requested data is provided.

--Ro3o5bxpeh65nwUH
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (FreeBSD)

iEYEARECAAYFAk4v/9IACgkQC3+MBN1Mb4g1wwCdHVz5RdsVMC8sia2S5qw36Czi
TH0Anj9y7UxYNzmvj80NZAdIxfvTNB20
=pguu
-----END PGP SIGNATURE-----

--Ro3o5bxpeh65nwUH--


