Date:      Mon, 25 Jul 2011 11:59:03 +0300
From:      Kostik Belousov <kostikbel@gmail.com>
To:        Herve Boulouis <amon@aelita.org>
Cc:        rmacklem@freebsd.org, freebsd-stable@freebsd.org
Subject:   Re: Sleeping thread owns a nonsleepable lock panic (& lor)
Message-ID:  <20110725085902.GM17489@deviant.kiev.zoral.com.ua>
In-Reply-To: <20110725102107.GB17204@ra.aabs>
References:  <20110725102107.GB17204@ra.aabs>

On Mon, Jul 25, 2011 at 12:21:07PM +0200, Herve Boulouis wrote:
> Hi list,
>
> We have 2 freebsd 8.2-STABLE (cvsuped june 22) that keeps crashing in a bad way :
>
> The are doing heavy apache / php4 web serving from a nfs mount and panic at least once a day
> with the following message (no crash dump produced, hand copied from the console) :
>
> Sleeping on "vmopar" with the following non-sleepable locks held:
> exclusive sleep mutex NFSnode lock (NFSnode lock) r =  0 (0xffffff0201798000) locked @ nfsclient/nfs_subs.c:538
> lock order reversal:
>  1st 0xffffffff018ff6da80 turnstile lock (turnstile lock) @ kern/subr_turnstile.c:190
>  2nd 0xffffffffff80b52b10 scrlock (scrlock) @ dev/syscons.c:2570
> lock order reversal:
>  1st 0xffffffff018ff6da80 turnstile lock (turnstile lock) @ kern/subr_turnstile.c:190
>  2nd 0xffffffffff80b78ef8 sleepq chain (sleepq chain) @ kern/subr_turnstile.c:203
> lock order reversal:
>  1st 0xffffffffff80b78ef8 sleepq chain (sleepq chain) @ kern/subr_turnstile.c:203
>  2nd 0xffffffffff80b52b10 scrlock (scrlock) @ dev/syscons.c:2570
> Sleeping thread (tid 100998, pid 20700) owns a non-sleepable lock
> panic: sleeping thread
> cpuid = 1
> panic: bufwrite: buffer is not busy???
> cpuid = 1
>
> The 2 servers share the same load and panic consistently. I enabled WITNESS on the 2 in the hope
> it would allow the boxes to auto reboot after panic and get extra debug info. I got debug info
> but the servers still hangs after the double panic :(

Try this. Calling vnode_pager_setsize() while holding a mutex is prohibited,
because it may sleep (that is the "vmopar" sleep in your trace).
On the other hand, I remember that my attempt to add a strict assert
that a vnode is exclusively locked in vnode_pager_setsize() had to be
reverted because nfs_loadattrcache() is sometimes called without the
vnode lock held.
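
For illustration only, here is a userspace sketch of the same pattern: decide
under the mutex that a resize is needed, but make the call that may block only
after the mutex is dropped. It is an analogy, not the kernel code. It uses
POSIX threads instead of mtx(9), and struct node, node_init(), pager_setsize()
and load_attrs() are hypothetical names made up for this sketch. Unlike the
patch, it also snapshots the size while the mutex is still held.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for an NFS node: a cached size guarded by a mutex. */
struct node {
	pthread_mutex_t mtx;
	size_t size;          /* cached size, protected by mtx */
	size_t pager_size;    /* size last handed to pager_setsize() */
	int held_during_call; /* set if pager_setsize() ran with mtx held */
};

static void
node_init(struct node *n, size_t size)
{
	pthread_mutex_init(&n->mtx, NULL);
	n->size = size;
	n->pager_size = size;
	n->held_during_call = 0;
}

/*
 * Hypothetical stand-in for a call that may sleep, so it must not run
 * with n->mtx held. trylock fails with EBUSY if the mutex is still locked.
 */
static void
pager_setsize(struct node *n, size_t size)
{
	if (pthread_mutex_trylock(&n->mtx) == 0)
		pthread_mutex_unlock(&n->mtx);
	else
		n->held_during_call = 1;
	n->pager_size = size;
}

/* Update the cached size under the mutex, defer the blocking call. */
static void
load_attrs(struct node *n, size_t new_size)
{
	int do_setsize = 0;
	size_t size;

	pthread_mutex_lock(&n->mtx);
	if (n->size != new_size) {
		n->size = new_size;
		do_setsize = 1;   /* remember the work, do not do it yet */
	}
	size = n->size;           /* snapshot while still locked */
	pthread_mutex_unlock(&n->mtx);

	if (do_setsize)
		pager_setsize(n, size);  /* safe: mutex already dropped */
}
```

Calling node_init(&n, 10) and then load_attrs(&n, 4096) leaves both n.size and
n.pager_size at 4096, with held_during_call still 0.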

commit 2aa7d15c38b0c01e3f724f04d7ed02ce11c82cc0
Author: Konstantin Belousov <kostikbel@gmail.com>
Date:   Mon Jul 25 11:56:04 2011 +0300

    Postpone the vnode_pager_setsize() call until the nfs node mutex is dropped.

diff --git a/sys/nfsclient/nfs_subs.c b/sys/nfsclient/nfs_subs.c
index 19fde06..351885a 100644
--- a/sys/nfsclient/nfs_subs.c
+++ b/sys/nfsclient/nfs_subs.c
@@ -478,7 +478,9 @@ nfs_loadattrcache(struct vnode **vpp, struct mbuf **mdp, caddr_t *dposp,
 	struct timespec mtime, mtime_save;
 	int v3 = NFS_ISV3(vp);
 	int error = 0;
+	int do_setsize;
 
+	do_setsize = 0;
 	md = *mdp;
 	t1 = (mtod(md, caddr_t) + md->m_len) - *dposp;
 	cp2 = nfsm_disct(mdp, dposp, NFSX_FATTR(v3), t1, M_WAIT);
@@ -606,7 +608,7 @@ nfs_loadattrcache(struct vnode **vpp, struct mbuf **mdp, caddr_t *dposp,
 				np->n_size = vap->va_size;
 				np->n_flag |= NSIZECHANGED;
 			}
-			vnode_pager_setsize(vp, np->n_size);
+			do_setsize = 1;
 		} else {
 			np->n_size =3D vap->va_size;
 		}
@@ -643,6 +645,8 @@ nfs_loadattrcache(struct vnode **vpp, struct mbuf **mdp, caddr_t *dposp,
 		KDTRACE_NFS_ATTRCACHE_LOAD_DONE(vp, &np->n_vattr, 0);
 #endif
 	mtx_unlock(&np->n_mtx);
+	if (do_setsize)
+		vnode_pager_setsize(vp, np->n_size);
 out:
 #ifdef KDTRACE_HOOKS
 	if (error)
