Date:      Mon, 30 Oct 2006 15:47:37 +0200
From:      Kostik Belousov <kostikbel@gmail.com>
To:        Yar Tikhiy <yar@comp.chem.msu.su>
Cc:        David Malone <dwmalone@maths.tcd.ie>, hackers@freebsd.org
Subject:   Re: File trees: the deeper, the weirder
Message-ID:  <20061030134737.GF1627@deviant.kiev.zoral.com.ua>
In-Reply-To: <20061030130519.GE27062@comp.chem.msu.su>
References:  <20061029140716.GA12058@comp.chem.msu.su> <20061029152227.GA11826@walton.maths.tcd.ie> <006801c6fb77$e4e30100$1200a8c0@gsicomp.on.ca> <20061030130519.GE27062@comp.chem.msu.su>

On Mon, Oct 30, 2006 at 04:05:19PM +0300, Yar Tikhiy wrote:
> On Sun, Oct 29, 2006 at 11:32:58AM -0500, Matt Emmerton wrote:
> > [ Restoring some OP context.]
> >=20
> > > On Sun, Oct 29, 2006 at 05:07:16PM +0300, Yar Tikhiy wrote:
> > >
> > > > > As for the said program, it keeps its 1 Hz pace, mostly waiting on
> > > > > "vlruwk".  It's killable, after a delay.  The system doesn't show ...
> > > >
> > > > Weird, eh?  Any ideas what's going on?
> > >
> > > > I would guess that you need a new vnode to create the new file, but no
> > > > vnodes are obvious candidates for freeing because they all have a child
> > > > directory in use. Is there some sort of vnode clearing that goes on every
> > > > second if we are short of vnodes?
> >=20
> > > See sys/vfs_subr.c, subroutine getnewvnode().  We call msleep() if we're
> > > waiting on vnodes to be created (or recycled).  And just look at the 'hz'
> > > parameter passed to msleep()!
> >=20
> > The calling process's mkdir() will end up waiting in getnewvnode() (in
> > "vlruwk" state) while the vnlru kernel thread does it's thing (which is=
 to
> > recycle vnodes.)
> >=20
> > Either the vnlru kernel thread has to work faster, or the caller has to
> > sleep less, in order to avoid this lock-step behaviour.
>=20
> I'm afraid that, though your analysis is right, you arrive at wrong
> conclusions.  The process waits for the whole second in getnewvnode()
> because the vnlru thread cannot free as much vnodes as it wants to.
> vnlru_proc() will wake up sleepers on vnlruproc_sig (i.e.,
> getnewvnode()) only if (numvnodes <=3D desiredvnodes * 9 / 10).
> Whether this condition is attainable depends on vlrureclaim() (called
> from the vnlru thread) freeing vnodes at a sufficient rate.  Perhaps
> vlrureclaim() just can't keep the pace at this conditions.
> debug.vnlru_nowhere increasing is an indication of that.  Consequently,
> each getnewvnode() call sleeps 1 second, then grabs a vnode beyond
> desiredvnodes.  It's no surprise that the 1 second delays start to
> appear after approx. kern.maxvnodes directories were created.

I think that David is right. The references _from_ the directory make it immune
to vnode reclamation. Try this patch. It is very unfair to lsof.

Index: sys/kern/vfs_subr.c
===================================================================
RCS file: /usr/local/arch/ncvs/src/sys/kern/vfs_subr.c,v
retrieving revision 1.685
diff -u -r1.685 vfs_subr.c
--- sys/kern/vfs_subr.c	2 Oct 2006 07:25:58 -0000	1.685
+++ sys/kern/vfs_subr.c	30 Oct 2006 13:44:59 -0000
@@ -582,7 +582,7 @@
 		 * If it's been deconstructed already, it's still
 		 * referenced, or it exceeds the trigger, skip it.
 		 */
-		if (vp->v_usecount || !LIST_EMPTY(&(vp)->v_cache_src) ||
+		if (vp->v_usecount || /* !LIST_EMPTY(&(vp)->v_cache_src) || */
 		    (vp->v_iflag & VI_DOOMED) != 0 || (vp->v_object != NULL &&
 		    vp->v_object->resident_page_count > trigger)) {
 			VI_UNLOCK(vp);
@@ -607,7 +607,7 @@
 		 * interlock, the other thread will be unable to drop the
 		 * vnode lock before our VOP_LOCK() call fails.
 		 */
-		if (vp->v_usecount || !LIST_EMPTY(&(vp)->v_cache_src) ||
+		if (vp->v_usecount || /* !LIST_EMPTY(&(vp)->v_cache_src) || */
 		    (vp->v_object != NULL && 
 		    vp->v_object->resident_page_count > trigger)) {
 			VOP_UNLOCK(vp, LK_INTERLOCK, td);



