From: Kostik Belousov
To: Yar Tikhiy
Cc: David Malone, hackers@freebsd.org
Date: Mon, 30 Oct 2006 15:47:37 +0200
Subject: Re: File trees: the deeper, the weirder
Message-ID: <20061030134737.GF1627@deviant.kiev.zoral.com.ua>
In-Reply-To: <20061030130519.GE27062@comp.chem.msu.su>
List-Id: Technical Discussions relating to FreeBSD

On Mon, Oct 30, 2006 at 04:05:19PM +0300, Yar Tikhiy wrote:
> On Sun, Oct 29, 2006 at 11:32:58AM -0500, Matt Emmerton wrote:
> > [ Restoring some OP context.]
> >
> > > On Sun, Oct 29, 2006 at 05:07:16PM +0300, Yar Tikhiy wrote:
> > >
> > > > As for the said program, it keeps its 1 Hz pace, mostly waiting on
> > > > "vlruwk".  It's killable, after a delay.  The system doesn't show ...
> > > >
> > > > Weird, eh?  Any ideas what's going on?
> > >
> > > I would guess that you need a new vnode to create the new file, but no
> > > vnodes are obvious candidates for freeing because they all have a child
> > > directory in use. Is there some sort of vnode clearing that goes on every
> > > second if we are short of vnodes?
> >
> > See sys/vfs_subr.c, subroutine getnewvnode().  We call msleep() if we're
> > waiting on vnodes to be created (or recycled).  And just look at the 'hz'
> > parameter passed to msleep()!
> >
> > The calling process's mkdir() will end up waiting in getnewvnode() (in
> > "vlruwk" state) while the vnlru kernel thread does its thing (which is to
> > recycle vnodes.)
> >
> > Either the vnlru kernel thread has to work faster, or the caller has to
> > sleep less, in order to avoid this lock-step behaviour.
>
> I'm afraid that, though your analysis is right, you arrive at wrong
> conclusions.  The process waits for the whole second in getnewvnode()
> because the vnlru thread cannot free as many vnodes as it wants to.
> vnlru_proc() will wake up sleepers on vnlruproc_sig (i.e.,
> getnewvnode()) only if (numvnodes <= desiredvnodes * 9 / 10).
> Whether this condition is attainable depends on vlrureclaim() (called
> from the vnlru thread) freeing vnodes at a sufficient rate.  Perhaps
> vlrureclaim() just can't keep the pace under these conditions.
> debug.vnlru_nowhere increasing is an indication of that.  Consequently,
> each getnewvnode() call sleeps 1 second, then grabs a vnode beyond
> desiredvnodes.  It's no surprise that the 1 second delays start to
> appear after approx. kern.maxvnodes directories were created.

I think that David is right. The references _from_ the directory make it
immune to vnode reclamation. Try this patch. It is very unfair for lsof.

Index: sys/kern/vfs_subr.c
===================================================================
RCS file: /usr/local/arch/ncvs/src/sys/kern/vfs_subr.c,v
retrieving revision 1.685
diff -u -r1.685 vfs_subr.c
--- sys/kern/vfs_subr.c	2 Oct 2006 07:25:58 -0000	1.685
+++ sys/kern/vfs_subr.c	30 Oct 2006 13:44:59 -0000
@@ -582,7 +582,7 @@
 		 * If it's been deconstructed already, it's still
 		 * referenced, or it exceeds the trigger, skip it.
 		 */
-		if (vp->v_usecount || !LIST_EMPTY(&(vp)->v_cache_src) ||
+		if (vp->v_usecount || /* !LIST_EMPTY(&(vp)->v_cache_src) || */
 		    (vp->v_iflag & VI_DOOMED) != 0 || (vp->v_object != NULL &&
 		    vp->v_object->resident_page_count > trigger)) {
 			VI_UNLOCK(vp);
@@ -607,7 +607,7 @@
 		 * interlock, the other thread will be unable to drop the
 		 * vnode lock before our VOP_LOCK() call fails.
 		 */
-		if (vp->v_usecount || !LIST_EMPTY(&(vp)->v_cache_src) ||
+		if (vp->v_usecount || /* !LIST_EMPTY(&(vp)->v_cache_src) || */
 		    (vp->v_object != NULL &&
 		    vp->v_object->resident_page_count > trigger)) {
 			VOP_UNLOCK(vp, LK_INTERLOCK, td);
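
For readers who want to see why the v_cache_src test matters for a deep directory chain, here is a toy userland model of the skip condition in the patch. It is only a sketch and not kernel code: struct toy_vnode and the toy_* functions are hypothetical names, and the model assumes that every directory except the deepest one is the source of a name-cache entry for its child. The last function restates the wake-up condition quoted above, numvnodes <= desiredvnodes * 9 / 10.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy stand-in for struct vnode; not the kernel structure. */
struct toy_vnode {
	int	usecount;	/* plays the role of v_usecount */
	bool	has_cache_src;	/* plays the role of !LIST_EMPTY(&vp->v_cache_src) */
};

/*
 * Model a chain of n nested directories, none of them held open.
 * Assumption: each directory except the deepest one is the source of a
 * name-cache entry that points at its child.
 */
void
toy_build_chain(struct toy_vnode *v, size_t n)
{
	assert(v != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		v[i].usecount = 0;
		v[i].has_cache_src = (i + 1 < n);
	}
}

/*
 * Count the vnodes that a single reclaim pass would not skip.
 * honor_cache_src = true mimics the unpatched test, false the patched one.
 */
size_t
toy_reclaimable(const struct toy_vnode *v, size_t n, bool honor_cache_src)
{
	size_t count = 0;

	for (size_t i = 0; i < n; i++) {
		if (v[i].usecount != 0)
			continue;	/* still referenced */
		if (honor_cache_src && v[i].has_cache_src)
			continue;	/* name-cache entries originate here */
		count++;
	}
	return (count);
}

/* Wake-up condition as quoted in the thread. */
bool
toy_should_wake_sleepers(long numvnodes, long desiredvnodes)
{
	return (numvnodes <= desiredvnodes * 9 / 10);
}
```

In this model, a chain of 1000 directories offers a single candidate per pass while the v_cache_src test is in place (only the deepest directory), and all 1000 once the test is commented out. Whether the real vnlru thread behaves the same way depends on details the model leaves out, such as the resident page trigger and locking.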