Date: Mon, 9 Jul 2007 02:09:18 +0200
From: Pawel Jakub Dawidek
To: Doug Rabson
Cc: current@freebsd.org
Subject: Re: ZFS leaking vnodes (sort of)
Message-ID: <20070709000918.GD1208@garage.freebsd.pl>
In-Reply-To: <200707071426.18202.dfr@rabson.org>
List-Id: Discussions about the use of FreeBSD-current

On Sat, Jul 07, 2007 at 02:26:17PM +0100, Doug Rabson wrote:
> I've been testing ZFS recently and I noticed some performance issues
> while doing large-scale port builds on a ZFS-mounted /usr/ports tree.
> Eventually I realised that virtually nothing ever ended up on the vnode
> free list. This meant that when the system reached its maximum vnode
> limit, it had to resort to reclaiming vnodes from the various
> filesystems' active vnode lists (via vlrureclaim). Since those lists
> are not sorted in LRU order, this led to pessimal cache performance
> after the system got into that state.
>
> I looked a bit closer at the ZFS code and poked around with DDB and I
> think the problem was caused by a couple of extraneous calls to vhold
> when creating a new ZFS vnode. On FreeBSD, getnewvnode returns a vnode
> which is already held (not on the free list) so there is no need to
> call vhold again.

Whoa! Nice catch... The patch works here - I did some pretty heavy
tests, so please commit it ASAP.

I also wonder if this can help with some of those 'kmem_map too small'
panics. I was observing that ARC cannot reclaim memory and this may be
because all vnodes, and thus their associated data, are being held.

To ZFS users having problems with performance and/or stability of ZFS:
Can you test the patch and see if it helps?
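
For anyone who wants to see the hold-count arithmetic Doug describes
spelled out, here is a minimal userland sketch. It is a toy model only:
struct toy_vnode and the toy_* functions are hypothetical names made up
for this illustration and are not the kernel's vnode API. The only
behaviour taken from the thread is that a new vnode comes back already
held, and that it cannot reach the free list while any hold remains.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a vnode; NOT the kernel's struct vnode. */
struct toy_vnode {
    int  holdcnt;      /* number of outstanding holds */
    bool on_freelist;  /* set once the last hold is dropped */
};

/* Mirrors the behaviour described above: a new vnode is returned held. */
static void toy_getnewvnode(struct toy_vnode *vp)
{
    vp->holdcnt = 1;
    vp->on_freelist = false;
}

/* Take one additional hold. */
static void toy_vhold(struct toy_vnode *vp)
{
    vp->holdcnt++;
}

/* Drop one hold; only when none remain does the vnode go on the free list. */
static void toy_vdrop(struct toy_vnode *vp)
{
    assert(vp->holdcnt > 0);
    if (--vp->holdcnt == 0)
        vp->on_freelist = true;
}

/*
 * Create a vnode, optionally take the extraneous hold, then release the
 * single hold that came with creation. Returns true if the vnode ended up
 * on the free list.
 */
bool toy_lifecycle(bool extra_hold)
{
    struct toy_vnode vn;

    toy_getnewvnode(&vn);
    if (extra_hold)
        toy_vhold(&vn);  /* stands in for the call the patch removes */
    toy_vdrop(&vn);      /* the hold from creation is released */
    return vn.on_freelist;
}
```

In this model toy_lifecycle(true) leaves one hold outstanding, so the
vnode never reaches the free list, while toy_lifecycle(false) does reach
it. The real kernel paths involve the vnode interlock and use counts as
well, which this sketch deliberately leaves out.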
> This patch appears to fix the problem (only very lightly tested):
>
> Index: zfs_vnops.c
> ===================================================================
> RCS file: /home/ncvs/src/sys/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c,v
> retrieving revision 1.22
> diff -u -r1.22 zfs_vnops.c
> --- zfs_vnops.c 28 May 2007 02:37:43 -0000 1.22
> +++ zfs_vnops.c 7 Jul 2007 13:01:41 -0000
> @@ -3493,7 +3493,7 @@
>          rele = 0;
>      vp->v_data = NULL;
>      ASSERT(vp->v_holdcnt > 1);
> -    vdropl(vp);
> +    VI_UNLOCK(vp);
>      if (!zp->z_unlinked && rele)
>          VFS_RELE(zfsvfs->z_vfs);
>      return (0);
> Index: zfs_znode.c
> ===================================================================
> RCS file: /home/ncvs/src/sys/contrib/opensolaris/uts/common/fs/zfs/zfs_znode.c,v
> retrieving revision 1.8
> diff -u -r1.8 zfs_znode.c
> --- zfs_znode.c 6 May 2007 19:05:37 -0000 1.8
> +++ zfs_znode.c 7 Jul 2007 13:17:32 -0000
> @@ -115,7 +115,6 @@
>          ASSERT(error == 0);
>          zp->z_vnode = vp;
>          vp->v_data = (caddr_t)zp;
> -        vhold(vp);
>          vp->v_vnlock->lk_flags |= LK_CANRECURSE;
>          vp->v_vnlock->lk_flags &= ~LK_NOSHARE;
>      } else {
> @@ -601,7 +600,6 @@
>      ASSERT(err == 0);
>      vp = ZTOV(zp);
>      vp->v_data = (caddr_t)zp;
> -    vhold(vp);
>      vp->v_vnlock->lk_flags |= LK_CANRECURSE;
>      vp->v_vnlock->lk_flags &= ~LK_NOSHARE;
>      vp->v_type = IFTOVT((mode_t)zp->z_phys->zp_mode);

-- 
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd@FreeBSD.org                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!