From: Pawel Jakub Dawidek
To: Konstantin Belousov
Cc: freebsd-fs@freebsd.org
Date: Sat, 29 Sep 2012 17:41:02 +0200
Message-ID: <20120929154101.GK1402@garage.freebsd.pl>
Subject: Re: panic: _sx_xlock_hard: recursed on non-recursive sx zfsvfs->z_hold_mtx[i] @ ...cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_znode.c:1407

On Tue, Sep 25, 2012 at 12:08:40PM +0300, Konstantin Belousov wrote:
> On Tue, Sep 25, 2012 at 12:46:07AM +0200, Baptiste Daroussin wrote:
> > Hi,
> >
> > I have the exact same problem making: tinderbox and poudriere highly
> > unusable.
> >
> > This is really problematic because pointyhat also rely on nullfs and
> > zfs, which means we can't upgrade the building nodes if we need to for
> > example.
> >
> > regards,
> > Bapt
>
> This is zfs bug. Filesystems shall not call getnewvnode() while holding
> internal locks. At least not the locks which are needed during reclaim.
> Nullfs changes amplified the probability of the problematic situation,
> since now nullfs vnodes are indeed cached instead of being recreated
> on each access, so the overall count of used vnodes could be twice as
> high.
>
> You might try to increase the kern.maxvnodes to reduce the probability of
> the recursive calls into vnlnru() from getnewvnode(). But for real, bug
> needs to be fixed in zfs.

With all of FreeBSD's VFS constraints, it is really hard to breathe,
especially within a file system that was not designed with our VFS
complexity in mind.

For example, it would be nice for VFS not to reclaim vnodes from
getnewvnode() and to leave this task entirely to the vnlru process.
It is a pretty obvious layering violation to me: file system code needs
a new vnode, so it calls a VFS routine to allocate one, which then calls
back into the file system to reclaim one of its vnodes.

It would also be nice to handle EAGAIN from VOP_RECLAIM(). Currently we
panic on error. EAGAIN would be useful to return when some of the locks
cannot be acquired immediately. ZFS reclaim already detects potential
deadlocks and defers part of the reclamation to a separate thread.

-- 
Pawel Jakub Dawidek                       http://www.wheelsystems.com
FreeBSD committer                         http://www.FreeBSD.org
Am I Evil? Yes, I Am!                     http://tupytaj.pl
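
The EAGAIN idea in the message above can be illustrated with a small
user-space sketch in C. This is only a toy model under stated
assumptions, not FreeBSD or ZFS code: toy_lock, toy_vnode, toy_reclaim(),
toy_try_recycle() and toy_drain_deferred() are all hypothetical names,
the "lock" is a plain flag standing in for a non-recursive lock such as
z_hold_mtx[i], and the real VOP_RECLAIM() has no such error path today
(per the message, an error there panics).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a non-recursive file system lock (hypothetical). */
struct toy_lock {
	bool held;
};

/* Try to take the lock without blocking; false means it is already held. */
static bool
toy_trylock(struct toy_lock *l)
{
	if (l->held)
		return (false);
	l->held = true;
	return (true);
}

static void
toy_unlock(struct toy_lock *l)
{
	assert(l->held);	/* unlocking a lock that is not held is a bug */
	l->held = false;
}

/* Toy vnode: just enough state to model reclaim and deferral. */
struct toy_vnode {
	struct toy_lock *lock;		/* lock that reclaim needs */
	bool reclaimed;
	bool deferred;			/* already on the deferred list */
	struct toy_vnode *next_deferred;
};

static struct toy_vnode *deferred_head;

/*
 * Hypothetical reclaim: if the lock is already held (for example by the
 * code path that asked for a new vnode), report EAGAIN instead of
 * recursing on the lock.
 */
static int
toy_reclaim(struct toy_vnode *vp)
{
	if (!toy_trylock(vp->lock))
		return (EAGAIN);
	vp->reclaimed = true;
	toy_unlock(vp->lock);
	return (0);
}

/* Caller side: on EAGAIN, queue the vnode for a later attempt. */
static int
toy_try_recycle(struct toy_vnode *vp)
{
	int error = toy_reclaim(vp);

	if (error == EAGAIN && !vp->deferred) {
		vp->deferred = true;
		vp->next_deferred = deferred_head;
		deferred_head = vp;
	}
	return (error);
}

/*
 * What a separate thread could do later: retry every deferred vnode and
 * keep the ones that still cannot be reclaimed. Returns how many vnodes
 * were reclaimed on this pass.
 */
static int
toy_drain_deferred(void)
{
	struct toy_vnode *vp = deferred_head, *still = NULL;
	int done = 0;

	deferred_head = NULL;
	while (vp != NULL) {
		struct toy_vnode *next = vp->next_deferred;

		if (toy_reclaim(vp) == 0) {
			vp->deferred = false;
			vp->next_deferred = NULL;
			done++;
		} else {
			vp->next_deferred = still;
			still = vp;
		}
		vp = next;
	}
	deferred_head = still;
	return (done);
}
```

In this model, calling toy_try_recycle() while the lock is held returns
EAGAIN and queues the vnode rather than recursing; once the lock has
been dropped, toy_drain_deferred() reclaims it. A real implementation
would also have to deal with concurrency, vnode reference counts and
lock ordering, none of which the sketch attempts.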