Date:      Sat, 29 Sep 2012 17:41:02 +0200
From:      Pawel Jakub Dawidek <pjd@FreeBSD.org>
To:        Konstantin Belousov <kostikbel@gmail.com>
Cc:        freebsd-fs@freebsd.org
Subject:   Re: panic: _sx_xlock_hard: recursed on non-recursive sx zfsvfs->z_hold_mtx[i] @ ...cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_znode.c:1407
Message-ID:  <20120929154101.GK1402@garage.freebsd.pl>
In-Reply-To: <20120925090840.GD35915@deviant.kiev.zoral.com.ua>
References:  <505DB4E6.8030407@smeets.im> <20120924224606.GE79077@ithaqua.etoilebsd.net> <20120925090840.GD35915@deviant.kiev.zoral.com.ua>


On Tue, Sep 25, 2012 at 12:08:40PM +0300, Konstantin Belousov wrote:
> On Tue, Sep 25, 2012 at 12:46:07AM +0200, Baptiste Daroussin wrote:
> > Hi,
> >
> > I have the exact same problem making: tinderbox and poudriere highly
> > unusable.
> >
> > This is really problematic because pointyhat also rely on nullfs and
> > zfs, which means we can't upgrade the building nodes if we need to for
> > example.
> >
> > regards, Bapt
>
> This is zfs bug. Filesystems shall not call getnewvnode() while holding
> internal locks. At least not the locks which are needed during reclaim.
> Nullfs changes amplified the probability of the problematic situation,
> since now nullfs vnodes are indeed cached instead of being recreated
> on each access, so the overall count of used vnodes could be twice as
> high.
>
> You might try to increase the kern.maxvnodes to reduce the probability of
> the recursive calls into vnlnru() from getnewvnode(). But for real, bug
> needs to be fixed in zfs.

With all of FreeBSD's VFS constraints, it is really hard to breathe,
especially within a file system that was not designed with our VFS
complexity in mind.

For example, it would be nice for VFS not to reclaim vnodes from
getnewvnode() and to leave this task entirely to the vnlru process.
It is a pretty obvious layering violation to me: file system code needs
a new vnode, so it calls a VFS routine to allocate one, which then calls
back into the file system to reclaim one of its vnodes.
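
To make the re-entry concrete, here is a toy user-space sketch, not
kernel code. Every toy_* name is made up for illustration, the single
lock stands in for zfsvfs->z_hold_mtx[i], and EDEADLK is returned where
the real non-recursive sx lock would panic.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical non-recursive lock: it only records whether it is held. */
struct toy_lock {
	bool held;
};

/* Returns 0 on success, EDEADLK where a real non-recursive lock would panic. */
static int
toy_lock_acquire(struct toy_lock *l)
{
	if (l->held)
		return (EDEADLK);
	l->held = true;
	return (0);
}

static void
toy_lock_release(struct toy_lock *l)
{
	l->held = false;
}

struct toy_fs {
	struct toy_lock hold_lock;	/* stands in for z_hold_mtx[i] */
	int cached_vnodes;
};

/* File system reclaim callback: it needs the file system's internal lock. */
static int
toy_fs_reclaim(struct toy_fs *fs)
{
	int error;

	error = toy_lock_acquire(&fs->hold_lock);
	if (error != 0)
		return (error);
	fs->cached_vnodes--;
	toy_lock_release(&fs->hold_lock);
	return (0);
}

/* "VFS" allocator: at the limit it reclaims inline, in the caller's context. */
static int
toy_getnewvnode(struct toy_fs *fs, int max_vnodes)
{
	int error;

	if (fs->cached_vnodes >= max_vnodes) {
		error = toy_fs_reclaim(fs);	/* calls back into the file system */
		if (error != 0)
			return (error);
	}
	fs->cached_vnodes++;
	return (0);
}

/* File system path that allocates a vnode while holding its own lock. */
static int
toy_fs_alloc_locked(struct toy_fs *fs, int max_vnodes)
{
	int error;

	error = toy_lock_acquire(&fs->hold_lock);
	if (error != 0)
		return (error);
	error = toy_getnewvnode(fs, max_vnodes);
	toy_lock_release(&fs->hold_lock);
	return (error);
}
```

With a limit of one vnode, the first toy_fs_alloc_locked() call
succeeds because nothing has to be reclaimed. The second one reaches
the limit, re-enters the file system through toy_fs_reclaim() and fails
on the lock the caller already holds.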

It would also be nice to handle EAGAIN from VOP_RECLAIM(). Currently we
panic on error. EAGAIN would be a useful return value when some of the
locks cannot be acquired immediately. ZFS reclaim already detects
potential deadlocks and defers part of the reclamation to a separate
thread.
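
A rough user-space sketch of what EAGAIN handling could look like,
under heavy assumptions: all toy_* names are invented, this is neither
the current VFS behaviour nor how ZFS defers its work today, and the
"separate thread" is reduced to a drain function called later.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical lock that supports only a non-blocking acquire. */
struct toy_lock {
	bool held;
};

static bool
toy_lock_try(struct toy_lock *l)
{
	if (l->held)
		return (false);
	l->held = true;
	return (true);
}

struct toy_vnode {
	struct toy_lock *lock;		/* file system lock needed by reclaim */
	bool reclaimed;
	bool deferred;			/* already on the deferred list */
	struct toy_vnode *next;
};

struct toy_deferred {
	struct toy_vnode *head;
};

/* Reclaim that refuses to sleep: EAGAIN if the lock is busy. */
static int
toy_reclaim(struct toy_vnode *vp)
{
	if (!toy_lock_try(vp->lock))
		return (EAGAIN);
	vp->reclaimed = true;
	vp->lock->held = false;
	return (0);
}

/* Caller side: on EAGAIN, queue the vnode for later instead of panicking. */
static int
toy_vfs_reclaim(struct toy_vnode *vp, struct toy_deferred *q)
{
	int error;

	error = toy_reclaim(vp);
	if (error == EAGAIN && !vp->deferred) {
		vp->deferred = true;
		vp->next = q->head;
		q->head = vp;
	}
	return (error);
}

/* Run later from another context; returns how many vnodes were reclaimed. */
static int
toy_drain_deferred(struct toy_deferred *q)
{
	struct toy_vnode **pp, *vp;
	int done;

	done = 0;
	pp = &q->head;
	while ((vp = *pp) != NULL) {
		if (toy_reclaim(vp) == 0) {
			*pp = vp->next;	/* unlink the reclaimed vnode */
			vp->next = NULL;
			vp->deferred = false;
			done++;
		} else {
			pp = &vp->next;	/* still busy, keep it queued */
		}
	}
	return (done);
}
```

The point of the sketch is only that the allocating thread never blocks
on a file system lock during reclaim. Vnodes whose locks are busy stay
queued until a later drain finds the lock free.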

-- 
Pawel Jakub Dawidek                       http://www.wheelsystems.com
FreeBSD committer                         http://www.FreeBSD.org
Am I Evil? Yes, I Am!                     http://tupytaj.pl



