Skip site navigation (1)Skip section navigation (2)
Date:      Fri, 21 Mar 2014 11:06:33 +0100
From:      Pawel Jakub Dawidek <pjd@FreeBSD.org>
To:        Andriy Gapon <avg@FreeBSD.org>
Cc:        freebsd-fs@FreeBSD.org, Andreas Longwitz <longwitz@incore.de>, freebsd-geom@FreeBSD.org
Subject:   Re: g_mirror_access() dropping geom topology_lock [Was: Kernel crash trying to import a ZFS pool with log device]
Message-ID:  <20140321100633.GA1656@garage.freebsd.pl>
In-Reply-To: <532C085D.3020201@FreeBSD.org>
References:  <532B5A0C.1010008@incore.de> <532C085D.3020201@FreeBSD.org>

next in thread | previous in thread | raw e-mail | index | archive | help

--EVF5PPMfhYS0aIcm
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Fri, Mar 21, 2014 at 11:37:33AM +0200, Andriy Gapon wrote:
> I see two issues here.
> First, the ZFS tasting code could be made more robust.  If it never tried=
 to
> re-use the consumer and always created a new one, then most likely this c=
rash
> could be avoided.  But there is no bug in the code.  The code is correct =
and it
> it uses GEOM topology lock to avoid any concurrency issues.

This is the problem, in my opinion. GEOM classes have to have the
ability to drop the topology lock in the access method. Without such
ability any more complex GEOM class cannot work or will require tons of
hacks to do their job. Not only my GEOM classes do that - GRAID does the
same.

I'd much prefer for us to accept the fact that GEOM classes are allowed
to drop the topology lock in their access methods and fix it in ZFS.

> But GEOM mirror code breaks a contract on which the ZFS code relies.
> g_access() must be called with the topology lock hold.
> I extend this requirement to a requirement that access method of any GEOM
> provider must operate under the topology lock and must never drop it.
> In other words, if a caller must acquire g_topology_lock before calling
> g_access, then in return it must have a guarantee that the GEOM topology =
stays
> unchanged across the call to g_access().
> g_mirror_access() breaks the above contract.
>=20
> So, the code in vdev_geom_attach() obtains g_topology_lock, then it finds=
 an
> existing valid consumer and calls g_access() on it.  It reasonably expect=
s that
> the consumer remains valid, but because g_mirror_access() drops and requi=
res the
> topology lock, there is a chance that the topology can change and the con=
sumer
> may become invalid.
>=20
> I am not very familiar with gmirror code, so I am not sure how to fix the
> problem from that end.

--=20
Pawel Jakub Dawidek                       http://www.wheelsystems.com
FreeBSD committer                         http://www.FreeBSD.org
Am I Evil? Yes, I Am!                     http://mobter.com

--EVF5PPMfhYS0aIcm
Content-Type: application/pgp-signature

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (FreeBSD)

iEYEARECAAYFAlMsDykACgkQForvXbEpPzT2/wCgikwhKj4jipMzxnUyD8EvW0Ag
vWIAoK8QSmWe+fx5e7x99qfP3JqmlGCL
=JY2h
-----END PGP SIGNATURE-----

--EVF5PPMfhYS0aIcm--



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20140321100633.GA1656>