Date: Fri, 21 Mar 2014 11:06:33 +0100
From: Pawel Jakub Dawidek <pjd@FreeBSD.org>
To: Andriy Gapon <avg@FreeBSD.org>
Cc: freebsd-fs@FreeBSD.org, Andreas Longwitz <longwitz@incore.de>, freebsd-geom@FreeBSD.org
Subject: Re: g_mirror_access() dropping geom topology_lock [Was: Kernel crash trying to import a ZFS pool with log device]
Message-ID: <20140321100633.GA1656@garage.freebsd.pl>
In-Reply-To: <532C085D.3020201@FreeBSD.org>
References: <532B5A0C.1010008@incore.de> <532C085D.3020201@FreeBSD.org>

On Fri, Mar 21, 2014 at 11:37:33AM +0200, Andriy Gapon wrote:
> I see two issues here.
> First, the ZFS tasting code could be made more robust.  If it never tried to
> re-use the consumer and always created a new one, then most likely this crash
> could be avoided.  But there is no bug in the code.  The code is correct and
> it uses the GEOM topology lock to avoid any concurrency issues.

This is the problem, in my opinion. GEOM classes have to have the ability to
drop the topology lock in the access method. Without such ability, any more
complex GEOM class cannot work or will require tons of hacks to do its job.
Not only my GEOM classes do that - GRAID does the same.

I'd much prefer for us to accept the fact that GEOM classes are allowed to
drop the topology lock in their access methods and fix it in ZFS.

> But GEOM mirror code breaks a contract on which the ZFS code relies.
> g_access() must be called with the topology lock held.
> I extend this requirement to a requirement that the access method of any GEOM
> provider must operate under the topology lock and must never drop it.
> In other words, if a caller must acquire g_topology_lock before calling
> g_access, then in return it must have a guarantee that the GEOM topology stays
> unchanged across the call to g_access().
> g_mirror_access() breaks the above contract.
>
> So, the code in vdev_geom_attach() obtains g_topology_lock, then it finds an
> existing valid consumer and calls g_access() on it.  It reasonably expects that
> the consumer remains valid, but because g_mirror_access() drops and reacquires
> the topology lock, there is a chance that the topology can change and the
> consumer may become invalid.
>
> I am not very familiar with gmirror code, so I am not sure how to fix the
> problem from that end.

-- 
Pawel Jakub Dawidek                       http://www.wheelsystems.com
FreeBSD committer                         http://www.FreeBSD.org
Am I Evil? Yes, I Am!                     http://mobter.com
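
The disagreement above can be illustrated with a small userland model. This is only a sketch under stated assumptions, not GEOM code: the lock is a single-threaded flag rather than the real topology lock, and every name in it (topo_acquire, topo_release, topo_gen, struct consumer, access_dropping_lock, attach_with_recheck, topology_change) is hypothetical. It shows one way a caller could cope with an access method that drops and reacquires the lock: record a generation counter before the call and revalidate the consumer afterwards.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are real GEOM symbols. */
static bool topo_held;          /* models "topology lock is held" */
static unsigned long topo_gen;  /* bumped on every topology change */

struct consumer {
	bool valid;     /* cleared when the consumer is torn down */
	int  readers;   /* simplified access count */
};

static void topo_acquire(void) { assert(!topo_held); topo_held = true; }
static void topo_release(void) { assert(topo_held); topo_held = false; }

/* Models another actor changing the topology while the lock is free. */
static void
topology_change(struct consumer *cp)
{
	topo_acquire();
	cp->valid = false;
	topo_gen++;
	topo_release();
}

/*
 * Access method that drops the lock, as the thread describes.
 * Called with the lock held and returns with it held.  If 'racer' is
 * non-NULL it runs inside the unlocked window to model a concurrent change.
 */
static int
access_dropping_lock(struct consumer *cp, void (*racer)(struct consumer *))
{
	topo_release();
	if (racer != NULL)
		racer(cp);
	topo_acquire();
	cp->readers++;
	return (0);
}

/*
 * Caller that does not assume the topology stayed unchanged across the
 * access call.  Returns 0 on success, or -1 if the consumer went stale
 * and the caller should create a new one instead of re-using it.
 * In this model the consumer's memory stays alive; only 'valid' changes.
 */
static int
attach_with_recheck(struct consumer *cp, void (*racer)(struct consumer *))
{
	int error = -1;

	topo_acquire();
	if (cp->valid) {
		unsigned long gen = topo_gen;

		error = access_dropping_lock(cp, racer);
		if (error == 0 && gen != topo_gen && !cp->valid) {
			cp->readers--;  /* undo the access we just took */
			error = -1;
		}
	}
	topo_release();
	return (error);
}
```

Calling attach_with_recheck(&c, NULL) models the uncontended case and succeeds; passing topology_change as the second argument models the race and makes the call fail, so the caller knows not to keep using the stale consumer.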
