From owner-freebsd-geom@FreeBSD.ORG Mon Jun 11 22:28:06 2012 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id EF282106566C for ; Mon, 11 Jun 2012 22:28:06 +0000 (UTC) (envelope-from steven.haber@isilon.com) Received: from mexforwardwc.lss.emc.com (mexforwardwc.lss.emc.com [137.69.117.200]) by mx1.freebsd.org (Postfix) with ESMTP id C79318FC12 for ; Mon, 11 Jun 2012 22:28:06 +0000 (UTC) Received: from scl02-01d02-si01.isus.emc.com (scl02-01d02-si01.isus.emc.com [137.69.225.84]) by mexforwardwc.lss.emc.com (Switch-3.4.3/Switch-3.4.3) with ESMTP id q5BMS67u016101 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Mon, 11 Jun 2012 15:28:06 -0700 Received: from mailhubwc.lss.emc.com (mailhubscprd02.lss.emc.com [137.69.224.75]) by scl02-01d02-si01.isus.emc.com (RSA Interceptor); Mon, 11 Jun 2012 15:27:52 -0700 Received: from seacasht01.desktop.isilon.com (seacasht01.isilon.com [137.69.159.79]) by mailhubwc.lss.emc.com (Switch-3.4.3/Switch-3.4.3) with ESMTP id q5BMRqVe005547; Mon, 11 Jun 2012 15:27:52 -0700 Received: from seaxch01.isilon.com (137.69.158.60) by SEACASHT01.desktop.isilon.com (137.69.159.79) with Microsoft SMTP Server id 14.2.298.4; Mon, 11 Jun 2012 15:27:51 -0700 X-MimeOLE: Produced By Microsoft Exchange V6.5 Content-Class: urn:content-classes:message MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable Date: Mon, 11 Jun 2012 15:27:39 -0700 Message-ID: <56CE86D6660FF84498426FA7A33170B403429115@seaxch01.desktop.isilon.com> In-Reply-To: <20120611215610.GJ2337@deviant.kiev.zoral.com.ua> X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: Geom / destroy_dev() deadlock Thread-Index: Ac1IHRE10CKhtsqGSi+pmXihpPeo6wAA6eRg References: <56CE86D6660FF84498426FA7A33170B4033672EF@seaxch01.desktop.isilon.com> <20120611204334.GH2337@deviant.kiev.zoral.com.ua> <56CE86D6660FF84498426FA7A33170B403367535@seaxch01.desktop.isilon.com> <20120611215610.GJ2337@deviant.kiev.zoral.com.ua> From: Steven Haber To: Konstantin Belousov X-EMM-MHVC: 1 Cc: freebsd-geom@freebsd.org Subject: RE: Geom / destroy_dev() deadlock X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 11 Jun 2012 22:28:07 -0000 > On Mon, Jun 11, 2012 at 02:15:56PM -0700, Steven Haber wrote: > > > On Mon, Jun 11, 2012 at 10:19:07AM -0700, Steven Haber wrote: > > > > Hey FreeBSD devs, > > > > =20 > > > > I hope this is the right forum for this. I haven't used the FreeBSD > > > > mailing lists before. I'm a relatively new hire at EMC Isilon. We are > > > > using FreeBSD 7.0 as the basis for our scale-out NAS product line. We > > > > are frequently hitting the deadlock described here: > > > >=20 > > > > > > http://lists.freebsd.org/pipermail/freebsd-geom/2010-February/003888.htm l > > > >=20 > > > > The race is that destroy_dev() sleeps indefinitely waiting for thread > > > > references to drop from a dev. In the case of geom, we hold the > > topology > > > > lock the whole time we're in the dev layer. In devfs_open() and > > > > devfs_close(), we take a ref on the dev before calling into to geom. > > > >=20 > > > > The proposed solution of destroy_dev_sched() makes sense to me. I am > > > > trying to understand the necessity of Alexander Motin's additional > > > > changes. destroy_dev_tq() holds the devmtx during the destroy and > > devfs > > > > uses this lock to take refs before calling into geom. To me it seems > > we > > > > should be able to protect the cdev with just the devmtx, provided > > > > consumers of d_close(), d_open(), etc. ref and rel the dev > > > > appropriately. > >=20 > >=20 > > > From: Konstantin Belousov > > > Sent: Monday, June 11, 2012 1:44 PM > > > To: Steven Haber > > > Cc: freebsd-geom@freebsd.org > > > Subject: Re: Geom / destroy_dev() deadlock > >=20 > > > devmtx is mutex. If taken, then cdevsw methods would be unable to > > sleep. > >=20 > > To clarify, I'm not suggesting additional locking with devmtx. Currently > > we use the devmtx to take thread references on the dev. destroy_devl() > > sleeps until there are no references. I don't see why it matters if we a > > foreground destroy (current code, destroy_dev()) or scheduled destroy > > (using destroy_dev_sched()) since the devmtx combined with thread refs > > should guarantee we are never in the cdevsw methods during a destroy. I > > think that the single routine change is sufficient, and I can't seem to > > cause any sort of crash on my system with that change in place. > From: Konstantin Belousov > Sent: Monday, June 11, 2012 2:56 PM > To: Steven Haber > Cc: freebsd-geom@freebsd.org > Subject: Re: Geom / destroy_dev() deadlock > I do not understand what you are proposing. Could you, please, show > the patch ? --- src/sys/geom/geom_dev.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/sys/geom/geom_dev.c b/src/sys/geom/geom_dev.c index 38251e1..787235a 100644 --- a/src/sys/geom/geom_dev.c +++ b/src/sys/geom/geom_dev.c @@ -497,7 +497,7 @@ g_dev_orphan(struct g_consumer *cp) =20 /* Destroy the struct cdev *so we get no more requests */ unit =3D dev2unit(dev); - destroy_dev(dev); + destroy_dev_sched(dev); free_unr(unithdr, unit); =20 /* Wait for the cows to come home */ --=20 Steven