From owner-freebsd-current@FreeBSD.ORG  Fri Jun 24 19:35:59 2011
Return-Path: <owner-freebsd-current@FreeBSD.ORG>
Delivered-To: current@freebsd.org
Received: from mx1.freebsd.org (unknown [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 45E081065676;
	Fri, 24 Jun 2011 19:35:59 +0000 (UTC)
	(envelope-from scottl@samsco.org)
Received: from pooker.samsco.org (pooker.samsco.org [168.103.85.57])
	by mx1.freebsd.org (Postfix) with ESMTP id 0904B8FC14;
	Fri, 24 Jun 2011 19:35:58 +0000 (UTC)
Received: from [127.0.0.1] (pooker.samsco.org [168.103.85.57])
	(authenticated bits=0)
	by pooker.samsco.org (8.14.4/8.14.4) with ESMTP id p5OJZs03093483;
	Fri, 24 Jun 2011 13:35:54 -0600 (MDT)
	(envelope-from scottl@samsco.org)
Mime-Version: 1.0 (Apple Message framework v1084)
Content-Type: text/plain; charset=us-ascii
From: Scott Long <scottl@samsco.org>
In-Reply-To: <55FDA4B1-CA5E-4304-9239-3AAF0FC6FF5F@samsco.org>
Date: Fri, 24 Jun 2011 13:35:54 -0600
Content-Transfer-Encoding: quoted-printable
Message-Id: <C7D47D8B-5E29-4066-892A-F547F6DB9E8B@samsco.org>
References: <20110619232307.GA57530@vniz.net> <20110620001912.GA60252@vniz.net>
	<4DFEAD4F.1040603@FreeBSD.org> <20110620070222.GA74009@vniz.net>
	<20110620080146.GF48734@deviant.kiev.zoral.com.ua>
	<20110620114656.GA83524@vniz.net>
	<20110621161719.GA16166@nargothrond.kdm.org>
	<20110621204934.GB9877@vniz.net>
	<20110622035404.GA38834@nargothrond.kdm.org>
	<20110622041325.GA13754@vniz.net>
	<20110622200919.GA72504@nargothrond.kdm.org>
	<4E03FDFD.70203@FreeBSD.org>
	<55FDA4B1-CA5E-4304-9239-3AAF0FC6FF5F@samsco.org>
To: Justin Gibbs <gibbs@freebsd.org>
X-Mailer: Apple Mail (2.1084)
X-Spam-Status: No, score=-50.0 required=3.8 tests=ALL_TRUSTED,
	T_RP_MATCHES_RCVD autolearn=unavailable version=3.3.0
X-Spam-Checker-Version: SpamAssassin 3.3.0 (2010-01-18) on pooker.samsco.org
Cc: Kostik Belousov <kostikbel@gmail.com>, Eir Nym <eirnym@gmail.com>,
	"Kenneth D. Merry" <ken@freebsd.org>, current@freebsd.org, will@freebsd.org
Subject: Re: Exactly that commit (was Re: Latest -current 100% hang at the
	late boot stage)
X-BeenThere: freebsd-current@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Discussions about the use of FreeBSD-current
	<freebsd-current.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-current>, 
	<mailto:freebsd-current-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-current>
List-Post: <mailto:freebsd-current@freebsd.org>
List-Help: <mailto:freebsd-current-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-current>,
	<mailto:freebsd-current-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 24 Jun 2011 19:35:59 -0000


On Jun 23, 2011, at 11:18 PM, Scott Long wrote:

>=20
> On Jun 23, 2011, at 9:01 PM, Justin T. Gibbs wrote:
>=20
>> On 6/22/11 4:09 PM, Kenneth D. Merry wrote:
>>> On Wed, Jun 22, 2011 at 08:13:25 +0400, Andrey Chernov wrote:
>>>> On Tue, Jun 21, 2011 at 09:54:04PM -0600, Kenneth D. Merry wrote:
>>>>> These two are interesting:
>>>>>=20
>>>>>> http://img825.imageshack.us/img825/1249/21062011014m.jpg
>>>>>> http://img839.imageshack.us/img839/3791/21062011015.jpg
>>>>>=20
>>>>> It looks like the GEOM event thread is stuck inside the cd(4)=20
>> driver. The
>>>>> cd(4) driver is trying to acquire the peripheral lock, and is =
sleeping
>>>>> until it gets it.
>>>>>=20
>>>>> What isn't clear is who is holding it.
>>=20
>> ...
>>=20
>>> The GEOM event thread is stuck sleeping in the mtx_sleep() call =
above. So
>>> that tells me that one of several things is going on:
>>>=20
>>> - There is a path in the cd(4) driver where it can call =
cam_periph_hold()
>>> but not cam_periph_unhold().
>>>=20
>>> - There is another thread in the system that has called =
cam_periph_hold(),
>>> and has gotten stuck before it can call cam_periph_unhold().
>>>=20
>>> - The hold/unhold logic is broken, and there is a case where a =
thread
>>> waiting for the lock can miss the wakeup. After looking at the code, =
I
>>> don't think this is the case, but I may have missed something.
>>>=20
>>> So it is probably one of the first two cases.
>>=20
>> ...
>>=20
>> I have a theory for the cause of this hang.
>>=20
>> The commit that triggers this problem added calls to g_access() =
during the
>> geom_dev probe.  I believe this hit a race in cdregister() where
>> the periph hold lock is dropped around the changer probe code.  Why =
the
>> periph hold lock is dropped there, I do not know as I haven't fully
>> reviewed the changer probe code.
>>=20
>=20
> Are you talking about this?
>=20
>        disk_create(softc->disk, DISK_VERSION);
>        cam_periph_lock(periph);
>        cam_periph_unhold(periph);
> [...]
>        if (((cgd->ccb_h.target_lun > 0)
>          && ((softc->quirks & CD_Q_NO_CHANGER) =3D=3D 0))
>         || ((softc->quirks & CD_Q_CHANGER) !=3D 0)) {
>=20
> The unhold there compliments the hold that was done prior to =
disk_create().  The hold/unhold is done as a hack around the need to =
drop the periph/sim mutex while calling disk_create(), due to the =
later's insistence on using blocking calls.  I've wanted to re-think how =
that pattern is done (it's the same gross hack in nearly all of the =
periph drivers), but haven't gotten around to it.  If the 'hold' =
semaphore needs to be held longer to prevent the race that you're =
theorizing, then it should be possible to simply extend its coverage in =
the code block, but I'm not sure if it'll result in an unintended =
deadlock with the changer enumeration/matching code.  I _think_ that =
it'll be ok, but the density of magic in the code is a bit overwhelming =
at this time of night =3D-)
>=20

Actually, what's probably happening is that the sim/periph lock is being =
dropped but the hold semaphore is held, disk_create() is called, which =
kicks off GEOM to do GEOM-ish things including the new g_access() call.  =
It tries to call cdopen(), which grabs the periph/sim mutex in order to =
then get the hold semaphore, and winds up sleeping because the semaphore =
is already held in cdregister().  If/when disk_create() returns, =
cdregister() winds up waiting for the periph lock because it's held by =
cdopen(), and now you have a deadlock between the two.

Again, this is my fault for being lazy and not re-organzing the periph =
drivers so that disk_create() is safely called without shady locking =
hacks.  I don't think that extending the coverage of the hold semaphore =
is going to help in this case.  The periphs need to adopt a new pattern =
where disk_create() isn't called until the periph is completely =
initialized and ready for periph_open() to be called without any locking =
gymnastics.

Scott