From owner-freebsd-arch@FreeBSD.ORG Thu Nov 13 12:40:31 2014
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
	(using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
	(No client certificate requested)
	by hub.freebsd.org (Postfix) with ESMTPS id A34D8BB5;
	Thu, 13 Nov 2014 12:40:31 +0000 (UTC)
Received: from vps1.elischer.org (vps1.elischer.org [204.109.63.16])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client CN "vps1.elischer.org", Issuer "CA Cert Signing Authority" (not verified))
	by mx1.freebsd.org (Postfix) with ESMTPS id 810E9383;
	Thu, 13 Nov 2014 12:40:30 +0000 (UTC)
Received: from Julian-MBP3.local (50-196-156-133-static.hfc.comcastbusiness.net [50.196.156.133])
	(authenticated bits=0)
	by vps1.elischer.org (8.14.9/8.14.9) with ESMTP id sADCeI3h065602
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES128-SHA bits=128 verify=NO);
	Thu, 13 Nov 2014 04:40:22 -0800 (PST)
	(envelope-from julian@freebsd.org)
Message-ID: <5464A6AB.5060006@freebsd.org>
Date: Thu, 13 Nov 2014 20:40:11 +0800
From: Julian Elischer
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:24.0) Gecko/20100101 Thunderbird/24.6.0
MIME-Version: 1.0
To: Adrian Chadd, Alexander Kabaev
Subject: Re: Questions about locking; turnstiles and sleeping threads
References: <20141112212613.21037929@kan>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: "freebsd-arch@freebsd.org"
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.18-1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
X-List-Received-Date: Thu, 13 Nov 2014 12:40:31 -0000

On 11/13/14, 11:39 AM, Adrian Chadd wrote:
> On 12 November 2014 18:26, Alexander Kabaev wrote:
>> On Wed, 12 Nov 2014 18:13:55 -0800
>> Adrian Chadd wrote:
>>
>>> Hi,
>>>
>>> I have a bit of an odd case here.
>>>
>>> I'm getting panics in the net80211/ath code, "sleeping thread (X) owns
>>> non-sleepable lock."
>>>
>>> show alllocks just showed one lock held - the net80211 comlock. It's a
>>> recursive mutex, that's supposed to be sleepable.
>>>
>>> The two threads in question look like this:
>>>
>>> thread X: net80211_newstate_cb (grabs IEEE80211_LOCK())
>>>    ath_newstate
>>>      callout_drain - which grabs the ATH_LOCK as part of the callout
>>>        drain side of things
>>>        that enters sleepq_wait() and goes to sleep, waiting for
>>>        whatever's running the callout to finish
>>>
>>> thread Y:
>>>    rx_path in if_ath_rx_edma
>>>      ath_rx_pkt -> sta_input -> ath_recv_mgmt -> sta_recv_mgmt (grabs
>>>        IEEE80211_LOCK()) -> panics
>>>
>>> Thread Y doesn't hold any other locks. It's just trying to grab the
>>> IEEE80211_LOCK that is being held by thread X. But thread X is asleep
>>> waiting for whatever callout to finish so it can continue. The code in
>>> propagate_priority() sees that thread X is sleeping and panics.
>>>
>>> So, what's really going on? I don't mind (well, "don't mind") having
>>> to take another deep dive through all of this to sort it out so it
>>> doesn't tickle the callout / turnstile code in this particular
>>> fashion, but I'd first like to ensure that it's not some corner case
>>> that isn't handled by the check in propagate_priority().
>>>
>>> Thanks,
>>>
>>>
>>> -adrian
>>> _______________________________________________
>> Hi,
>>
>> mutexes are blocking and not sleepable primitives, so doing any
>> unbounded sleep with mutex locked, such as one you are attempting by
>> calling callout_drain is illegal. In other words, you are getting an
>> expected assert and the code in question is wrong.
> Hi,
>
> Right. That isn't mentioned in the manpage. The manpage says:
>
>      The function callout_drain() is identical to callout_stop() except that
>      it will wait for the callout to be completed if it is already in
>      progress.
>      This function MUST NOT be called while holding any locks on
>      which the callout might block, or deadlock will result. Note that if the
>      callout subsystem has already begun processing this callout, then the
>      callout function may be invoked during the execution of callout_drain().
>      However, the callout subsystem does guarantee that the callout will be
>      fully stopped before callout_drain() returns.

also look at 'man locking'

>
> The callout isn't going to block here, but another thread may block.
>
> This is good to know. I'll see if I can come up with an addition to
> the manpage about this.
>
> I'm going to have to do another pass over all of the wifi drivers and
> stack to see where this is happening. Ugh. :(
>
> Thanks!
>
>
>
> -adrian
> _______________________________________________
> freebsd-arch@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-arch
> To unsubscribe, send any mail to "freebsd-arch-unsubscribe@freebsd.org"
>
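
The rule quoted above (no unbounded sleep while a mutex is held) can be
sketched in userland with pthreads. This is only an analogy and an
assumption on my part, not kernel code: com_lock, timer_handler(), drain()
and newstate_drain_unlocked() are hypothetical names, pthread_join() stands
in for callout_drain(), and the handler takes the same lock as the caller,
which is the deadlock case the manpage describes. Userland has no turnstile
check, so the wrong ordering would show up as a hang rather than as the
"sleeping thread owns non-sleepable lock" panic.

```c
#include <assert.h>
#include <pthread.h>

/*
 * Userland analogy only; none of these names are FreeBSD kernel APIs.
 * com_lock stands in for a mutex such as IEEE80211_LOCK, timer_handler()
 * for a callout handler, and drain() for callout_drain().
 */
static pthread_mutex_t com_lock = PTHREAD_MUTEX_INITIALIZER;
static int ticks_seen;

/* The "callout" body: it needs com_lock to update shared state. */
static void *
timer_handler(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&com_lock);
	ticks_seen++;
	pthread_mutex_unlock(&com_lock);
	return (NULL);
}

/* Wait, for an unbounded time, until the handler has finished. */
static void
drain(pthread_t handler)
{
	int error = pthread_join(handler, NULL);

	assert(error == 0);
	(void)error;
}

/*
 * State change that has to wait for the handler.  The mutex is released
 * before the wait and re-taken afterwards, so nobody ever sleeps while
 * owning it.  Returns the number of handler runs observed, or -1 if the
 * handler thread could not be started.
 */
int
newstate_drain_unlocked(void)
{
	pthread_t handler;
	int seen;

	pthread_mutex_lock(&com_lock);
	ticks_seen = 0;
	if (pthread_create(&handler, NULL, timer_handler, NULL) != 0) {
		pthread_mutex_unlock(&com_lock);
		return (-1);
	}
	pthread_mutex_unlock(&com_lock);	/* drop the lock first ... */
	drain(handler);				/* ... then do the unbounded wait */

	pthread_mutex_lock(&com_lock);		/* re-take it and re-read state */
	seen = ticks_seen;
	pthread_mutex_unlock(&com_lock);
	return (seen);
}
```

Calling drain() before the pthread_mutex_unlock() in this sketch would hang,
because the handler can never get com_lock. Whether dropping the lock around
the drain is safe in a real driver depends on what else can run in that
window, so any state read before the unlock has to be rechecked afterwards.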