Date: Thu, 13 Nov 2014 20:40:11 +0800
From: Julian Elischer <julian@freebsd.org>
To: Adrian Chadd <adrian@freebsd.org>, Alexander Kabaev <kabaev@gmail.com>
Cc: "freebsd-arch@freebsd.org" <freebsd-arch@freebsd.org>
Subject: Re: Questions about locking; turnstiles and sleeping threads
Message-ID: <5464A6AB.5060006@freebsd.org>
In-Reply-To: <CAJ-Vmok-8znyycyOBS_ZQU275zFy%2BzuZ2C-jt4N3DnuEVS=PWg@mail.gmail.com>
References: <CAJ-VmomrauhCMoF_dZfMWWhZp0EgwfE9RmxL5Pc37PhLSzZ6Qg@mail.gmail.com> <20141112212613.21037929@kan> <CAJ-Vmok-8znyycyOBS_ZQU275zFy%2BzuZ2C-jt4N3DnuEVS=PWg@mail.gmail.com>

On 11/13/14, 11:39 AM, Adrian Chadd wrote:
> On 12 November 2014 18:26, Alexander Kabaev <kabaev@gmail.com> wrote:
>> On Wed, 12 Nov 2014 18:13:55 -0800
>> Adrian Chadd <adrian@freebsd.org> wrote:
>>
>>> Hi,
>>>
>>> I have a bit of an odd case here.
>>>
>>> I'm getting panics in the net80211/ath code, "sleeping thread (X) owns
>>> non-sleepable lock."
>>>
>>> show alllocks just showed one lock held - the net80211 comlock. It's a
>>> recursive mutex, that's supposed to be sleepable.
>>>
>>> The two threads in question look like this:
>>>
>>> thread X: net80211_newstate_cb (grabs IEEE80211_LOCK())
>>> ath_newstate
>>> callout_drain - which grabs the ATH_LOCK as part of the callout
>>> drain side of things
>>> that enters sleepq_wait() and goes to sleep, waiting for
>>> whatever's running the callout to
>>> finish
>>>
>>> thread Y:
>>> rx_path in if_ath_rx_edma
>>> ath_rx_pkt -> sta_input -> ath_recv_mgmt -> sta_recv_mgmt (grabs
>>> IEEE80211_LOCK()) -> panics
>>>
>>> Thread Y doesn't hold any other locks. It's just trying to grab the
>>> IEEE80211_LOCK that is being held by thread X. But thread X is asleep
>>> waiting for whatever callout to finish so it can continue. The code in
>>> propagate_priority() sees that thread X is sleeping and panics.
>>>
>>> So, what's really going on? I don't mind (well, "don't mind") having
>>> to take another deep dive through all of this to sort it out so it
>>> doesn't tickle the callout / turnstile code in this particular
>>> fashion, but I'd first like to ensure that it's not some corner case
>>> that isn't handled by the check in propagate_priority().
>>>
>>> Thanks,
>>>
>>>
>>> -adrian
>>> _______________________________________________
>> Hi,
>>
>> mutexes are blocking and not sleepable primitives, so doing any
>> unbounded sleep with mutex locked, such as one you are attempting by
>> calling callout_drain is illegal. In other words, you are getting an
>> expected assert and the code in question is wrong.
> Hi,
>
> Right. That isn't mentioned in the manpage. The manpage says:
>
> The function callout_drain() is identical to callout_stop() except that
> it will wait for the callout to be completed if it is already in
> progress. This function MUST NOT be called while holding any locks on
> which the callout might block, or deadlock will result. Note that if the
> callout subsystem has already begun processing this callout, then the
> callout function may be invoked during the execution of callout_drain().
> However, the callout subsystem does guarantee that the callout will be
> fully stopped before callout_drain() returns.

also look at 'man locking'

>
> The callout isn't going to block here, but another thread may block.
>
> This is good to know. I'll see if I can come up with an addition to
> the manpage about this.
>
> I'm going to have to do another pass over all of the wifi drivers and
> stack to see where this is happening. Ugh. :(
>
> Thanks!
>
>
>
> -adrian
> _______________________________________________
> freebsd-arch@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-arch
> To unsubscribe, send any mail to "freebsd-arch-unsubscribe@freebsd.org"
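
For readers following the thread, the rule Alexander describes (no unbounded sleep while a mutex is held) can be sketched as a small userland model. This is only an illustration under stated assumptions: every name below (sim_thread, sim_callout, sim_mtx_lock, sim_mtx_unlock, sim_callout_drain, and the two newstate_* functions) is hypothetical and is not a FreeBSD kernel API, and the model flags the sleep itself, whereas in the thread the panic was reported later, when thread Y blocked on the mutex and propagate_priority() found its owner asleep.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userland model; none of these names are FreeBSD APIs. */
struct sim_thread {
	int mutexes_held;	/* non-sleepable locks this thread owns */
};

struct sim_callout {
	bool running;		/* handler is executing on another CPU */
};

static void
sim_mtx_lock(struct sim_thread *td)
{
	td->mutexes_held++;
}

static void
sim_mtx_unlock(struct sim_thread *td)
{
	assert(td->mutexes_held > 0);
	td->mutexes_held--;
}

/*
 * Models a draining stop: if the handler is running, the caller has to
 * sleep until it finishes. Returns -1 where the caller would go to sleep
 * while owning a mutex, 0 otherwise.
 */
static int
sim_callout_drain(struct sim_thread *td, struct sim_callout *c)
{
	if (c->running) {
		if (td->mutexes_held > 0)
			return (-1);	/* sleeping thread owns a mutex */
		c->running = false;	/* pretend we slept until it finished */
	}
	return (0);
}

/* Shape of the reported bug: drain while the mutex is still held. */
int
newstate_sleeps_with_lock(struct sim_thread *td, struct sim_callout *c)
{
	int error;

	sim_mtx_lock(td);
	error = sim_callout_drain(td, c);
	sim_mtx_unlock(td);
	return (error);
}

/* One possible restructuring: release the mutex around the drain. */
int
newstate_drops_lock(struct sim_thread *td, struct sim_callout *c)
{
	int error;

	sim_mtx_lock(td);
	/* ... work that needs the lock ... */
	sim_mtx_unlock(td);

	error = sim_callout_drain(td, c);	/* may sleep; no mutex held */

	sim_mtx_lock(td);
	/* ... re-check any state that may have changed while unlocked ... */
	sim_mtx_unlock(td);
	return (error);
}
```

With a callout whose handler is marked as running, newstate_sleeps_with_lock() returns -1 and newstate_drops_lock() returns 0. Whether dropping the lock is actually safe in a real driver depends on what else can change while it is released, which the model does not capture.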