Skip site navigation (1)Skip section navigation (2)
Date:      Thu, 30 Nov 2023 17:14:21 -0800
From:      Adrian Chadd <adrian.chadd@gmail.com>
To:        Farhan Khan <farhan@farhan.codes>
Cc:        freebsd-wireless <freebsd-wireless@freebsd.org>
Subject:   Re: Why newstate handler runs IEEE80211_LOCK/UNLOCK?
Message-ID:  <CAJ-Vmo=CM7mZ%2Bid6wt6tUgxbRnA2_CeFx=L_aFHCR07CYv4-gQ@mail.gmail.com>
In-Reply-To: <fb0c074a-e821-4ff3-8b62-9b58a6abce95@app.fastmail.com>
References:  <629e3534-705a-4dcc-ad16-edba170da251@app.fastmail.com> <CAJ-VmonnRP0Gt5DwUw5wtiS3TygdsVS20k6HBHxwTa=d7EP2ow@mail.gmail.com> <fb0c074a-e821-4ff3-8b62-9b58a6abce95@app.fastmail.com>

next in thread | previous in thread | raw e-mail | index | archive | help
--000000000000d3c6fa060b687d04
Content-Type: text/plain; charset="UTF-8"

On Thu, 30 Nov 2023 at 08:10, Farhan Khan <farhan@farhan.codes> wrote:

> On Thu, Nov 30, 2023, at 1:24 AM, Adrian Chadd wrote:
> > On Wed, 29 Nov 2023 at 22:12, Farhan Khan <farhan@farhan.codes> wrote:
> >> Hi all,
> >>
> >> I'm studying the implementations of net80211 and noticed that in all
> newstate handlers the code begins by running IEEE80211_UNLOCK(ic) and ends
> with IEEE80211_LOCK(ic). I was not clear on why this was, I would have
> expected the opposite order. Also, why not just use the softc device-wide
> mutex over one for ieee80211com. Overall, I do not seem to understand the
> intent of the unlock and am seeking clarification.
> >
> > That part of the net80211 locking handling is ... unfortunately unfun.
> > Without doing that, there'd be lots of lock order inversion issues and
> > sleeping whilst lock held issues (since it's a mutex, not a sleep lock.)
> > The newstate code in net80211 at least (now?) runs in a taskqueue, so
> > whenever something changes state, it isn't happening in a random
> > drivers rx/tx/ioctl path. That way newstate transitions are at
> > hopefully serialised and not running in overlapping/concurrent threads.
>
> I'm still a little unclear here. Why does it inverted UNLOCK first?
> Wouldn't that mean the state /can/ change until still be a LOCK first? And,
> why not just do a softc-wide lock, why IEEE80211's lock function? But then
> there is also a driver softc lock, which confuses me. I'm also not
> understanding the double lock mechanism.
>
>
(bz@ please correct me if things have changed :-) )

The only thing that TECHNICALLY should be driving  state changes are calls
to ieee80211_newstate() which will eventually schedule a newstate taskqueue
item or a callout. I forget.
Thus all of the state changes are serialised by that single newstate. It's
one of tehe reasons why there's those occasional "state change lost"
messages - there's no queue of state changes. Just the state change call,
some currenat and new state, and then that callout.

 Changing the upcoming state is protected by the IEEE80211_LOCK mutex.

But, it does lock, vap->iv_newstate or whichever method it is, then lock
because you can't sleep whilst holding a mutex (eg what happens when you do
a firmware command send in the intel drivers that completes via an
interrupt / another kernel thread.) Ideally the net80211 lock would just be
held across the whole thing - and for the driver net80211 was written for -
if_ath - that would've been possible! But the moment sam brought in other
drivers for firmware chipsets, this model broke. Lots of drivers, wifi and
otherwise do this, especially in their receive packet path. It's quite
frankly a terrible code pattern. One that I've had to carefully use. :-)


>
> > However, since drivers do a /lot/ of potentially sleeping work in the
> > newstate path - think all the sleeping that goes on when things wait
> > for firmware commands to complete - you can't hold a mutex across those.
>
> This seems relevant but I did not understand. :/
>

You can't hold a mutex and sleep. You need to use a sleep lock (man sx, or
man sx_lock, I forget.)

If you're still having trouble understanding the freebsd locks, mutexes and
net epoch stuff then it may be worthwhile for us to post a quick overview
of it. ;)


-a

--000000000000d3c6fa060b687d04
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

<div dir=3D"ltr"><div dir=3D"ltr"><br></div><br><div class=3D"gmail_quote">=
<div dir=3D"ltr" class=3D"gmail_attr">On Thu, 30 Nov 2023 at 08:10, Farhan =
Khan &lt;farhan@farhan.codes&gt; wrote:<br></div><blockquote class=3D"gmail=
_quote" style=3D"margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204=
,204);padding-left:1ex">On Thu, Nov 30, 2023, at 1:24 AM, Adrian Chadd wrot=
e:<br>
&gt; On Wed, 29 Nov 2023 at 22:12, Farhan Khan &lt;farhan@farhan.codes&gt; =
wrote:<br>
&gt;&gt; Hi all,<br>
&gt;&gt; <br>
&gt;&gt; I&#39;m studying the implementations of net80211 and noticed that =
in all newstate handlers the code begins by running IEEE80211_UNLOCK(ic) an=
d ends with IEEE80211_LOCK(ic). I was not clear on why this was, I would ha=
ve expected the opposite order. Also, why not just use the softc device-wid=
e mutex over one for ieee80211com. Overall, I do not seem to understand the=
 intent of the unlock and am seeking clarification.<br>
&gt;<br>
&gt; That part of the net80211 locking handling is ... unfortunately unfun.=
 <br>
&gt; Without doing that, there&#39;d be lots of lock order inversion issues=
 and <br>
&gt; sleeping whilst lock held issues (since it&#39;s a mutex, not a sleep =
lock.)<br>
&gt; The newstate code in net80211 at least (now?) runs in a taskqueue, so =
<br>
&gt; whenever something changes state, it isn&#39;t happening in a random <=
br>
&gt; drivers rx/tx/ioctl path. That way newstate transitions are at <br>
&gt; hopefully serialised and not running in overlapping/concurrent threads=
.<br>
<br>
I&#39;m still a little unclear here. Why does it inverted UNLOCK first? Wou=
ldn&#39;t that mean the state /can/ change until still be a LOCK first? And=
, why not just do a softc-wide lock, why IEEE80211&#39;s lock function? But=
 then there is also a driver softc lock, which confuses me. I&#39;m also no=
t understanding the double lock mechanism.<br>
<br></blockquote><div><br></div><div>(bz@ please correct me if things have =
changed :-) )=C2=A0</div><div><br></div><div>The only thing that TECHNICALL=
Y should be driving=C2=A0 state changes are calls to ieee80211_newstate() w=
hich will eventually schedule a newstate taskqueue item or a callout. I for=
get.</div><div>Thus all of the state changes are serialised by that single =
newstate. It&#39;s one of tehe reasons why there&#39;s those occasional &qu=
ot;state change lost&quot; messages - there&#39;s no queue of state changes=
. Just the state change call, some currenat and new state, and then that ca=
llout.</div><div><br></div><div>=C2=A0Changing the upcoming state is protec=
ted by the IEEE80211_LOCK mutex.</div><div><br></div><div>But, it does lock=
, vap-&gt;iv_newstate or whichever method it is, then lock because you can&=
#39;t sleep whilst holding a mutex (eg what happens when you do a firmware =
command send in the intel drivers that completes via an interrupt / another=
 kernel thread.) Ideally the net80211 lock would just be held across the wh=
ole thing - and for the driver net80211 was written for - if_ath - that wou=
ld&#39;ve been possible! But the moment sam brought in other drivers for fi=
rmware chipsets, this model broke. Lots of drivers, wifi and otherwise do t=
his, especially in their receive packet path. It&#39;s quite frankly a terr=
ible code pattern. One that I&#39;ve had to carefully use. :-)</div><div><b=
r></div><div><br></div><blockquote class=3D"gmail_quote" style=3D"margin:0p=
x 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
&gt;<br>
&gt; However, since drivers do a /lot/ of potentially sleeping work in the =
<br>
&gt; newstate path - think all the sleeping that goes on when things wait <=
br>
&gt; for firmware commands to complete - you can&#39;t hold a mutex across =
those.<br>
<br>
This seems relevant but I did not understand. :/<br></blockquote><div><br><=
/div><div>You can&#39;t hold a mutex and sleep. You need to use a sleep loc=
k (man sx, or man sx_lock, I forget.)</div><div><br></div><div>If you&#39;r=
e still having trouble understanding the freebsd locks, mutexes and net epo=
ch stuff then it may be worthwhile for us to post a quick overview of it. ;=
)</div><div><br></div><div><br></div><div>-a</div><div><br></div></div></di=
v>

--000000000000d3c6fa060b687d04--



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CAJ-Vmo=CM7mZ%2Bid6wt6tUgxbRnA2_CeFx=L_aFHCR07CYv4-gQ>