Date: Fri, 22 Feb 2013 17:18:44 -0800
From: Adrian Chadd <adrian@freebsd.org>
To: freebsd-wireless@freebsd.org
Subject: Re: [RFT] net80211 TX serialisation, take #4
Message-ID: <CAJ-Vmom8HNLNc-=CogiF1v2pJHcn73rB0w9EOHoBtTjAp=jReA@mail.gmail.com>
In-Reply-To: <CAJ-VmokpuxUtB6Zeuq8ffyQn3DVAsjEpnQaEVZ-fmeB8Dai8HQ@mail.gmail.com>
References: <CAJ-VmokpuxUtB6Zeuq8ffyQn3DVAsjEpnQaEVZ-fmeB8Dai8HQ@mail.gmail.com>

On 22 February 2013 15:25, Adrian Chadd <adrian@freebsd.org> wrote:
> Hi,
>
> Here's take four of the TX serialisation.
>
> http://people.freebsd.org/~adrian/ath/20130223-net80211-tx-lock-4.diff
>
> This patch increases the lock "reach" so it locks the encap path for
> both data and management frames, so the path between sequence number
> allocation and driver queuing is held.
>
> There are some drivers that directly access ni_txseqs[] (ie, iwn and
> ath) and I'll have to think about this a little more. Sometimes it'll
> be called with the VAP TX lock held (ie, if it's called from driver
> if_transmit / driver if_start / ic_raw_xmit) but sometimes the TX path
> is called from the addba response callback, the TX completion methods,
> a software frame taskqueue. None of these grab the VAP TX lock at any
> point.
>
> I'd like to avoid making the VAP TX lock a reentrant lock (ew.)

Well, it turns out that this is almost-but-not-really-right.

The problem is this:

* A frame is going out on VAP A;
* so the VAP TX lock is held for VAP A;
* then the driver if_transmit() method is called, which will (for now)
  map to enqueue and call driver if_start();
* now, if_start() will dequeue the top frame, which may be for VAP B.

This is fine so far, as the VAP lock is intended to serialise stuff
through to the driver transmission phase. It's not designed to
serialise _between_ VAPs.

However, then we have a little hiccup. iwn and ath use the ni_txseqs[]
space for TX sequence numbers when transmitting aggregate frames. I
guess mwl does sequence number allocation in-firmware.

So to be correct, I should grab the VAP lock when the driver transmit
code wants to assign sequence numbers.

However, I can't do that:

(a) It may already have it held from the net80211 call;
(b) it may NOT already have it held (eg from a deferred call to the
    driver start method - eg, if the driver calls if_start after TX
    completion has occurred, or upon driver reset to start TX again);
(c) it may have a VAP lock held from a _different_ VAP, because of the
    conditions above.

So, this is all pretty terrible.

The only sane solution for now is to make my VAP TX lock an IC TX
lock, and grab said IC TX lock for all VAPs. That way the driver can
grab the IC TX lock when it's doing deferred sends and it'll be sure
the lock is held when it decides to grab/increment sequence numbers
from ni->ni_txseqs[].

This is all pretty terrible. Honestly, what I really need is a way to do this:

* serialise frame TX handling in net80211 so the sequence of frames
  handed to the driver _matches exactly_ the sequence going through the
  VAP TX code, the VAP management TX code, etc.;
* call the driver without the VAP or IC TX lock held - at this point,
  the frames should be in the driver queue _in the order_ they were
  processed via the VAP TX and management / raw TX path;
* the driver can grab the VAP / IC TX lock if it needs to allocate
  sequence numbers itself;
* the driver is then responsible for ensuring that frames are processed
  to software/hardware queuing in the same order they were received
  from net80211 (so CCMP IV sequence assignment is 'correct' too);
* .. and do this without tearing my hair out.
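
To make the IC-wide lock idea concrete, here is a rough userland sketch
in C using pthreads. It is not the patch and not kernel code: the
struct names, the lock helpers, drv_alloc_seq(), stack_tx() and
deferred_start() are all made up for illustration, and a real driver
would use the kernel's own mutex and assertion primitives. The only
point is that the stack-initiated path and the deferred path take the
same per-device lock before touching the per-node sequence counters.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical userland stand-ins; NOT the real net80211 structures. */
#define SKETCH_NTID 16

struct ic_sketch {
	pthread_mutex_t tx_lock;	/* one TX lock shared by every VAP */
	int tx_lock_held;		/* debug flag, written only under tx_lock */
};

struct node_sketch {
	uint16_t txseqs[SKETCH_NTID];	/* plays the role of ni_txseqs[] */
};

void
ic_sketch_init(struct ic_sketch *ic)
{
	pthread_mutex_init(&ic->tx_lock, NULL);
	ic->tx_lock_held = 0;
}

static void
ic_tx_lock(struct ic_sketch *ic)
{
	pthread_mutex_lock(&ic->tx_lock);
	ic->tx_lock_held = 1;
}

static void
ic_tx_unlock(struct ic_sketch *ic)
{
	ic->tx_lock_held = 0;
	pthread_mutex_unlock(&ic->tx_lock);
}

/*
 * Driver-side sequence allocation. The caller must hold the IC TX lock,
 * whichever VAP the frame belongs to. The assert is a crude stand-in
 * for a kernel lock assertion.
 */
static uint16_t
drv_alloc_seq(struct ic_sketch *ic, struct node_sketch *ni, int tid)
{
	uint16_t seq;

	assert(ic->tx_lock_held);
	seq = ni->txseqs[tid];
	ni->txseqs[tid] = (seq + 1) & 0x0fff;	/* 12-bit 802.11 sequence space */
	return (seq);
}

/* Path 1: transmit initiated by the stack, for any VAP on this device. */
uint16_t
stack_tx(struct ic_sketch *ic, struct node_sketch *ni, int tid)
{
	uint16_t seq;

	ic_tx_lock(ic);
	seq = drv_alloc_seq(ic, ni, tid);
	ic_tx_unlock(ic);
	return (seq);
}

/* Path 2: deferred start, e.g. after TX completion or a driver reset. */
uint16_t
deferred_start(struct ic_sketch *ic, struct node_sketch *ni, int tid)
{
	uint16_t seq;

	ic_tx_lock(ic);		/* same lock as path 1, so no VAP mismatch */
	seq = drv_alloc_seq(ic, ni, tid);
	ic_tx_unlock(ic);
	return (seq);
}
```

Because both entry points take the same non-recursive lock, cases (a),
(b) and (c) above collapse into one rule: whoever ends up in the
sequence allocation code holds the IC TX lock. The sketch says nothing
about ordering between queuing and driver processing, which is the
harder problem.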

So the short term fix is:

* convert my VAP TX Lock to an IC TX lock;
* for the drivers that touch the ni_txseqs[] values (ath, iwn at least)
  - have them grab the IC TX lock before any deferred
  if_start/if_transmit call - that way the driver TX path holds the
  same locks no matter whether it came from the net80211 stack or a
  deferred start;
* chase down and eliminate any and all of the rest of the subtle
  "packets out of sequence" crap that occurs when you're doing high
  throughput 802.11n with CCMP encryption;
* .. think about how to properly separate out the queuing from the
  driver processing.

I'm open to suggestions here.


Adrian