Date: Tue, 30 Oct 2012 10:39:54 -0700
From: Adrian Chadd <adrian@freebsd.org>
To: freebsd-wireless@freebsd.org
Subject: Re: updates: net80211/ath now do AP power save "better" (except ps-poll); I broke performance
Message-ID: <CAJ-Vmo=1=vH%2B%2BxRS-YtYLPtE%2BHBca46512gw3ct=vUyqNuE6nw@mail.gmail.com>
In-Reply-To: <CAJ-VmokOVNx3hL8Za=_EUu3QRmheDp3BMxxTYtaZFESgiKjeHw@mail.gmail.com>
References: <CAJ-VmokOVNx3hL8Za=_EUu3QRmheDp3BMxxTYtaZFESgiKjeHw@mail.gmail.com>
On 28 October 2012 20:48, Adrian Chadd <adrian@freebsd.org> wrote:
> Now for the problem: I broke throughput. Instead of getting
> 150Mbit/sec TCP, I now get ~ 100MBit/sec TCP. The culprit is almost
> exclusively going to be the TX serialisation. Now, that's easily
> tested and I'll do that tomorrow - I can just undo the TX
> serialisation and make ath_start() direct dispatch. If this _isn't_
> the case, I'll have to spend whatever time needed to figure it out.
> But if it is the case, I'll need to figure out how to serialise TX
> without that performance drop. So, if you do update to -HEAD, please
> keep that in mind.
Yes, it's the TX taskqueue changes. If I go back to direct dispatch, I
actually now get 150mbit -> 170mbit TCP throughput. Yes, it's that
crazy.
So at least for the station-side running iperf, here's the problem:
* iperf queues some traffic;
* it waits for an ACK before it can queue more;
* ath(4) RX interrupt occurs;
* RX tasklet gets scheduled - and if nothing else is going on, it runs;
* So.. RX frames are handled, and I guess one is an ACK, as it wakes
up iperf - so whilst the RX taskqueue is running, the scheduler
switches to iperf (maybe it hits a lock that needs waking up? Not
sure.)
* The iperf thread sends more data - so a whole lot of ath_start() is
called, which in the TX taskqueue implementation just schedules the TX
tasklet to run;
* .. but as the RX tasklet is still running, the TX tasklet can't run.
* Then the rest of the RX tasklet runs to completion;
* Once that's done, TX occurs.
That extra latency is costing a big chunk of the performance
(~100mbit instead of 150-170mbit).
Now, the new if_bridge code is likely making that worse in the
bridging path - since it's now direct-dispatching from wifi to the
arge interface, if arge blocks at all, RX is going to stall; which
means any traffic the other way is going to have to wait. Whereas
before it'd just populate the bridge ifnet->if_snd queue. It's very
likely more complicated than that, but I can totally see that
happening.
So! Given this is the behaviour of the IP/TCP stack, I'm not entirely
sure what to do next.
* I can go back to direct-dispatch, but that introduces lots of
synchronisation issues again, especially with preemption and SMP.
* I could go to "only schedule ath_start() if it's not already
running" but there's a small race window with that - specifically, if
you do this:
    ath_start()
      if (! running) {
        set running=1;
        run;
        set running=0;
      }

That seems innocuous - but if a second thread calls ath_start() just
as the first thread is about to set running=0, the second thread sees
running=1 and skips the work - and the first thread then terminates
without picking up whatever the second thread queued. So it's not
_that_ easy to do.
* I could just go the linux path and create a wifi RX and TX lock -
and hold that across any RX or TX respectively.
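
To make the "only run ath_start() if it's not already running" option a bit more concrete, here is a minimal userland sketch in C11 of one common way to close that race window: after clearing the flag, the draining thread re-checks the queue and loops if work showed up in the meantime. This is only an illustration under stated assumptions, not ath(4) code. All names (tx_queued, tx_sent, tx_running, tx_drain, tx_start, tx_enqueue) are hypothetical, the "queue" is just a counter, and it relies on the default sequentially consistent atomics.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for driver state; none of these are ath(4) names. */
static atomic_int  tx_queued;   /* frames waiting to be sent */
static atomic_int  tx_sent;     /* frames handed to the "hardware" */
static atomic_bool tx_running;  /* true while one thread is draining */

/* Send everything currently queued. Only called by the flag owner. */
static void tx_drain(void)
{
    int n;

    while ((n = atomic_load(&tx_queued)) > 0) {
        /* Enqueuers may bump the counter concurrently, so use CAS. */
        if (atomic_compare_exchange_weak(&tx_queued, &n, n - 1))
            atomic_fetch_add(&tx_sent, 1);
    }
    assert(n == 0);  /* only the drainer decrements, so never negative */
}

/* "Run unless already running", with a re-check to close the race. */
void tx_start(void)
{
    for (;;) {
        bool expected = false;

        /* Somebody else owns the flag: they will re-check the queue. */
        if (!atomic_compare_exchange_strong(&tx_running, &expected, true))
            return;

        tx_drain();
        atomic_store(&tx_running, false);

        /*
         * A frame may have been queued after tx_drain() saw an empty
         * queue but before the flag was cleared; its sender skipped
         * the work, so pick it up here instead of terminating.
         */
        if (atomic_load(&tx_queued) == 0)
            return;
    }
}

/* Queue one frame and kick transmission, as a sender would. */
void tx_enqueue(void)
{
    atomic_fetch_add(&tx_queued, 1);
    tx_start();
}
```

The key ordering is that a sender always queues the frame before testing the flag, and the drainer always clears the flag before re-testing the queue, so at least one of the two sees the other's update. Whether something like this is cheaper in practice than a TX lock held across the send path would need measuring.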
Thoughts are welcome. :-)
Thanks,
Adrian
