Date: Tue, 30 Oct 2012 10:39:54 -0700
From: Adrian Chadd <adrian@freebsd.org>
To: freebsd-wireless@freebsd.org
Subject: Re: updates: net80211/ath now do AP power save "better" (except ps-poll); I broke performance
Message-ID: <CAJ-Vmo=1=vH%2B%2BxRS-YtYLPtE%2BHBca46512gw3ct=vUyqNuE6nw@mail.gmail.com>
In-Reply-To: <CAJ-VmokOVNx3hL8Za=_EUu3QRmheDp3BMxxTYtaZFESgiKjeHw@mail.gmail.com>
References: <CAJ-VmokOVNx3hL8Za=_EUu3QRmheDp3BMxxTYtaZFESgiKjeHw@mail.gmail.com>

On 28 October 2012 20:48, Adrian Chadd <adrian@freebsd.org> wrote:

> Now for the problem: I broke throughput. Instead of getting
> 150Mbit/sec TCP, I now get ~ 100MBit/sec TCP. The culprit is almost
> exclusively going to be the TX serialisation. Now, that's easily
> tested and I'll do that tomorrow - I can just undo the TX
> serialisation and make ath_start() direct dispatch. If this _isn't_
> the case, I'll have to spend whatever time needed to figure it out.
> But if it is the case, I'll need to figure out how to serialise TX
> without that performance drop. So, if you do update to -HEAD, please
> keep that in mind.

Yes, it's the TX taskqueue changes. If I go back to direct dispatch, I
actually now get 150mbit -> 170mbit TCP throughput. Yes, it's that
crazy.

So at least for the station-side running iperf, here's the problem:

* iperf queues some traffic;
* it waits for an ACK before it can queue more;
* ath(4) RX interrupt occurs;
* RX tasklet gets scheduled - and if nothing else is going on, it runs;
* So.. RX frames are handled, and I guess one is an ACK, as it wakes
  up iperf - so whilst the RX taskqueue is running, the scheduler
  switches to iperf (maybe it hits a lock that needs waking up? Not
  sure.)
* The iperf thread sends more data - so a whole lot of ath_start() is
  called, which in the TX taskqueue implementation just schedules the
  TX tasklet to run;
* .. but as the RX tasklet is running - it can't run.
* Then the rest of the RX tasklet runs to completion;
* Once that's done, TX occurs.

That extra latency is costing like, half the performance.

Now, the new if_bridge code is likely making that worse in the
bridging path - since it's now direct-dispatching from wifi to the
arge interface, if arge blocks at all, RX is going to stall; which
means any traffic the other way is going to have to wait. Whereas
before it'd just populate the bridge ifnet->if_snd queue.

It's very likely more complicated than that, but I can totally see
that happening.

So!
Given this is the behaviour of the IP/TCP stack, I'm not entirely sure
what to do next.

* I can go back to direct-dispatch, but that introduces lots of
  synchronisation issues again, especially with preemption and SMP.
* I could go to "only schedule ath_start() if it's not already
  running" but there's a small race window with that - specifically,
  if you do this:

    ath_start()
      if (! running) {
        set run=1;
        run;
        set run=0;
      }

  That seems innocuous - but if you have a second thread that gets
  run just as set run=0 is about to be run, the second thread won't
  run - but the first thread will terminate. So it's not _that_ easy
  to do.
* I could just go the Linux path and create a wifi RX and TX lock -
  and hold that across any RX or TX respectively.

Thoughts are welcome. :-)

Thanks,


Adrian
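
The race window in the "only run if not already running" option can be
made concrete with a small userland sketch. Nothing here is ath(4) or
net80211 code: struct txq_sketch, start_naive(), start_recheck() and
late_enqueue() are hypothetical names, C11 atomics stand in for
whatever the kernel would use, and the hook argument exists only to
replay the bad interleaving deterministically in one thread. The
re-check loop in start_recheck() is one common way to close such a
window, shown as an assumption rather than as the fix the driver
adopted.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a driver TX queue. */
struct txq_sketch {
    atomic_bool running;  /* a thread is inside the drain section */
    atomic_int  pending;  /* frames queued but not yet sent */
    int         sent;     /* only touched while holding 'running' */
};

typedef void (*hook_fn)(struct txq_sketch *);

static void txq_init(struct txq_sketch *q) {
    atomic_init(&q->running, false);
    atomic_init(&q->pending, 0);
    q->sent = 0;
}

/* Send everything currently queued; caller must own 'running'. */
static void drain(struct txq_sketch *q) {
    assert(atomic_load(&q->running));
    while (atomic_load(&q->pending) > 0) {
        atomic_fetch_sub(&q->pending, 1);
        q->sent++;
    }
}

/* The guard from the message: if (!running) { run=1; run; run=0; } */
static void start_naive(struct txq_sketch *q, hook_fn hook) {
    bool expected = false;
    if (!atomic_compare_exchange_strong(&q->running, &expected, true))
        return;             /* someone else is running; give up */
    drain(q);
    if (hook != NULL)
        hook(q);            /* race window: drained, flag still set */
    atomic_store(&q->running, false);
}

/* Same guard, but re-check for work after dropping the flag. */
static void start_recheck(struct txq_sketch *q, hook_fn hook) {
    for (;;) {
        bool expected = false;
        if (!atomic_compare_exchange_strong(&q->running, &expected, true))
            return;
        drain(q);
        if (hook != NULL) {
            hook(q);        /* replay the same window, once */
            hook = NULL;
        }
        atomic_store(&q->running, false);
        if (atomic_load(&q->pending) == 0)
            return;         /* nothing slipped in; done */
        /* Work arrived in the window: loop and try to own it again. */
    }
}

/* Simulated second thread: queue one frame, then try to start TX.
 * The guard rejects it because 'running' is still set. */
static void late_enqueue(struct txq_sketch *q) {
    atomic_fetch_add(&q->pending, 1);
    start_naive(q, NULL);
}
```

Calling start_naive(&q, late_enqueue) with two frames queued leaves the
late frame stranded in pending with nobody scheduled to send it, which
is the "second thread won't run, first thread terminates" case.
start_recheck() picks that frame up on its second pass. The cost of the
re-check is that one thread can end up doing TX work on behalf of
others for an unbounded time, which matters if that thread is the RX
path.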