Date:      Tue, 30 Oct 2012 10:39:54 -0700
From:      Adrian Chadd <adrian@freebsd.org>
To:        freebsd-wireless@freebsd.org
Subject:   Re: updates: net80211/ath now do AP power save "better" (except ps-poll); I broke performance
Message-ID:  <CAJ-Vmo=1=vH%2B%2BxRS-YtYLPtE%2BHBca46512gw3ct=vUyqNuE6nw@mail.gmail.com>
In-Reply-To: <CAJ-VmokOVNx3hL8Za=_EUu3QRmheDp3BMxxTYtaZFESgiKjeHw@mail.gmail.com>
References:  <CAJ-VmokOVNx3hL8Za=_EUu3QRmheDp3BMxxTYtaZFESgiKjeHw@mail.gmail.com>

On 28 October 2012 20:48, Adrian Chadd <adrian@freebsd.org> wrote:

> Now for the problem: I broke throughput. Instead of getting
> 150Mbit/sec TCP, I now get ~ 100MBit/sec TCP. The culprit is almost
> exclusively going to be the TX serialisation. Now, that's easily
> tested and I'll do that tomorrow - I can just undo the TX
> serialisation and make ath_start() direct dispatch. If this _isn't_
> the case, I'll have to spend whatever time needed to figure it out.
> But if it is the case, I'll need to figure out how to serialise TX
> without that performance drop. So, if you do update to -HEAD, please
> keep that in mind.

Yes, it's the TX taskqueue changes. If I go back to direct dispatch, I
actually now get 150mbit -> 170mbit TCP throughput. Yes, it's that
crazy.

So at least for the station-side running iperf, here's the problem:

* iperf queues some traffic;
* it waits for an ACK before it can queue more;
* ath(4) RX interrupt occurs;
* RX tasklet gets scheduled - and if nothing else is going on, it runs;
* So.. RX frames are handled, and I guess one is an ACK, as it wakes
up iperf - so whilst the RX taskqueue is running, the scheduler
switches to iperf (maybe it hits a lock that needs waking up? Not
sure.)
* The iperf thread sends more data - so a whole lot of ath_start() is
called, which in the TX taskqueue implementation just schedules the TX
tasklet to run;
* .. but as the RX tasklet is still running, the TX tasklet can't run yet.
* Then the rest of the RX tasklet runs to completion;
* Once that's done, TX occurs.

That extra latency is costing roughly a third of the throughput
(~100mbit instead of 150mbit).

Now, the new if_bridge code is likely making that worse in the
bridging path - since it's now direct-dispatching from wifi to the
arge interface, if arge blocks at all, RX is going to stall; which
means any traffic the other way is going to have to wait. Whereas
before it'd just populate the bridge ifnet->if_snd queue. It's very
likely more complicated than that, but I can totally see that
happening.

So! Given this is the behaviour of the IP/TCP stack, I'm not entirely
sure what to do next.

* I can go back to direct-dispatch, but that introduces lots of
synchronisation issues again, especially with preemption and SMP.
* I could go to "only schedule ath_start() if it's not already
running" but there's a small race window with that - specifically, if
you do this:

ath_start()
  if (! running) {
    set run=1;
    run;
    set run=0;
  }

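That seems innocuous - but if a second thread calls ath_start() just
as the first thread is about to do set run=0, the second thread sees
run=1 and skips the work, and the first thread then finishes without
noticing the newly queued frames. Those frames sit there until
something else calls ath_start(). So it's not _that_ easy to do.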

* I could just go the linux path and create a wifi RX and TX lock -
and hold that across any RX or TX respectively.
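
For the "only schedule ath_start() if it's not already running" option,
one possible way to close that race window is to have every caller record
that work is pending before it checks the running flag, and have the
running thread re-check that marker after it clears the flag. The
following is only a userland sketch with C11 atomics, not driver code:
struct tx_gate, tx_drain() and tx_start() are hypothetical names, and
tx_drain() just counts calls instead of touching any hardware queue.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical state for illustration; not actual ath(4) softc fields. */
struct tx_gate {
	atomic_bool running;	/* some thread is inside the TX path */
	atomic_bool kick;	/* work was queued and not yet picked up */
	int drained;		/* demo counter: how often we drained */
};

/* Stand-in for pushing queued frames to the hardware. */
static void
tx_drain(struct tx_gate *g)
{
	/* Only the thread that owns 'running' may get here. */
	assert(atomic_load(&g->running));
	g->drained++;
}

static void
tx_start(struct tx_gate *g)
{
	/* Record the request first, so it cannot be lost. */
	atomic_store(&g->kick, true);

	for (;;) {
		bool expected = false;

		/* Someone else is running; they will see our kick. */
		if (!atomic_compare_exchange_strong(&g->running,
		    &expected, true))
			return;

		/* We are the runner: drain until no kick is pending. */
		while (atomic_exchange(&g->kick, false))
			tx_drain(g);

		atomic_store(&g->running, false);

		/*
		 * A caller may have set kick after our last exchange
		 * and then seen running == true. Re-check and retry so
		 * that request is not stranded.
		 */
		if (!atomic_load(&g->kick))
			return;
	}
}
```

The key difference from the pseudocode above is the final re-check after
run is cleared: a late caller either wins the flag itself or is picked up
by the retry. This still means TX work can run in whichever thread
happens to win, so it does not by itself address the preemption and SMP
ordering concerns of direct dispatch.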

Thoughts are welcome. :-)

Thanks,


Adrian


