Skip site navigation (1)Skip section navigation (2)
Date:      Fri, 1 Jun 2012 00:46:02 -0700
From:      Adrian Chadd <adrian@freebsd.org>
To:        freebsd-net@freebsd.org
Cc:        freebsd-wireless@freebsd.org
Subject:   if_start / if_transmit handling and packet ordering - how can I guarantee it?
Message-ID:  <CAJ-Vmo=58aZcWJub8uej4xjddHsBLCOeRCZ1_MXAgiA4nQsG_g@mail.gmail.com>

next in thread | raw e-mail | index | archive | help
Hi all,

I've been pushing my ath 11n driver hard (250Mbit UDP) and I've found
a rather interesting behaviour, at least on my SMP machine.

I'm running single direction UDP iperf tests. This is all on stable/9,
with -head ath/net80211 stacks.

The receiver logs about 40/20000 frames a second as being "out of
order". The net80211 stack and ath software TX aggregation path isn't
reordering frames.

So I reviewed the code again and I wondered - is what I'm doing with
ath_start() and the TX side locking allowing for multiple ath_start()
invocations to occur, where they interleave processing frames? The
short answer is yes.

Then, something else hit me - I wondered whether ath_start() itself is
being preempted by the ath taskqueue (say, doing RX, or TX completion)
- where ath_start() is then called. Ie;

* iperf -> sendto() -> socket layer -> net80211 ieee80211_start() ->
if_transmit -> if_start() -> ath_start()

And then an interrupt coming in, during ath_start() having dequeued a
frame, but before it had placed it on any hardware/software queue:

* ath0 taskqueue -> ath_tx_proc() (tx completion) -> ath_start()

Or, similarly, CPU #0 running iperf, CPU #1 running the ath taskqueue,
and one or the other being preempted by something higher priority (eg
an interrupt coming in) between having removed a frame from the ifnet
queue and it being thrown into the software/hardware queue.

The ath driver locking only holds the TX locks as long as is needed to
do individual TX queue operations (ie, queue a frame to the software /
hardware queue) rather than holding it for the entirety of the TX
path. Because of this, I wonder if preemption is possibly causing
issues.

So I then went grovelling through ixgbe to see what it does with
if_transmit. It holds the TX lock for that particular NIC hardware
queue for the entirety of:

* a TX to hardware operation (which can be an individual frame queue,
or servicing the whole bufring contents);
* completing the frames via _txeof(), until nothing more needs to be done.

So by holding the TX lock for the entirety of the queue operation, it
means that any other entries into that particular TX path for that NIC
queue will stall, waiting for the lock, and thus effectively be
serialised, avoiding my initial issue. Any TX completion which leads
to further transmissions from the bufring (and simultaneous incoming
TX frames) will block each other.

Ok, so now that I've mostly tried to lucidly dump what's going on-
what do people think about holding the locks for (potentially) so
long? I know iwn(4) holds the driver lock for as long as it can for
_everything_, so it avoids this issue. But again, I don't really like
the idea of holding a lock for this long. Does anyone else have any
other ideas?

FWIW - I temporarily converted the ath driver to make ath_start()
enqueue a taskqueue task, which then did all of the TX inside the
taskqueue. This serialised with TX completion and RX, which can also
start TX; and all of my out of order issues went away. I unfortunately
then become very, very susceptible to scheduling latency (hence my
initial post about KTR, SMP, preemption and what looks to be like
Cx/idle/powerd issues. If I run my laptop (Lenovo T60) with all the
power/Cx stuff disabled, I can quite happily sustain 240-250MBit of
UDP without any reordering or packet latency.

Thanks in advance (and phew!)


Adrian



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CAJ-Vmo=58aZcWJub8uej4xjddHsBLCOeRCZ1_MXAgiA4nQsG_g>