Date:      Fri, 15 Jun 2012 02:06:58 -0700
From:      Adrian Chadd <adrian@freebsd.org>
To:        Johann Hugo <jhugo@meraka.csir.co.za>
Cc:        freebsd-wireless@freebsd.org
Subject:   Re: [heads up] please test -HEAD ath(4) 802.11n!
Message-ID:  <CAJ-Vmomw6sO9eLQ3VqKBKsk77COwti_0KxZYaeT1%2B858n7oOLw@mail.gmail.com>
In-Reply-To: <201206120837.20027.jhugo@meraka.csir.co.za>
References:  <CAJ-Vmo=d3CyNOoG2JqpOwdkrFCdLSD6-aWaKuM41hA4WUviyAw@mail.gmail.com> <201206120837.20027.jhugo@meraka.csir.co.za>

On 11 June 2012 23:37, Johann Hugo <jhugo@meraka.csir.co.za> wrote:
> Are you interested in -HEAD ath(4) 802.11n on -9 ?

Sure, if you turn on debugging!

> The 11n adapter was in AP mode. It worked for a while giving a lot of
> ath_tx_normal_setup messages and then it stopped.
>
> After an ifconfig down/up it worked again, with no more
> ath_tx_normal_setup messages, but I only get half the speed.

Did it log anything interesting in dmesg when you brought it down/up?
If the traffic stalled, it should have logged some basic information
about the state of the software queue.


> ifconfig wlan0 down
> ath0: ath_tx_tid_drain: node 0xc27e7000: bf=0xc21cff48: addbaw=0, dobaw=1,
> seqno=496, retry=0
> ath0: ath_tx_tid_drain: node 0xc27e7000: bf=0xc21cff48: tid txq_depth=54
> hwq_depth=0, bar_wait=1

Ah, here it is.

> ath0: ath_tx_tid_drain: node 0xc27e7000: tid 0: txq_depth=1, txq_aggr_depth=0,
> sched=0, paused=1, hwq_depth=0, incomp=0, baw_head=17, baw_tail=17
> txa_start=496, ni_txseqs=550
> FRDS 00:1b:21:13:31:b6->14:7d:c5:65:4b:88(00:80:48:66:54:b4) data QoS [TID 0]
> 0M
> 8802 0000 147d c565 4b88 0080 4866 54b4 001b 2113 31b6 001f 0000 0000 aaaa
> 0300 0000 0800
> ath0: ath_tx_default_comp: bf 0xc21cff48: seqno 496: dobaw should've been
> cleared!

So, hm. There was one frame in the TID queue, the queue was paused,
the hardware queue had one frame in it (but not from that TID, maybe
it was the BAR frame going out) and bar_wait is 1.

So I bet something weird has happened with BAR TX.
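
To make that reading of the log a bit more mechanical, here is a rough
Python sketch (nothing to do with the driver source) that pulls the
decimal key=value fields out of a debug line and flags the pattern
above: TID paused, bar_wait set, and none of its frames outstanding in
hardware. The function names and the single-line sample string are
assumptions made up for illustration; only the field names come from
the log, and treating hwq_depth as the per-TID count is also an
assumption.

```python
import re

# Matches decimal fields such as "paused=1" or "baw_head=17".
# The trailing \b skips hex values like "bf=0xc21cff48".
FIELD_RE = re.compile(r"(\w+)=(\d+)\b")

def parse_fields(line):
    """Return the decimal key=value fields of a log line as a dict of ints."""
    return {key: int(value) for key, value in FIELD_RE.findall(line)}

def looks_like_stuck_bar(fields):
    """Heuristic only: TID paused, waiting on a BAR, nothing of its own in hardware."""
    return (fields.get("paused") == 1
            and fields.get("bar_wait") == 1
            and fields.get("hwq_depth") == 0)

# Hypothetical single line assembled from the fields quoted in the log.
sample = ("tid 0: txq_depth=1, txq_aggr_depth=0, sched=0, paused=1, "
          "hwq_depth=0, incomp=0, baw_head=17, baw_tail=17, bar_wait=1")
fields = parse_fields(sample)
print(looks_like_stuck_bar(fields))
```

A True result here only means the line matches the suspicious pattern;
it doesn't prove the BAR itself was lost.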

What happens here is:

* TX aggregation fails and the sender needs to inform the receiver
that the block-ack window (BAW) needs to be moved along artificially,
as there's a "hole" where the TX has failed;
* .. this happens after too many retries;
* so it pauses the node TID queue and waits for the frames already in
the hardware queue to finish transmitting;
* once that occurs, it knows where the "hole" will be and sends a BAR
frame to the receiver to say what sequence number the subsequent
aggregate frames will begin from;
* the receiver ACKs the BAR frame normally;
* the sender then restarts the traffic by unpausing the TID and
marking the BAR flag as done.
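
The steps above can be sketched as a tiny state machine. This is a toy
Python model, not the ath(4) code: the class, its method names and the
event strings are all invented for illustration, and taking "failed
seqno + 1" as the new window start is a simplification (802.11
sequence numbers are 12 bits, hence the modulo 4096).

```python
from dataclasses import dataclass, field

@dataclass
class TidState:
    """Toy model of one TID's TX state; not the driver's real structure."""
    hwq_depth: int = 0        # frames of this TID still in the hardware queue
    paused: bool = False
    bar_wait: bool = False    # a BAR is needed or in flight
    bar_sent: bool = False
    bar_ssn: int = 0          # sequence number the next aggregates start from
    events: list = field(default_factory=list)

    def retry_limit_exceeded(self, failed_seqno):
        # A frame failed for good: there is now a "hole", so pause the TID.
        self.paused = True
        self.bar_wait = True
        self.bar_ssn = (failed_seqno + 1) % 4096
        self.events.append("pause")
        self._maybe_send_bar()

    def hw_frame_completed(self):
        # One of this TID's frames finished transmitting in hardware.
        self.hwq_depth -= 1
        self._maybe_send_bar()

    def _maybe_send_bar(self):
        # Send the BAR only once the hardware queue has drained.
        if self.bar_wait and not self.bar_sent and self.hwq_depth == 0:
            self.bar_sent = True
            self.events.append(f"send BAR ssn={self.bar_ssn}")

    def bar_acked(self):
        # Receiver ACKed the BAR: clear the flag and unpause the TID.
        if self.bar_sent:
            self.bar_wait = False
            self.bar_sent = False
            self.paused = False
            self.events.append("resume")

tid = TidState(hwq_depth=2)
tid.retry_limit_exceeded(failed_seqno=496)
tid.hw_frame_completed()
tid.hw_frame_completed()
tid.bar_acked()
print(tid.events)
```

If bar_acked() is never called in this model, the TID just stays
paused with bar_wait set.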

It seems your BAR TX got "stuck".

I've just modified the TX path a little to have (a) a small pool of TX
buffers just for management traffic, and (b) to limit how many buffers
are allocated when sending traffic so there's a small headroom
available for cloning buffers when retransmitting them. I have to work
on (c) soon - to limit how many TX buffers a given node / TID can
consume so you don't have one node monopolising things.
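
Roughly, (a) and (b) amount to the allocation policy below. This is an
illustrative Python sketch only; the class name, the "kind" strings and
the pool sizes are made up and are not the values or names used in the
driver. (c), the per-node / per-TID limit, isn't modelled.

```python
class TxBufPool:
    """Toy TX buffer allocator: a data pool with retransmit headroom,
    plus a separate small pool for management frames."""

    def __init__(self, data_bufs=8, headroom=2, mgmt_bufs=2):
        self.data_free = data_bufs   # buffers for data frames and clones
        self.headroom = headroom     # kept back for cloning on retransmit
        self.mgmt_free = mgmt_bufs   # reserved for management traffic

    def alloc(self, kind):
        """kind is 'mgmt', 'data' (new traffic) or 'clone' (retransmit copy).
        Returns True if a buffer was handed out."""
        if kind == "mgmt":
            if self.mgmt_free == 0:
                return False
            self.mgmt_free -= 1
            return True
        # New data traffic must leave `headroom` buffers free for clones.
        floor = self.headroom if kind == "data" else 0
        if self.data_free <= floor:
            return False
        self.data_free -= 1
        return True

    def free(self, kind):
        """Return a buffer to the pool it came from."""
        if kind == "mgmt":
            self.mgmt_free += 1
        else:
            self.data_free += 1

pool = TxBufPool(data_bufs=4, headroom=1, mgmt_bufs=1)
accepted = [pool.alloc("data") for _ in range(5)]
print(accepted, pool.alloc("clone"), pool.alloc("mgmt"))
```

The point of the split is that a flood of data frames can use up the
shared data buffers but still leaves something for a retransmit clone
and for management frames.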

So please try the latest -HEAD and see if the problem persists. It
_should_ fix itself after 30 seconds (when the BAR TX times out and
the session is dropped back to non-aggregate.)

Thanks!


Adrian


