From: Adrian Chadd
Date: Wed, 26 Sep 2012 03:45:43 +0000 (UTC)
To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject: svn commit: r240946 - head/sys/dev/ath
Message-Id: <201209260345.q8Q3jhVP046878@svn.freebsd.org>

Author: adrian
Date: Wed Sep 26 03:45:42 2012
New Revision: 240946
URL: http://svn.freebsd.org/changeset/base/240946

Log:
  Map the non-QoS TID to the voice queue, in order to ensure important
  things like EAPOL frames make it out.

  After a whole bunch of hacking/testing, I discovered that they weren't
  being early-dropped by the stack (but I should look at ensuring that
  later..) but were even making it to the hardware transmit queue.
  They were mostly even being received by the remote end.

  However, the remote end was completely ignoring them.

  This didn't happen under 150-170 Mbit/s TCP tests as I'm guessing the TX
  queue stayed very busy and the STA didn't do any scanning. However, when
  doing 100 Mbit/s of TCP traffic, the STA would do background scanning -
  which involves it coming in and out of powersave mode with the AP.

  Now, this is a total and utter hack around the real problems, which are:

  * I need to implement proper power save handling and integrate it into
    the filtered frames support, so the driver/stack doesn't send frames
    whilst the station is actually in sleep;
  * .. but frames were actually making it to the STA (Macbook Pro) and the
    AP did receive an ACK; but a tcpdump on the receiving side showed the
    EAPOL frame never made it. So the stack was dropping it for some
    reason;
  * Importantly - the EAPOL frames are currently going into the non-QoS
    TID, which maps to the BE queue and is susceptible to that queue being
    busy doing other things, but;
  * There's other traffic going on in the non-QoS TID from other contexts
    when scanning is going on and it's possible there are some races
    causing sequence number/IV issues, but;
  * Most importantly, I think the interaction with TID 16 multicast
    traffic in power save mode is causing issues - since I -believe- the
    sequence number space being used by the EAPOL frames on TID 16
    overlaps with the multicast frames that have sequence numbers
    allocated and are then stuffed on the CABQ.

    With EAPOL frames in TID 16 and queued to the BE queue, they wait to
    be serviced behind all of the aggregate traffic going on - and if the
    CABQ gets emptied first, those TID 16 multicast frames with sequence
    numbers will go out beforehand.

  Now, there's quite likely a bunch of "stuff happening slightly out of
  sequence" going on due to the nature of the TX path (read: lots of
  overlapping and concurrent ath_start() and ath_raw_xmit() calls going
  on, sigh) but I thought I had caught them all and stuffed each TID TX
  behind a lock (that lasted as long as it needed to in order to get the
  frame onto the relevant destination queue - thus keeping things in
  order.)

  Unfortunately the last problem is the big one and I'm going to stare at
  it some more. If it _is_

  So this is a workaround for now to ensure that EAPOL frames actually
  make it out before any other stuff in the non-QoS TID and HOPEFULLY
  before the CABQ gets active.

  I'm now going to spend a little time in the TX path figuring out exactly
  why the sender is rejecting things. There are two (well, three if you
  count EAPOL contents invalid) possibilities:

  * The sequence number is out of order (i.e., something else, like the
    multicast traffic on CABQ, is going out first on TID 16);
  * The CCMP IV is out of order (similar to above - but less likely, as
    the TX key for multicast traffic is different to unicast traffic);
  * EAPOL contents strangely invalid.

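The first possibility in the list above can be pictured with a small sketch. It assumes that sequence numbers are handed out from one shared counter at enqueue time and that a receiver expects each frame's 12-bit sequence number to advance past the previous one. The function names (`seq_is_newer`, `count_regressions`) are hypothetical and are not part of the ath driver or net80211; the sketch only counts reordered frames and makes no claim about what a given receiver does with them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SEQ_MODULUS 4096u	/* 802.11 sequence numbers are 12 bits wide */

/*
 * Nonzero if `next` is ahead of `prev` by less than half of the sequence
 * space, i.e. it looks like a newer frame once wraparound is accounted for.
 */
static int
seq_is_newer(uint16_t prev, uint16_t next)
{
	unsigned delta = ((unsigned)next - (unsigned)prev) & (SEQ_MODULUS - 1u);

	return (delta != 0 && delta < SEQ_MODULUS / 2);
}

/*
 * Count the frames in an on-air ordering whose sequence number does not
 * advance past the sequence number of the frame sent just before them.
 */
static size_t
count_regressions(const uint16_t *on_air, size_t n)
{
	size_t bad = 0;

	assert(on_air != NULL || n == 0);
	for (size_t i = 1; i < n; i++) {
		if (!seq_is_newer(on_air[i - 1], on_air[i]))
			bad++;
	}
	return (bad);
}
```

For illustration, suppose an EAPOL frame is assigned sequence number 100 and queued to BE, and two multicast frames are then assigned 101 and 102 and queued to the CABQ. If the CABQ drains first, the on-air order is 101, 102, 100 and `count_regressions` reports one regression; in allocation order it reports none.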
  AP: Ubiquiti RSPRO, AR9160/AR9220 NICs
  STA: Macbook Pro, Broadcom 11n NIC

Modified:
  head/sys/dev/ath/if_ath_tx.c

Modified: head/sys/dev/ath/if_ath_tx.c
==============================================================================
--- head/sys/dev/ath/if_ath_tx.c	Wed Sep 26 01:09:19 2012	(r240945)
+++ head/sys/dev/ath/if_ath_tx.c	Wed Sep 26 03:45:42 2012	(r240946)
@@ -106,6 +106,11 @@ __FBSDID("$FreeBSD$");
  */
 #define	SWMAX_RETRIES		10
 
+/*
+ * What queue to throw the non-QoS TID traffic into
+ */
+#define	ATH_NONQOS_TID_AC	WME_AC_VO
+
 static int ath_tx_ampdu_pending(struct ath_softc *sc, struct ath_node *an,
     int tid);
 static int ath_tx_ampdu_running(struct ath_softc *sc, struct ath_node *an,
@@ -191,7 +196,7 @@ ath_tx_getac(struct ath_softc *sc, const
 	if (IEEE80211_QOS_HAS_SEQ(wh))
 		return pri;
 
-	return WME_AC_BE;
+	return ATH_NONQOS_TID_AC;
 }
 
 void
@@ -2861,7 +2866,7 @@ ath_tx_tid_init(struct ath_softc *sc, st
 		atid->cleanup_inprogress = 0;
 		atid->clrdmask = 1;	/* Always start by setting this bit */
 		if (i == IEEE80211_NONQOS_TID)
-			atid->ac = WME_AC_BE;
+			atid->ac = ATH_NONQOS_TID_AC;
 		else
 			atid->ac = TID_TO_WME_AC(i);
 	}
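
The net effect of the diff can be sketched outside the driver as a standalone TID-to-access-category lookup. This is a minimal illustration, not the driver code: `enum ac`, `NONQOS_TID`, `NONQOS_TID_AC` and `tid_to_ac()` are hypothetical stand-ins for `WME_AC_*`, `IEEE80211_NONQOS_TID`, `ATH_NONQOS_TID_AC` and `TID_TO_WME_AC()`, and the QoS TID mapping shown is an assumption based on the usual 802.11 user-priority grouping.

```c
#include <assert.h>

/* Hypothetical stand-ins for the WME_AC_* access categories. */
enum ac { AC_BE = 0, AC_BK = 1, AC_VI = 2, AC_VO = 3 };

#define NUM_TIDS	17	/* QoS TIDs 0..15 plus one non-QoS TID */
#define NONQOS_TID	16	/* stand-in for IEEE80211_NONQOS_TID */

/* After this commit the non-QoS TID goes to voice rather than best effort. */
#define NONQOS_TID_AC	AC_VO

/*
 * Map a TID to the access category (hardware queue class) it is sent on.
 * The QoS branch is an assumed mapping: 0 and 3 are best effort, 1 and 2
 * background, 4 and 5 video, everything above that voice.
 */
static enum ac
tid_to_ac(int tid)
{
	assert(tid >= 0 && tid < NUM_TIDS);

	if (tid == NONQOS_TID)
		return (NONQOS_TID_AC);
	if (tid == 0 || tid == 3)
		return (AC_BE);
	if (tid < 3)
		return (AC_BK);
	if (tid < 6)
		return (AC_VI);
	return (AC_VO);
}
```

Keeping the choice behind a single define, as the commit does, means both the per-frame lookup and the per-TID initialisation change together if the non-QoS TID is later moved to a different queue.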