From owner-svn-src-all@FreeBSD.ORG Tue Nov 8 22:43:14 2011 Return-Path: Delivered-To: svn-src-all@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 01562106566B; Tue, 8 Nov 2011 22:43:14 +0000 (UTC) (envelope-from adrian@FreeBSD.org) Received: from svn.freebsd.org (svn.freebsd.org [IPv6:2001:4f8:fff6::2c]) by mx1.freebsd.org (Postfix) with ESMTP id E33378FC1E; Tue, 8 Nov 2011 22:43:13 +0000 (UTC) Received: from svn.freebsd.org (localhost [127.0.0.1]) by svn.freebsd.org (8.14.4/8.14.4) with ESMTP id pA8MhD1o052564; Tue, 8 Nov 2011 22:43:13 GMT (envelope-from adrian@svn.freebsd.org) Received: (from adrian@localhost) by svn.freebsd.org (8.14.4/8.14.4/Submit) id pA8MhDVr052556; Tue, 8 Nov 2011 22:43:13 GMT (envelope-from adrian@svn.freebsd.org) Message-Id: <201111082243.pA8MhDVr052556@svn.freebsd.org> From: Adrian Chadd Date: Tue, 8 Nov 2011 22:43:13 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org X-SVN-Group: head MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Cc: Subject: svn commit: r227364 - in head/sys/dev/ath: . ath_rate/amrr ath_rate/onoe ath_rate/sample X-BeenThere: svn-src-all@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "SVN commit messages for the entire src tree \(except for " user" and " projects" \)" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 08 Nov 2011 22:43:14 -0000 Author: adrian Date: Tue Nov 8 22:43:13 2011 New Revision: 227364 URL: http://svn.freebsd.org/changeset/base/227364 Log: Introduce TX aggregation and software TX queue management for Atheros AR5416 and later wireless devices. This is a very large commit - the complete history can be found in the user/adrian/if_ath_tx branch. Legacy (ie, pre-AR5416) devices also use the per-software TXQ support and (in theory) can support non-aggregation ADDBA sessions. However, the net80211 stack doesn't currently support this. In summary: TX path: * queued frames normally go onto a per-TID, per-node queue * some special frames (eg ADDBA control frames) are thrown directly onto the relevant hardware queue so they can go out before any software queued frames are queued. * Add methods to create, suspend, resume and tear down an aggregation session. * Add in software retransmission of both normal and aggregate frames. * Add in completion handling of aggregate frames, including parsing the block ack bitmap provided by the hardware. * Write an aggregation function which can assemble frames into an aggregate based on the selected rate control and channel configuration. * The per-TID queues are locked based on their target hardware TX queue. This matches what ath9k/atheros does, and thus simplified porting over some of the aggregation logic. * When doing TX aggregation, stick the sequence number allocation in the TX path rather than net80211 TX path, and protect it by the TXQ lock. Rate control: * Delay rate control selection until the frame is about to be queued to the hardware, so retried frames can have their rate control choices changed. Frames with a static rate control selection have that applied before each TX, just to simplify the TX path (ie, not have "static" and "dynamic" rate control special cased.) * Teach ath_rate_sample about aggregates - both completion and errors. * Add an EWMA for tracking what the current "good" MCS rate is based on failure rates. Misc: * Introduce a bunch of dirty hacks and workarounds so TID mapping and net80211 frame inspection can be kept out of the net80211 layer. Because of the way this code works (and it's from Atheros and Linux ath9k), there is a consistent, 1:1 mapping between TID and AC. So we need to ensure that frames going to a specific TID will _always_ end up on the right AC, and vice versa, or the completion/locking will simply get very confused. I plan on addressing this mess in the future. Known issues: * There is no BAR frame transmission just yet. A whole lot of tidying up needs to occur before BAR frame TX can occur in the "correct" place - ie, once the TID TX queue has been drained. * Interface reset/purge/etc results in frames in the TX and RX queues being removed. This creates holes in the sequence numbers being assigned and the TX/RX AMPDU code (on either side) just hangs. * There's no filtered frame support at the present moment, so stations going into power saving mode will simply have a number of frames dropped - likely resulting in a traffic "hang". * Raw frame TX is going to just not function with 11n aggregation. Likely this needs to be modified to always override the sequence number if the frame is going into an aggregation session. However, general raw frame injection currently doesn't work in general in net80211, so let's just ignore this for now until this is sorted out. * HT protection is just not implemented and won't be until the above is sorted out. In addition, the AR5416 has issues RTS protecting large aggregates (anything >8k), so the work around needs to be ported and tested. Thus, this will be put on hold until the above work is complete. * The rate control module 'sample' is the only currently supported module; onoe/amrr haven't been tested and have likely bit rotted a little. I'll follow up with some commits to make them work again for non-11n rates, but they won't be updated to handle 11n and aggregation. If someone wishes to do so then they're welcome to send along patches. * .. and "sample" doesn't really do a good job of 11n TX. Specifically, the metrics used (packet TX time and failure/success rates) isn't as useful for 11n. It's likely that it should be extended to take into account the aggregate throughput possible and then choose a rate which maximises that. Ie, it may be acceptable for a higher MCS rate with a higher failure to be used if it gives a more acceptable throughput/latency then a lower MCS rate @ a lower error rate. Again, patches will be gratefully accepted. Because of this, ATH_ENABLE_11N is still not enabled by default. Sponsored by: Hobnob, Inc. Obtained from: Linux, Atheros Modified: head/sys/dev/ath/ath_rate/amrr/amrr.c head/sys/dev/ath/ath_rate/onoe/onoe.c head/sys/dev/ath/ath_rate/sample/sample.c head/sys/dev/ath/ath_rate/sample/sample.h head/sys/dev/ath/if_ath.c head/sys/dev/ath/if_ath_misc.h head/sys/dev/ath/if_ath_tx.c head/sys/dev/ath/if_ath_tx.h head/sys/dev/ath/if_ath_tx_ht.c head/sys/dev/ath/if_ath_tx_ht.h head/sys/dev/ath/if_athrate.h Modified: head/sys/dev/ath/ath_rate/amrr/amrr.c ============================================================================== --- head/sys/dev/ath/ath_rate/amrr/amrr.c Tue Nov 8 22:01:03 2011 (r227363) +++ head/sys/dev/ath/ath_rate/amrr/amrr.c Tue Nov 8 22:43:13 2011 (r227364) @@ -122,19 +122,21 @@ ath_rate_findrate(struct ath_softc *sc, */ void ath_rate_getxtxrates(struct ath_softc *sc, struct ath_node *an, - uint8_t rix0, uint8_t *rix, uint8_t *try) + uint8_t rix0, struct ath_rc_series *rc) { struct amrr_node *amn = ATH_NODE_AMRR(an); -/* rix[0] = amn->amn_tx_rate0; */ - rix[1] = amn->amn_tx_rate1; - rix[2] = amn->amn_tx_rate2; - rix[3] = amn->amn_tx_rate3; - - try[0] = amn->amn_tx_try0; - try[1] = amn->amn_tx_try1; - try[2] = amn->amn_tx_try2; - try[3] = amn->amn_tx_try3; + rc[0].flags = rc[1].flags = rc[2].flags = rc[3].flags = 0; + + rc[0].rix = amn->amn_tx_rate0; + rc[1].rix = amn->amn_tx_rate1; + rc[2].rix = amn->amn_tx_rate2; + rc[3].rix = amn->amn_tx_rate3; + + rc[0].tries = amn->amn_tx_try0; + rc[1].tries = amn->amn_tx_try1; + rc[2].tries = amn->amn_tx_try2; + rc[3].tries = amn->amn_tx_try3; } @@ -153,10 +155,10 @@ ath_rate_setupxtxdesc(struct ath_softc * void ath_rate_tx_complete(struct ath_softc *sc, struct ath_node *an, - const struct ath_buf *bf) + const struct ath_rc_series *rc, const struct ath_tx_status *ts, + int frame_size, int nframes, int nbad) { struct amrr_node *amn = ATH_NODE_AMRR(an); - const struct ath_tx_status *ts = &bf->bf_status.ds_txstat; int sr = ts->ts_shortretry; int lr = ts->ts_longretry; int retry_count = sr + lr; Modified: head/sys/dev/ath/ath_rate/onoe/onoe.c ============================================================================== --- head/sys/dev/ath/ath_rate/onoe/onoe.c Tue Nov 8 22:01:03 2011 (r227363) +++ head/sys/dev/ath/ath_rate/onoe/onoe.c Tue Nov 8 22:43:13 2011 (r227364) @@ -130,19 +130,21 @@ ath_rate_findrate(struct ath_softc *sc, */ void ath_rate_getxtxrates(struct ath_softc *sc, struct ath_node *an, - uint8_t rix0, uint8_t *rix, uint8_t *try) + uint8_t rix0, struct ath_rc_series *rc) { struct onoe_node *on = ATH_NODE_ONOE(an); -/* rix[0] = on->on_tx_rate0; */ - rix[1] = on->on_tx_rate1; - rix[2] = on->on_tx_rate2; - rix[3] = on->on_tx_rate3; - - try[0] = on->on_tx_try0; - try[1] = 2; - try[2] = 2; - try[3] = 2; + rc[0].flags = rc[1].flags = rc[2].flags = rc[3].flags = 0; + + rc[0].rix = on->on_tx_rate0; + rc[1].rix = on->on_tx_rate1; + rc[2].rix = on->on_tx_rate2; + rc[3].rix = on->on_tx_rate3; + + rc[0].tries = on->on_tx_try0; + rc[1].tries = 2; + rc[2].tries = 2; + rc[3].tries = 2; } void @@ -160,10 +162,10 @@ ath_rate_setupxtxdesc(struct ath_softc * void ath_rate_tx_complete(struct ath_softc *sc, struct ath_node *an, - const struct ath_buf *bf) + const struct ath_rc_series *rc, const struct ath_tx_status *ts, + int frame_size, int nframes, int nbad) { struct onoe_node *on = ATH_NODE_ONOE(an); - const struct ath_tx_status *ts = &bf->bf_status.ds_txstat; if (ts->ts_status == 0) on->on_tx_ok++; Modified: head/sys/dev/ath/ath_rate/sample/sample.c ============================================================================== --- head/sys/dev/ath/ath_rate/sample/sample.c Tue Nov 8 22:01:03 2011 (r227363) +++ head/sys/dev/ath/ath_rate/sample/sample.c Tue Nov 8 22:43:13 2011 (r227364) @@ -170,12 +170,13 @@ pick_best_rate(struct ath_node *an, cons int size_bin, int require_acked_before) { struct sample_node *sn = ATH_NODE_SAMPLE(an); - int best_rate_rix, best_rate_tt; + int best_rate_rix, best_rate_tt, best_rate_pct; uint32_t mask; - int rix, tt; + int rix, tt, pct; best_rate_rix = 0; best_rate_tt = 0; + best_rate_pct = 0; for (mask = sn->ratemask, rix = 0; mask != 0; mask >>= 1, rix++) { if ((mask & 1) == 0) /* not a supported rate */ continue; @@ -192,13 +193,54 @@ pick_best_rate(struct ath_node *an, cons !sn->stats[size_bin][rix].packets_acked)) continue; + /* Calculate percentage if possible */ + if (sn->stats[size_bin][rix].total_packets > 0) { + pct = sn->stats[size_bin][rix].ewma_pct; + } else { + /* XXX for now, assume 95% ok */ + pct = 95; + } + /* don't use a bit-rate that has been failing */ if (sn->stats[size_bin][rix].successive_failures > 3) continue; - if (best_rate_tt == 0 || tt < best_rate_tt) { - best_rate_tt = tt; - best_rate_rix = rix; + /* + * For HT, Don't use a bit rate that is much more + * lossy than the best. + * + * XXX this isn't optimal; it's just designed to + * eliminate rates that are going to be obviously + * worse. + */ + if (an->an_node.ni_flags & IEEE80211_NODE_HT) { + if (best_rate_pct > (pct + 50)) + continue; + } + + /* + * For non-MCS rates, use the current average txtime for + * comparison. + */ + if (! (an->an_node.ni_flags & IEEE80211_NODE_HT)) { + if (best_rate_tt == 0 || tt <= best_rate_tt) { + best_rate_tt = tt; + best_rate_rix = rix; + best_rate_pct = pct; + } + } + + /* + * Since 2 stream rates have slightly higher TX times, + * allow a little bit of leeway. This should later + * be abstracted out and properly handled. + */ + if (an->an_node.ni_flags & IEEE80211_NODE_HT) { + if (best_rate_tt == 0 || (tt * 8 <= best_rate_tt * 10)) { + best_rate_tt = tt; + best_rate_rix = rix; + best_rate_pct = pct; + } } } return (best_rate_tt ? best_rate_rix : -1); @@ -257,6 +299,28 @@ pick_sample_rate(struct sample_softc *ss goto nextrate; } + /* + * When doing aggregation, successive failures don't happen + * as often, as sometimes some of the sub-frames get through. + * + * If the sample rix average tx time is greater than the + * average tx time of the current rix, don't immediately use + * the rate for sampling. + */ + if (an->an_node.ni_flags & IEEE80211_NODE_HT) { + if ((sn->stats[size_bin][rix].average_tx_time * 10 > + sn->stats[size_bin][current_rix].average_tx_time * 9) && + (ticks - sn->stats[size_bin][rix].last_tx < ssc->stale_failure_timeout)) { + mask &= ~(1< 11M for non-HT rates */ if (! (an->an_node.ni_flags & IEEE80211_NODE_HT)) { if (DOT11RATE(rix) > 2*11 && rix > current_rix + 2) { @@ -320,6 +384,96 @@ ath_rate_update_static_rix(struct ath_so } } +/* + * Pick a non-HT rate to begin using. + */ +static int +ath_rate_pick_seed_rate_legacy(struct ath_softc *sc, struct ath_node *an, + int frameLen) +{ +#define DOT11RATE(ix) (rt->info[ix].dot11Rate & IEEE80211_RATE_VAL) +#define MCS(ix) (rt->info[ix].dot11Rate | IEEE80211_RATE_MCS) +#define RATE(ix) (DOT11RATE(ix) / 2) + int rix = -1; + const HAL_RATE_TABLE *rt = sc->sc_currates; + struct sample_node *sn = ATH_NODE_SAMPLE(an); + const int size_bin = size_to_bin(frameLen); + + /* no packet has been sent successfully yet */ + for (rix = rt->rateCount-1; rix > 0; rix--) { + if ((sn->ratemask & (1<info[rix].phy == IEEE80211_T_HT) + continue; + + /* + * Pick the highest rate <= 36 Mbps + * that hasn't failed. + */ + if (DOT11RATE(rix) <= 72 && + sn->stats[size_bin][rix].successive_failures == 0) { + break; + } + } + return rix; +#undef RATE +#undef MCS +#undef DOT11RATE +} + +/* + * Pick a HT rate to begin using. + * + * Don't use any non-HT rates; only consider HT rates. + */ +static int +ath_rate_pick_seed_rate_ht(struct ath_softc *sc, struct ath_node *an, + int frameLen) +{ +#define DOT11RATE(ix) (rt->info[ix].dot11Rate & IEEE80211_RATE_VAL) +#define MCS(ix) (rt->info[ix].dot11Rate | IEEE80211_RATE_MCS) +#define RATE(ix) (DOT11RATE(ix) / 2) + int rix = -1, ht_rix = -1; + const HAL_RATE_TABLE *rt = sc->sc_currates; + struct sample_node *sn = ATH_NODE_SAMPLE(an); + const int size_bin = size_to_bin(frameLen); + + /* no packet has been sent successfully yet */ + for (rix = rt->rateCount-1; rix > 0; rix--) { + /* Skip rates we can't use */ + if ((sn->ratemask & (1<info[rix].phy == IEEE80211_T_HT) + ht_rix = rix; + + /* Skip non-HT rates */ + if (rt->info[rix].phy != IEEE80211_T_HT) + continue; + + /* + * Pick a medium-speed rate regardless of stream count + * which has not seen any failures. Higher rates may fail; + * we'll try them later. + */ + if (((MCS(rix) & 0x7) <= 4) && + sn->stats[size_bin][rix].successive_failures == 0) { + break; + } + } + + /* + * If all the MCS rates have successive failures, rix should be + * > 0; otherwise use the lowest MCS rix (hopefully MCS 0.) + */ + return MAX(rix, ht_rix); +#undef RATE +#undef MCS +#undef DOT11RATE +} void @@ -363,9 +517,14 @@ ath_rate_findrate(struct ath_softc *sc, if (sn->sample_tt[size_bin] < average_tx_time * (sn->packets_since_sample[size_bin]*ssc->sample_rate/100)) { rix = pick_sample_rate(ssc, an, rt, size_bin); IEEE80211_NOTE(an->an_node.ni_vap, IEEE80211_MSG_RATECTL, - &an->an_node, "size %u sample rate %d current rate %d", - bin_to_size(size_bin), RATE(rix), - RATE(sn->current_rix[size_bin])); + &an->an_node, "att %d sample_tt %d size %u sample rate %d %s current rate %d %s", + average_tx_time, + sn->sample_tt[size_bin], + bin_to_size(size_bin), + dot11rate(rt, rix), + dot11rate_label(rt, rix), + dot11rate(rt, sn->current_rix[size_bin]), + dot11rate_label(rt, sn->current_rix[size_bin])); if (rix != sn->current_rix[size_bin]) { sn->current_sample_rix[size_bin] = rix; } else { @@ -376,29 +535,58 @@ ath_rate_findrate(struct ath_softc *sc, change_rates = 0; if (!sn->packets_sent[size_bin] || best_rix == -1) { /* no packet has been sent successfully yet */ - for (rix = rt->rateCount-1; rix > 0; rix--) { - if ((sn->ratemask & (1<stats[size_bin][rix].successive_failures == 0) { - break; - } - } change_rates = 1; - best_rix = rix; + if (an->an_node.ni_flags & IEEE80211_NODE_HT) + best_rix = + ath_rate_pick_seed_rate_ht(sc, an, frameLen); + else + best_rix = + ath_rate_pick_seed_rate_legacy(sc, an, frameLen); } else if (sn->packets_sent[size_bin] < 20) { /* let the bit-rate switch quickly during the first few packets */ + IEEE80211_NOTE(an->an_node.ni_vap, + IEEE80211_MSG_RATECTL, &an->an_node, + "%s: switching quickly..", __func__); change_rates = 1; } else if (ticks - ssc->min_switch > sn->ticks_since_switch[size_bin]) { /* min_switch seconds have gone by */ + IEEE80211_NOTE(an->an_node.ni_vap, + IEEE80211_MSG_RATECTL, &an->an_node, + "%s: min_switch %d > ticks_since_switch %d..", + __func__, ticks - ssc->min_switch, sn->ticks_since_switch[size_bin]); change_rates = 1; - } else if (2*average_tx_time < sn->stats[size_bin][sn->current_rix[size_bin]].average_tx_time) { + } else if ((! (an->an_node.ni_flags & IEEE80211_NODE_HT)) && + (2*average_tx_time < sn->stats[size_bin][sn->current_rix[size_bin]].average_tx_time)) { /* the current bit-rate is twice as slow as the best one */ + IEEE80211_NOTE(an->an_node.ni_vap, + IEEE80211_MSG_RATECTL, &an->an_node, + "%s: 2x att (= %d) < cur_rix att %d", + __func__, + 2 * average_tx_time, sn->stats[size_bin][sn->current_rix[size_bin]].average_tx_time); change_rates = 1; + } else if ((an->an_node.ni_flags & IEEE80211_NODE_HT)) { + int cur_rix = sn->current_rix[size_bin]; + int cur_att = sn->stats[size_bin][cur_rix].average_tx_time; + /* + * If the node is HT, upgrade it if the MCS rate is + * higher and the average tx time is within 20% of + * the current rate. It can fail a little. + * + * This is likely not optimal! + */ +#if 0 + printf("cur rix/att %x/%d, best rix/att %x/%d\n", + MCS(cur_rix), cur_att, MCS(best_rix), average_tx_time); +#endif + if ((MCS(best_rix) > MCS(cur_rix)) && + (average_tx_time * 8) <= (cur_att * 10)) { + IEEE80211_NOTE(an->an_node.ni_vap, + IEEE80211_MSG_RATECTL, &an->an_node, + "%s: HT: best_rix 0x%d > cur_rix 0x%x, average_tx_time %d, cur_att %d", + __func__, + MCS(best_rix), MCS(cur_rix), average_tx_time, cur_att); + change_rates = 1; + } } sn->packets_since_sample[size_bin]++; @@ -450,22 +638,24 @@ done: */ void ath_rate_getxtxrates(struct ath_softc *sc, struct ath_node *an, - uint8_t rix0, uint8_t *rix, uint8_t *try) + uint8_t rix0, struct ath_rc_series *rc) { struct sample_node *sn = ATH_NODE_SAMPLE(an); const struct txschedule *sched = &sn->sched[rix0]; KASSERT(rix0 == sched->r0, ("rix0 (%x) != sched->r0 (%x)!\n", rix0, sched->r0)); -/* rix[0] = sched->r0; */ - rix[1] = sched->r1; - rix[2] = sched->r2; - rix[3] = sched->r3; - - try[0] = sched->t0; - try[1] = sched->t1; - try[2] = sched->t2; - try[3] = sched->t3; + rc[0].flags = rc[1].flags = rc[2].flags = rc[3].flags = 0; + + rc[0].rix = sched->r0; + rc[1].rix = sched->r1; + rc[2].rix = sched->r2; + rc[3].rix = sched->r3; + + rc[0].tries = sched->t0; + rc[1].tries = sched->t1; + rc[2].tries = sched->t2; + rc[3].tries = sched->t3; } void @@ -493,6 +683,71 @@ ath_rate_setupxtxdesc(struct ath_softc * s3code, sched->t3); /* series 3 */ } +/* + * Update the EWMA percentage. + * + * This is a simple hack to track an EWMA based on the current + * rate scenario. For the rate codes which failed, this will + * record a 0% against it. For the rate code which succeeded, + * EWMA will record the nbad*100/nframes percentage against it. + */ +static void +update_ewma_stats(struct ath_softc *sc, struct ath_node *an, + int frame_size, + int rix0, int tries0, + int rix1, int tries1, + int rix2, int tries2, + int rix3, int tries3, + int short_tries, int tries, int status, + int nframes, int nbad) +{ + struct sample_node *sn = ATH_NODE_SAMPLE(an); + struct sample_softc *ssc = ATH_SOFTC_SAMPLE(sc); + const int size_bin = size_to_bin(frame_size); + int tries_so_far; + int pct; + int rix = rix0; + + /* Calculate percentage based on current rate */ + if (nframes == 0) + nframes = nbad = 1; + pct = ((nframes - nbad) * 1000) / nframes; + + /* Figure out which rate index succeeded */ + tries_so_far = tries0; + + if (tries1 && tries_so_far < tries) { + tries_so_far += tries1; + rix = rix1; + /* XXX bump ewma pct */ + } + + if (tries2 && tries_so_far < tries) { + tries_so_far += tries2; + rix = rix2; + /* XXX bump ewma pct */ + } + + if (tries3 && tries_so_far < tries) { + rix = rix3; + /* XXX bump ewma pct */ + } + + /* rix is the successful rate, update EWMA for final rix */ + if (sn->stats[size_bin][rix].total_packets < + ssc->smoothing_minpackets) { + /* just average the first few packets */ + int a_pct = (sn->stats[size_bin][rix].packets_acked * 1000) / + (sn->stats[size_bin][rix].total_packets); + sn->stats[size_bin][rix].ewma_pct = a_pct; + } else { + /* use a ewma */ + sn->stats[size_bin][rix].ewma_pct = + ((sn->stats[size_bin][rix].ewma_pct * ssc->smoothing_rate) + + (pct * (100 - ssc->smoothing_rate))) / 100; + } +} + static void update_stats(struct ath_softc *sc, struct ath_node *an, int frame_size, @@ -500,10 +755,12 @@ update_stats(struct ath_softc *sc, struc int rix1, int tries1, int rix2, int tries2, int rix3, int tries3, - int short_tries, int tries, int status) + int short_tries, int tries, int status, + int nframes, int nbad) { struct sample_node *sn = ATH_NODE_SAMPLE(an); struct sample_softc *ssc = ATH_SOFTC_SAMPLE(sc); + const HAL_RATE_TABLE *rt = sc->sc_currates; const int size_bin = size_to_bin(frame_size); const int size = bin_to_size(size_bin); int tt, tries_so_far; @@ -542,7 +799,7 @@ update_stats(struct ath_softc *sc, struc /* just average the first few packets */ int avg_tx = sn->stats[size_bin][rix0].average_tx_time; int packets = sn->stats[size_bin][rix0].total_packets; - sn->stats[size_bin][rix0].average_tx_time = (tt+(avg_tx*packets))/(packets+1); + sn->stats[size_bin][rix0].average_tx_time = (tt+(avg_tx*packets))/(packets+nframes); } else { /* use a ewma */ sn->stats[size_bin][rix0].average_tx_time = @@ -550,38 +807,50 @@ update_stats(struct ath_softc *sc, struc (tt * (100 - ssc->smoothing_rate))) / 100; } - if (status != 0) { + /* + * XXX Don't mark the higher bit rates as also having failed; as this + * unfortunately stops those rates from being tasted when trying to + * TX. This happens with 11n aggregation. + */ + if (nframes == nbad) { +#if 0 int y; - sn->stats[size_bin][rix0].successive_failures++; +#endif + sn->stats[size_bin][rix0].successive_failures += nbad; +#if 0 for (y = size_bin+1; y < NUM_PACKET_SIZE_BINS; y++) { /* * Also say larger packets failed since we * assume if a small packet fails at a * bit-rate then a larger one will also. */ - sn->stats[y][rix0].successive_failures++; + sn->stats[y][rix0].successive_failures += nbad; sn->stats[y][rix0].last_tx = ticks; sn->stats[y][rix0].tries += tries; - sn->stats[y][rix0].total_packets++; + sn->stats[y][rix0].total_packets += nframes; } +#endif } else { - sn->stats[size_bin][rix0].packets_acked++; + sn->stats[size_bin][rix0].packets_acked += (nframes - nbad); sn->stats[size_bin][rix0].successive_failures = 0; } sn->stats[size_bin][rix0].tries += tries; sn->stats[size_bin][rix0].last_tx = ticks; - sn->stats[size_bin][rix0].total_packets++; + sn->stats[size_bin][rix0].total_packets += nframes; if (rix0 == sn->current_sample_rix[size_bin]) { IEEE80211_NOTE(an->an_node.ni_vap, IEEE80211_MSG_RATECTL, &an->an_node, -"%s: size %d %s sample rate %d tries (%d/%d) tt %d avg_tt (%d/%d)", +"%s: size %d %s sample rate %d %s tries (%d/%d) tt %d avg_tt (%d/%d) nfrm %d nbad %d", __func__, size, status ? "FAIL" : "OK", - rix0, short_tries, tries, tt, + dot11rate(rt, rix0), + dot11rate_label(rt, rix0), + short_tries, tries, tt, sn->stats[size_bin][rix0].average_tx_time, - sn->stats[size_bin][rix0].perfect_tx_time); + sn->stats[size_bin][rix0].perfect_tx_time, + nframes, nbad); sn->sample_tt[size_bin] = tt; sn->current_sample_rix[size_bin] = -1; } @@ -596,21 +865,21 @@ badrate(struct ifnet *ifp, int series, i void ath_rate_tx_complete(struct ath_softc *sc, struct ath_node *an, - const struct ath_buf *bf) + const struct ath_rc_series *rc, const struct ath_tx_status *ts, + int frame_size, int nframes, int nbad) { struct ifnet *ifp = sc->sc_ifp; struct ieee80211com *ic = ifp->if_l2com; struct sample_node *sn = ATH_NODE_SAMPLE(an); - const struct ath_tx_status *ts = &bf->bf_status.ds_txstat; - const struct ath_desc *ds0 = &bf->bf_desc[0]; - int final_rix, short_tries, long_tries, frame_size; + int final_rix, short_tries, long_tries; const HAL_RATE_TABLE *rt = sc->sc_currates; + int status = ts->ts_status; int mrr; final_rix = rt->rateCodeToIndex[ts->ts_rate]; short_tries = ts->ts_shortretry; long_tries = ts->ts_longretry + 1; - frame_size = ds0->ds_ctl0 & 0x0fff; /* low-order 12 bits of ds_ctl0 */ + if (frame_size == 0) /* NB: should not happen */ frame_size = 1500; @@ -620,63 +889,73 @@ ath_rate_tx_complete(struct ath_softc *s "%s: size %d %s rate/try %d/%d no rates yet", __func__, bin_to_size(size_to_bin(frame_size)), - ts->ts_status ? "FAIL" : "OK", + status ? "FAIL" : "OK", short_tries, long_tries); return; } mrr = sc->sc_mrretry && !(ic->ic_flags & IEEE80211_F_USEPROT); if (!mrr || ts->ts_finaltsi == 0) { if (!IS_RATE_DEFINED(sn, final_rix)) { - badrate(ifp, 0, ts->ts_rate, long_tries, ts->ts_status); + badrate(ifp, 0, ts->ts_rate, long_tries, status); return; } /* * Only one rate was used; optimize work. */ IEEE80211_NOTE(an->an_node.ni_vap, IEEE80211_MSG_RATECTL, - &an->an_node, "%s: size %d (%d bytes) %s rate/try %d %s/%d/%d", + &an->an_node, "%s: size %d (%d bytes) %s rate/try %d %s/%d/%d nframes/nbad [%d/%d]", __func__, bin_to_size(size_to_bin(frame_size)), frame_size, - ts->ts_status ? "FAIL" : "OK", - dot11rate(rt, final_rix), dot11rate_label(rt, final_rix), short_tries, long_tries); + status ? "FAIL" : "OK", + dot11rate(rt, final_rix), dot11rate_label(rt, final_rix), + short_tries, long_tries, nframes, nbad); update_stats(sc, an, frame_size, final_rix, long_tries, 0, 0, 0, 0, 0, 0, - short_tries, long_tries, ts->ts_status); + short_tries, long_tries, status, + nframes, nbad); + update_ewma_stats(sc, an, frame_size, + final_rix, long_tries, + 0, 0, + 0, 0, + 0, 0, + short_tries, long_tries, status, + nframes, nbad); + } else { - int hwrates[4], tries[4], rix[4]; int finalTSIdx = ts->ts_finaltsi; int i; /* * Process intermediate rates that failed. */ - ath_hal_gettxcompletionrates(sc->sc_ah, ds0, hwrates, tries); - - for (i = 0; i < 4; i++) { - rix[i] = rt->rateCodeToIndex[hwrates[i]]; - } IEEE80211_NOTE(an->an_node.ni_vap, IEEE80211_MSG_RATECTL, &an->an_node, -"%s: size %d (%d bytes) finaltsidx %d tries %d %s rate/try [%d %s/%d %d %s/%d %d %s/%d %d %s/%d]", +"%s: size %d (%d bytes) finaltsidx %d tries %d %s rate/try [%d %s/%d %d %s/%d %d %s/%d %d %s/%d] nframes/nbad [%d/%d]", __func__, bin_to_size(size_to_bin(frame_size)), frame_size, finalTSIdx, - long_tries, - ts->ts_status ? "FAIL" : "OK", - dot11rate(rt, rix[0]), dot11rate_label(rt, rix[0]), tries[0], - dot11rate(rt, rix[1]), dot11rate_label(rt, rix[1]), tries[1], - dot11rate(rt, rix[2]), dot11rate_label(rt, rix[2]), tries[2], - dot11rate(rt, rix[3]), dot11rate_label(rt, rix[3]), tries[3]); + long_tries, + status ? "FAIL" : "OK", + dot11rate(rt, rc[0].rix), + dot11rate_label(rt, rc[0].rix), rc[0].tries, + dot11rate(rt, rc[1].rix), + dot11rate_label(rt, rc[1].rix), rc[1].tries, + dot11rate(rt, rc[2].rix), + dot11rate_label(rt, rc[2].rix), rc[2].tries, + dot11rate(rt, rc[3].rix), + dot11rate_label(rt, rc[3].rix), rc[3].tries, + nframes, nbad); for (i = 0; i < 4; i++) { - if (tries[i] && !IS_RATE_DEFINED(sn, rix[i])) - badrate(ifp, 0, hwrates[i], tries[i], ts->ts_status); + if (rc[i].tries && !IS_RATE_DEFINED(sn, rc[i].rix)) + badrate(ifp, 0, rc[i].ratecode, rc[i].tries, + status); } /* @@ -686,48 +965,62 @@ ath_rate_tx_complete(struct ath_softc *s * sample higher rates 1 try at a time doing so * may unfairly penalize them. */ - if (tries[0]) { - update_stats(sc, an, frame_size, - rix[0], tries[0], - rix[1], tries[1], - rix[2], tries[2], - rix[3], tries[3], - short_tries, long_tries, - long_tries > tries[0]); - long_tries -= tries[0]; + if (rc[0].tries) { + update_stats(sc, an, frame_size, + rc[0].rix, rc[0].tries, + rc[1].rix, rc[1].tries, + rc[2].rix, rc[2].tries, + rc[3].rix, rc[3].tries, + short_tries, long_tries, + long_tries > rc[0].tries, + nframes, nbad); + long_tries -= rc[0].tries; } - if (tries[1] && finalTSIdx > 0) { - update_stats(sc, an, frame_size, - rix[1], tries[1], - rix[2], tries[2], - rix[3], tries[3], - 0, 0, - short_tries, long_tries, - ts->ts_status); - long_tries -= tries[1]; + if (rc[1].tries && finalTSIdx > 0) { + update_stats(sc, an, frame_size, + rc[1].rix, rc[1].tries, + rc[2].rix, rc[2].tries, + rc[3].rix, rc[3].tries, + 0, 0, + short_tries, long_tries, + status, + nframes, nbad); + long_tries -= rc[1].tries; } - if (tries[2] && finalTSIdx > 1) { - update_stats(sc, an, frame_size, - rix[2], tries[2], - rix[3], tries[3], + if (rc[2].tries && finalTSIdx > 1) { + update_stats(sc, an, frame_size, + rc[2].rix, rc[2].tries, + rc[3].rix, rc[3].tries, 0, 0, 0, 0, - short_tries, long_tries, - ts->ts_status); - long_tries -= tries[2]; + short_tries, long_tries, + status, + nframes, nbad); + long_tries -= rc[2].tries; } - if (tries[3] && finalTSIdx > 2) { - update_stats(sc, an, frame_size, - rix[3], tries[3], + if (rc[3].tries && finalTSIdx > 2) { + update_stats(sc, an, frame_size, + rc[3].rix, rc[3].tries, 0, 0, 0, 0, 0, 0, - short_tries, long_tries, - ts->ts_status); + short_tries, long_tries, + status, + nframes, nbad); } + + update_ewma_stats(sc, an, frame_size, + rc[0].rix, rc[0].tries, + rc[1].rix, rc[1].tries, + rc[2].rix, rc[2].tries, + rc[3].rix, rc[3].tries, + short_tries, long_tries, + long_tries > rc[0].tries, + nframes, nbad); + } } @@ -849,6 +1142,7 @@ ath_rate_ctl_reset(struct ath_softc *sc, sn->stats[y][rix].total_packets = 0; sn->stats[y][rix].packets_acked = 0; sn->stats[y][rix].last_tx = 0; + sn->stats[y][rix].ewma_pct = 0; sn->stats[y][rix].perfect_tx_time = calc_usecs_unicast_packet(sc, size, rix, 0, 0, @@ -886,18 +1180,24 @@ sample_stats(void *arg, struct ieee80211 uint32_t mask; int rix, y; - printf("\n[%s] refcnt %d static_rix %d ratemask 0x%x\n", + printf("\n[%s] refcnt %d static_rix (%d %s) ratemask 0x%x\n", ether_sprintf(ni->ni_macaddr), ieee80211_node_refcnt(ni), - sn->static_rix, sn->ratemask); + dot11rate(rt, sn->static_rix), + dot11rate_label(rt, sn->static_rix), + sn->ratemask); for (y = 0; y < NUM_PACKET_SIZE_BINS; y++) { printf("[%4u] cur rix %d (%d %s) since switch: packets %d ticks %u\n", bin_to_size(y), sn->current_rix[y], dot11rate(rt, sn->current_rix[y]), dot11rate_label(rt, sn->current_rix[y]), sn->packets_since_switch[y], sn->ticks_since_switch[y]); - printf("[%4u] last sample %d cur sample %d packets sent %d\n", - bin_to_size(y), sn->last_sample_rix[y], - sn->current_sample_rix[y], sn->packets_sent[y]); + printf("[%4u] last sample (%d %s) cur sample (%d %s) packets sent %d\n", + bin_to_size(y), + dot11rate(rt, sn->last_sample_rix[y]), + dot11rate_label(rt, sn->last_sample_rix[y]), + dot11rate(rt, sn->current_sample_rix[y]), + dot11rate_label(rt, sn->current_sample_rix[y]), + sn->packets_sent[y]); printf("[%4u] packets since sample %d sample tt %u\n", bin_to_size(y), sn->packets_since_sample[y], sn->sample_tt[y]); @@ -908,13 +1208,15 @@ sample_stats(void *arg, struct ieee80211 for (y = 0; y < NUM_PACKET_SIZE_BINS; y++) { if (sn->stats[y][rix].total_packets == 0) continue; - printf("[%2u %s:%4u] %8ju:%-8ju (%3d%%) T %8ju F %4d avg %5u last %u\n", + printf("[%2u %s:%4u] %8ju:%-8ju (%3d%%) (EWMA %3d.%1d%%) T %8ju F %4d avg %5u last %u\n", dot11rate(rt, rix), dot11rate_label(rt, rix), bin_to_size(y), (uintmax_t) sn->stats[y][rix].total_packets, (uintmax_t) sn->stats[y][rix].packets_acked, (int) ((sn->stats[y][rix].packets_acked * 100ULL) / sn->stats[y][rix].total_packets), + sn->stats[y][rix].ewma_pct / 10, + sn->stats[y][rix].ewma_pct % 10, (uintmax_t) sn->stats[y][rix].tries, sn->stats[y][rix].successive_failures, sn->stats[y][rix].average_tx_time, Modified: head/sys/dev/ath/ath_rate/sample/sample.h ============================================================================== --- head/sys/dev/ath/ath_rate/sample/sample.h Tue Nov 8 22:01:03 2011 (r227363) +++ head/sys/dev/ath/ath_rate/sample/sample.h Tue Nov 8 22:43:13 2011 (r227364) @@ -51,6 +51,7 @@ struct sample_softc { int max_successive_failures; int stale_failure_timeout; /* how long to honor max_successive_failures */ int min_switch; /* min time between rate changes */ + int min_good_pct; /* min good percentage for a rate to be considered */ }; #define ATH_SOFTC_SAMPLE(sc) ((struct sample_softc *)sc->sc_rc) @@ -60,6 +61,7 @@ struct rate_stats { uint64_t tries; uint64_t total_packets; /* pkts total since assoc */ uint64_t packets_acked; /* pkts acked since assoc */ + int ewma_pct; /* EWMA percentage */ unsigned perfect_tx_time; /* transmit time for 0 retries */ int last_tx; }; Modified: head/sys/dev/ath/if_ath.c ============================================================================== --- head/sys/dev/ath/if_ath.c Tue Nov 8 22:01:03 2011 (r227363) +++ head/sys/dev/ath/if_ath.c Tue Nov 8 22:43:13 2011 (r227364) @@ -723,6 +723,19 @@ ath_attach(u_int16_t devid, struct ath_s ic->ic_scan_end = ath_scan_end; ic->ic_set_channel = ath_set_channel; + /* 802.11n specific - but just override anyway */ + sc->sc_addba_request = ic->ic_addba_request; + sc->sc_addba_response = ic->ic_addba_response; + sc->sc_addba_stop = ic->ic_addba_stop; + sc->sc_bar_response = ic->ic_bar_response; + sc->sc_addba_response_timeout = ic->ic_addba_response_timeout; + + ic->ic_addba_request = ath_addba_request; + ic->ic_addba_response = ath_addba_response; + ic->ic_addba_response_timeout = ath_addba_response_timeout; + ic->ic_addba_stop = ath_addba_stop; + ic->ic_bar_response = ath_bar_response; + ieee80211_radiotap_attach(ic, &sc->sc_tx_th.wt_ihdr, sizeof(sc->sc_tx_th), ATH_TX_RADIOTAP_PRESENT, @@ -3343,6 +3356,9 @@ ath_node_alloc(struct ieee80211vap *vap, device_get_nameunit(sc->sc_dev), an); mtx_init(&an->an_mtx, an->an_name, NULL, MTX_DEF); + /* XXX setup ath_tid */ + ath_tx_tid_init(sc, an); + DPRINTF(sc, ATH_DEBUG_NODE, "%s: an %p\n", __func__, an); return &an->an_node; } @@ -3354,6 +3370,7 @@ ath_node_cleanup(struct ieee80211_node * struct ath_softc *sc = ic->ic_ifp->if_softc; /* Cleanup ath_tid, free unused bufs, unlink bufs in TXQ */ + ath_tx_node_flush(sc, ATH_NODE(ni)); ath_rate_node_cleanup(sc, ATH_NODE(ni)); sc->sc_node_cleanup(ni); } @@ -4364,6 +4381,29 @@ ath_tx_default_comp(struct ath_softc *sc } /* + * Update rate control with the given completion status. + */ +void +ath_tx_update_ratectrl(struct ath_softc *sc, struct ieee80211_node *ni, + struct ath_rc_series *rc, struct ath_tx_status *ts, int frmlen, + int nframes, int nbad) +{ + struct ath_node *an; + + /* Only for unicast frames */ + if (ni == NULL) + return; + + an = ATH_NODE(ni); + + if ((ts->ts_status & HAL_TXERR_FILT) == 0) { + ATH_NODE_LOCK(an); + ath_rate_tx_complete(sc, an, rc, ts, frmlen, nframes, nbad); + ATH_NODE_UNLOCK(an); + } +} + +/* * Update the busy status of the last frame on the free list. * When doing TDMA, the busy flag tracks whether the hardware * currently points to this buffer or not, and thus gated DMA @@ -4396,6 +4436,8 @@ ath_tx_update_busy(struct ath_softc *sc) /* * Process completed xmit descriptors from the specified queue. + * Kick the packet scheduler if needed. This can occur from this + * particular task. */ static int ath_tx_processq(struct ath_softc *sc, struct ath_txq *txq, int dosched) @@ -4405,6 +4447,7 @@ ath_tx_processq(struct ath_softc *sc, st struct ath_desc *ds; struct ath_tx_status *ts; struct ieee80211_node *ni; + struct ath_node *an; int nacked; HAL_STATUS status; @@ -4471,6 +4514,7 @@ ath_tx_processq(struct ath_softc *sc, st /* If unicast frame, update general statistics */ if (ni != NULL) { + an = ATH_NODE(ni); /* update statistics */ ath_tx_update_stats(sc, ts, bf); } @@ -4490,7 +4534,10 @@ ath_tx_processq(struct ath_softc *sc, st * XXX assume this isn't an aggregate * frame. */ - ath_rate_tx_complete(sc, ATH_NODE(ni), bf); + ath_tx_update_ratectrl(sc, ni, + bf->bf_state.bfs_rc, ts, + bf->bf_state.bfs_pktlen, 1, + (ts->ts_status == 0 ? 0 : 1)); } ath_tx_default_comp(sc, bf, 0); } else @@ -4503,6 +4550,14 @@ ath_tx_processq(struct ath_softc *sc, st if (txq->axq_depth <= 1) ieee80211_ff_flush(ic, txq->axq_ac); #endif + + /* Kick the TXQ scheduler */ + if (dosched) { + ATH_TXQ_LOCK(txq); + ath_txq_sched(sc, txq); + ATH_TXQ_UNLOCK(txq); + } + return nacked; } @@ -4736,6 +4791,11 @@ ath_tx_draintxq(struct ath_softc *sc, st ath_tx_default_comp(sc, bf, 1); } + /* + * Drain software queued frames which are on + * active TIDs. + */ + ath_tx_txq_drain(sc, txq); } static void Modified: head/sys/dev/ath/if_ath_misc.h ============================================================================== --- head/sys/dev/ath/if_ath_misc.h Tue Nov 8 22:01:03 2011 (r227363) *** DIFF OUTPUT TRUNCATED AT 1000 LINES ***