From owner-svn-src-head@freebsd.org Fri May 15 18:51:22 2020 Return-Path: Delivered-To: svn-src-head@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id 0D4A12FD655; Fri, 15 May 2020 18:51:22 +0000 (UTC) (envelope-from adrian@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) server-signature RSA-PSS (4096 bits) client-signature RSA-PSS (4096 bits) client-digest SHA256) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 49NyDT6zc0z4HLw; Fri, 15 May 2020 18:51:21 +0000 (UTC) (envelope-from adrian@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id EA926F1C; Fri, 15 May 2020 18:51:21 +0000 (UTC) (envelope-from adrian@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id 04FIpL1r010433; Fri, 15 May 2020 18:51:21 GMT (envelope-from adrian@FreeBSD.org) Received: (from adrian@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id 04FIpKgJ010425; Fri, 15 May 2020 18:51:20 GMT (envelope-from adrian@FreeBSD.org) Message-Id: <202005151851.04FIpKgJ010425@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: adrian set sender to adrian@FreeBSD.org using -f From: Adrian Chadd Date: Fri, 15 May 2020 18:51:20 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org Subject: svn commit: r361085 - in head/sys/dev/ath: . ath_hal/ar5416 ath_rate/amrr ath_rate/onoe ath_rate/sample X-SVN-Group: head X-SVN-Commit-Author: adrian X-SVN-Commit-Paths: in head/sys/dev/ath: . ath_hal/ar5416 ath_rate/amrr ath_rate/onoe ath_rate/sample X-SVN-Commit-Revision: 361085 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-head@freebsd.org X-Mailman-Version: 2.1.33 Precedence: list List-Id: SVN commit messages for the src tree for head/-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 15 May 2020 18:51:22 -0000 Author: adrian Date: Fri May 15 18:51:20 2020 New Revision: 361085 URL: https://svnweb.freebsd.org/changeset/base/361085 Log: [ath] [ath_rate] Extend ath_rate_sample to better handle 11n rates and aggregates. My initial rate control code was .. suboptimal. I wanted to at least get MCS rates sent, but it didn't do anywhere near enough to handle low signal level links or remotely keep accurate statistics. So, 8 years later, here's what I should've done back then. * Firstly, I wasn't at all tracking packet sizes other than the two buckets (250 and 1600 bytes.) So, extend it to include 4096, 8192, 16384, 32768 and 65536. I may go add 2048 at some point if I find it's useful. This is important for a few reasons. First, when forming A-MPDU or AMSDU aggregates the frame sizes are larger, and thus the TX time calculation is woefully, increasingly wrong. Secondly, the behaviour of 802.11 channels isn't some fixed thing, both due to channel conditions and radios themselves. Notably, there was some observations done a few years ago on 11n chipsets which noticed longer aggregates showed an increase in failed A-MPDU sub-frame reception as you got further along in the transmit time. It could be due to a variety of things - transmitter linearity, channel conditions changing, frequency/phase drift, etc - but the observation was to potentially form shorter aggregates to improve BER. * .. and then modify the ath TX path to report the length of the aggregate sent, so as the statistics kept would line up with the correct bucket. * Then on the rate control look-up side - i was also only using the first frame length for an A-MPDU rate control lookup which isn't good enough here. So, add a new method that walks the TID software queue for that node to find out what the likely length of data available is. It isn't ALL of the data in the queue because we'll only ever send enough data to fit inside the block-ack window, so limit how many bytes we return to roughly what ath_tx_form_aggr() would do. * .. and cache that in the first ath_buf in the aggregate so it and the eventual AMPDU length can be returned to the rate control code. * THEN, modify the rate control code to look at them both when deciding which bucket to attribute the sent frame on. I'm erring on the side of caution and using the size bucket that the lookup is based on. Ok, so now the rate lookups and statistics are "more correct". However, MCS rates are not the same as 11abg rates in that they're not a monotonically incrementing set of faster rates and you can't assume that just because a given MCS rate fails, the next higher one wouldn't work better or be a lower average tx time. So, I had to do a bunch of surgery to the best rate and sample rate math. This is the bit that's a WIP. * First, simplify the statistics updates (update_stats()) to do a single pass on all rates. * Next, make sure that each rate average tx time is updated based on /its/ failure/success. Eg if you sent a frame with { MCS15, MCS12, MCS8 } and MCS8 succeeded, MCS15 and MCS 12 would have their average tx time updated for /their/ part of the transmission, not the whole transmission. * Next, EWMA wasn't being fully calculated based on the /failures/ in each of the rate attempts. So, if MCS15, MCS12 failed above but MCS8 didn't, then ensure that the statistics noted that /all/ subframes failed at those rates, rather than the eventual set of transmitted/sent frames. This ensures the EWMA /and/ average TX time are updated correctly. * When picking a sample rate and initial rate, probe rates aroud the current MCS but limit it to MCS0..7 /for all spatial streams/, rather than doing crazy things like hitting MCS7 and then probing MCS8 - MCS8 is basically MCS0 but two spatial streams. It's a /lot/ slower than MCS7. Also, the reverse is true - if we're at MCS8 then don't probe MCS7 as part of it, it's not likely to succeed. * Fix bugs in pick_best_rate() where I was /immediately/ choosing the highest MCS rate if there weren't any frames yet transmitted. I was defaulting to 25% EWMA and .. then each comparison would accept the higher rate. Just skip those; sampling will fill in the details. So, this seems to work a lot better. It's not perfect; I'm still seeing a lot of instability around higher MCS rates because there are bursts of loss/retransmissions that aren't /too/ bad. But i'll keep iterating over this and tidying up my hacks. Ok, so why this still something I'm poking at? rather than porting minstrel_ht? ath_rate_sample tries to minimise airtime, not maximise throughput. I have extended it with an EWMA based on sub-frame success/failures - high MCS rates that have partially successful receptions still show super short average frame times, but a /lot/ of retransmits have to happen for that to work. So for MCS rates I also track this EWMA and ensure that the rates I'm choosing don't have super crappy packet failures. I don't mind not getting lower peak throughput versus minstrel_ht; instead I want to see if I can make "minimise airtime" work well. Tested: * AR9380, STA mode * AR9344, STA mode * AR9580, STA/AP mode Modified: head/sys/dev/ath/ath_hal/ar5416/ar2133.c head/sys/dev/ath/ath_rate/amrr/amrr.c head/sys/dev/ath/ath_rate/onoe/onoe.c head/sys/dev/ath/ath_rate/sample/sample.c head/sys/dev/ath/ath_rate/sample/sample.h head/sys/dev/ath/if_ath.c head/sys/dev/ath/if_ath_misc.h head/sys/dev/ath/if_ath_tx.c head/sys/dev/ath/if_ath_tx_ht.c head/sys/dev/ath/if_athrate.h head/sys/dev/ath/if_athvar.h Modified: head/sys/dev/ath/ath_hal/ar5416/ar2133.c ============================================================================== --- head/sys/dev/ath/ath_hal/ar5416/ar2133.c Fri May 15 17:37:08 2020 (r361084) +++ head/sys/dev/ath/ath_hal/ar5416/ar2133.c Fri May 15 18:51:20 2020 (r361085) @@ -419,6 +419,7 @@ ar2133GetChannelMaxMinPower(struct ath_hal *ah, return(AH_FALSE); } #else + // XXX TODO: actually go implement for 11n chips! *maxPow = *minPow = 0; return AH_FALSE; #endif Modified: head/sys/dev/ath/ath_rate/amrr/amrr.c ============================================================================== --- head/sys/dev/ath/ath_rate/amrr/amrr.c Fri May 15 17:37:08 2020 (r361084) +++ head/sys/dev/ath/ath_rate/amrr/amrr.c Fri May 15 18:51:20 2020 (r361085) @@ -104,8 +104,9 @@ ath_rate_node_cleanup(struct ath_softc *sc, struct ath void ath_rate_findrate(struct ath_softc *sc, struct ath_node *an, - int shortPreamble, size_t frameLen, int tid, bool is_aggr, - u_int8_t *rix, int *try0, u_int8_t *txrate, int *maxdur) + int shortPreamble, size_t frameLen, int tid, int is_aggr, + u_int8_t *rix, int *try0, u_int8_t *txrate, int *maxdur, + int *maxpktlen) { struct amrr_node *amn = ATH_NODE_AMRR(an); @@ -116,6 +117,7 @@ ath_rate_findrate(struct ath_softc *sc, struct ath_nod else *txrate = amn->amn_tx_rate0; maxdur = -1; + maxpktlen = -1; } /* @@ -160,7 +162,7 @@ ath_rate_setupxtxdesc(struct ath_softc *sc, struct ath void ath_rate_tx_complete(struct ath_softc *sc, struct ath_node *an, const struct ath_rc_series *rc, const struct ath_tx_status *ts, - int frame_size, int nframes, int nbad) + int frame_size, int rc_framesize, int nframes, int nbad) { struct amrr_node *amn = ATH_NODE_AMRR(an); int sr = ts->ts_shortretry; Modified: head/sys/dev/ath/ath_rate/onoe/onoe.c ============================================================================== --- head/sys/dev/ath/ath_rate/onoe/onoe.c Fri May 15 17:37:08 2020 (r361084) +++ head/sys/dev/ath/ath_rate/onoe/onoe.c Fri May 15 18:51:20 2020 (r361085) @@ -112,8 +112,9 @@ ath_rate_node_cleanup(struct ath_softc *sc, struct ath void ath_rate_findrate(struct ath_softc *sc, struct ath_node *an, - int shortPreamble, size_t frameLen, int tid, bool is_aggr, - u_int8_t *rix, int *try0, u_int8_t *txrate, int *maxdur) + int shortPreamble, size_t frameLen, int tid, int is_aggr, + u_int8_t *rix, int *try0, u_int8_t *txrate, int *maxdur, + int *maxpktlen) { struct onoe_node *on = ATH_NODE_ONOE(an); @@ -124,6 +125,7 @@ ath_rate_findrate(struct ath_softc *sc, struct ath_nod else *txrate = on->on_tx_rate0; *maxdur = -1; + *maxpktlen = -1; } /* @@ -167,7 +169,7 @@ ath_rate_setupxtxdesc(struct ath_softc *sc, struct ath void ath_rate_tx_complete(struct ath_softc *sc, struct ath_node *an, const struct ath_rc_series *rc, const struct ath_tx_status *ts, - int frame_size, int nframes, int nbad) + int frame_size, int rc_framesize, int nframes, int nbad) { struct onoe_node *on = ATH_NODE_ONOE(an); Modified: head/sys/dev/ath/ath_rate/sample/sample.c ============================================================================== --- head/sys/dev/ath/ath_rate/sample/sample.c Fri May 15 17:37:08 2020 (r361084) +++ head/sys/dev/ath/ath_rate/sample/sample.c Fri May 15 18:51:20 2020 (r361085) @@ -107,6 +107,117 @@ __FBSDID("$FreeBSD$"); * a few different packet sizes independently for each link. */ +/* XXX TODO: move this into ath_hal/net80211 so it can be shared */ + +#define MCS_HT20 0 +#define MCS_HT20_SGI 1 +#define MCS_HT40 2 +#define MCS_HT40_SGI 3 + +/* + * This is currently a copy/paste from the 11n tx code. + * + * It's used to determine the maximum frame length allowed for the + * given rate. For now this ignores SGI/LGI and will assume long-GI. + * This only matters for lower rates that can't fill a full 64k A-MPDU. + * + * (But it's also important because right now rate control doesn't set + * flags like SGI/LGI, STBC, LDPC, TX power, etc.) + * + * When selecting a set of rates the rate control code will iterate + * over the HT20/HT40 max frame length and tell the caller the maximum + * length (@ LGI.) It will also choose a bucket that's the minimum + * of this value and the provided aggregate length. That way the + * rate selection will closely match what the eventual formed aggregate + * will be rather than "not at all". + */ + +static int ath_rate_sample_max_4ms_framelen[4][32] = { + [MCS_HT20] = { + 3212, 6432, 9648, 12864, 19300, 25736, 28952, 32172, + 6424, 12852, 19280, 25708, 38568, 51424, 57852, 64280, + 9628, 19260, 28896, 38528, 57792, 65532, 65532, 65532, + 12828, 25656, 38488, 51320, 65532, 65532, 65532, 65532, + }, + [MCS_HT20_SGI] = { + 3572, 7144, 10720, 14296, 21444, 28596, 32172, 35744, + 7140, 14284, 21428, 28568, 42856, 57144, 64288, 65532, + 10700, 21408, 32112, 42816, 64228, 65532, 65532, 65532, + 14256, 28516, 42780, 57040, 65532, 65532, 65532, 65532, + }, + [MCS_HT40] = { + 6680, 13360, 20044, 26724, 40092, 53456, 60140, 65532, + 13348, 26700, 40052, 53400, 65532, 65532, 65532, 65532, + 20004, 40008, 60016, 65532, 65532, 65532, 65532, 65532, + 26644, 53292, 65532, 65532, 65532, 65532, 65532, 65532, + }, + [MCS_HT40_SGI] = { + 7420, 14844, 22272, 29696, 44544, 59396, 65532, 65532, + 14832, 29668, 44504, 59340, 65532, 65532, 65532, 65532, + 22232, 44464, 65532, 65532, 65532, 65532, 65532, 65532, + 29616, 59232, 65532, 65532, 65532, 65532, 65532, 65532, + } +}; + +/* + * Given the (potentially MRR) transmit schedule, calculate the maximum + * allowed packet size for forming aggregates based on the lowest + * MCS rate in the transmit schedule. + * + * Returns -1 if it's a legacy rate or no MRR. + */ +static int +ath_rate_sample_find_min_pktlength(struct ath_softc *sc, + struct ath_node *an, uint8_t rix0) +{ +#define MCS_IDX(ix) (rt->info[ix].dot11Rate) + const HAL_RATE_TABLE *rt = sc->sc_currates; + struct sample_node *sn = ATH_NODE_SAMPLE(an); + const struct txschedule *sched = &sn->sched[rix0]; + int max_pkt_length = 65530; // ATH_AGGR_MAXSIZE + // Note: this may not be true in all cases; need to check? + int is_ht40 = (an->an_node.ni_chw == 40); + // Note: not great, but good enough.. + int idx = is_ht40 ? MCS_HT40 : MCS_HT20; + + if (rt->info[rix0].phy != IEEE80211_T_HT) { + return -1; + } + + if (! sc->sc_mrretry) { + return -1; + } + + KASSERT(rix0 == sched->r0, ("rix0 (%x) != sched->r0 (%x)!\n", + rix0, sched->r0)); + + /* + * Update based on sched->r{0,1,2,3} if sched->t{0,1,2,3} + * is not zero. + * + * Note: assuming all four PHYs are HT! + */ + if (sched->t0 != 0) { + max_pkt_length = MIN(max_pkt_length, + ath_rate_sample_max_4ms_framelen[idx][MCS_IDX(sched->r0)]); + } + if (sched->t1 != 0) { + max_pkt_length = MIN(max_pkt_length, + ath_rate_sample_max_4ms_framelen[idx][MCS_IDX(sched->r1)]); + } + if (sched->t2 != 0) { + max_pkt_length = MIN(max_pkt_length, + ath_rate_sample_max_4ms_framelen[idx][MCS_IDX(sched->r2)]); + } + if (sched->t3 != 0) { + max_pkt_length = MIN(max_pkt_length, + ath_rate_sample_max_4ms_framelen[idx][MCS_IDX(sched->r3)]); + } + + return max_pkt_length; +#undef MCS +} + static void ath_rate_ctl_reset(struct ath_softc *, struct ieee80211_node *); static __inline int @@ -125,6 +236,22 @@ size_to_bin(int size) return 2; #endif #if NUM_PACKET_SIZE_BINS > 4 + if (size <= packet_size_bins[3]) + return 3; +#endif +#if NUM_PACKET_SIZE_BINS > 5 + if (size <= packet_size_bins[4]) + return 4; +#endif +#if NUM_PACKET_SIZE_BINS > 6 + if (size <= packet_size_bins[5]) + return 5; +#endif +#if NUM_PACKET_SIZE_BINS > 7 + if (size <= packet_size_bins[6]) + return 6; +#endif +#if NUM_PACKET_SIZE_BINS > 8 #error "add support for more packet sizes" #endif return NUM_PACKET_SIZE_BINS-1; @@ -167,12 +294,12 @@ pick_best_rate(struct ath_node *an, const HAL_RATE_TAB int size_bin, int require_acked_before) { struct sample_node *sn = ATH_NODE_SAMPLE(an); - int best_rate_rix, best_rate_tt, best_rate_pct; + int best_rate_rix, best_rate_tt, best_rate_pct; uint64_t mask; int rix, tt, pct; - best_rate_rix = 0; - best_rate_tt = 0; + best_rate_rix = 0; + best_rate_tt = 0; best_rate_pct = 0; for (mask = sn->ratemask, rix = 0; mask != 0; mask >>= 1, rix++) { if ((mask & 1) == 0) /* not a supported rate */ @@ -194,8 +321,7 @@ pick_best_rate(struct ath_node *an, const HAL_RATE_TAB if (sn->stats[size_bin][rix].total_packets > 0) { pct = sn->stats[size_bin][rix].ewma_pct; } else { - /* XXX for now, assume 95% ok */ - pct = 95; + pct = -1; /* No percent yet to compare against! */ } /* don't use a bit-rate that has been failing */ @@ -203,18 +329,36 @@ pick_best_rate(struct ath_node *an, const HAL_RATE_TAB continue; /* - * For HT, Don't use a bit rate that is much more - * lossy than the best. + * For HT, Don't use a bit rate that is more + * lossy than the best. Give a bit of leeway. * - * XXX this isn't optimal; it's just designed to - * eliminate rates that are going to be obviously - * worse. + * Don't consider best rates that we haven't seen + * packets for yet; let sampling start inflence that. */ if (an->an_node.ni_flags & IEEE80211_NODE_HT) { + if (pct == -1) + continue; +#if 0 + IEEE80211_NOTE(an->an_node.ni_vap, + IEEE80211_MSG_RATECTL, + &an->an_node, + "%s: size %d comparing best rate 0x%x pkts/ewma/tt (%ju/%d/%d) " + "to 0x%x pkts/ewma/tt (%ju/%d/%d)", + __func__, + bin_to_size(size_bin), + rt->info[best_rate_rix].dot11Rate, + sn->stats[size_bin][best_rate_rix].total_packets, + best_rate_pct, + best_rate_tt, + rt->info[rix].dot11Rate, + sn->stats[size_bin][rix].total_packets, + pct, + tt); +#endif if (best_rate_pct > (pct + 50)) continue; } - +#if 1 /* * For non-MCS rates, use the current average txtime for * comparison. @@ -228,19 +372,19 @@ pick_best_rate(struct ath_node *an, const HAL_RATE_TAB } /* - * Since 2 stream rates have slightly higher TX times, + * Since 2 and 3 stream rates have slightly higher TX times, * allow a little bit of leeway. This should later * be abstracted out and properly handled. */ if (an->an_node.ni_flags & IEEE80211_NODE_HT) { - if (best_rate_tt == 0 || (tt * 8 <= best_rate_tt * 10)) { + if (best_rate_tt == 0 || ((tt * 10) <= (best_rate_tt * 10))) { best_rate_tt = tt; best_rate_rix = rix; best_rate_pct = pct; } } - } - return (best_rate_tt ? best_rate_rix : -1); + } + return (best_rate_tt ? best_rate_rix : -1); } /* @@ -260,7 +404,7 @@ pick_sample_rate(struct sample_softc *ssc , struct ath current_rix = sn->current_rix[size_bin]; if (current_rix < 0) { /* no successes yet, send at the lowest bit-rate */ - /* XXX should return MCS0 if HT */ + /* XXX TODO should return MCS0 if HT */ return 0; } @@ -316,10 +460,22 @@ pick_sample_rate(struct sample_softc *ssc , struct ath /* * For HT, only sample a few rates on either side of the * current rix; there's quite likely a lot of them. + * + * This is limited to testing rate indexes on either side of + * this MCS, but for all spatial streams. + * + * Otherwise we'll (a) never really sample higher MCS + * rates if we're stuck low, and we'll make weird moves + * like sample MCS8 if we're using MCS7. */ if (an->an_node.ni_flags & IEEE80211_NODE_HT) { - if (rix < (current_rix - 3) || - rix > (current_rix + 3)) { + uint8_t current_mcs, rix_mcs; + + current_mcs = MCS(current_rix) & 0x7; + rix_mcs = MCS(rix) & 0x7; + + if (rix_mcs < (current_mcs - 2) || + rix_mcs > (current_mcs + 2)) { mask &= ~((uint64_t) 1<stats[size_bin][rix].successive_failures == 0) { break; } @@ -483,8 +639,8 @@ ath_rate_pick_seed_rate_ht(struct ath_softc *sc, struc void ath_rate_findrate(struct ath_softc *sc, struct ath_node *an, int shortPreamble, size_t frameLen, int tid, - bool is_aggr, u_int8_t *rix0, int *try0, - u_int8_t *txrate, int *maxdur) + int is_aggr, u_int8_t *rix0, int *try0, + u_int8_t *txrate, int *maxdur, int *maxpktlen) { #define DOT11RATE(ix) (rt->info[ix].dot11Rate & IEEE80211_RATE_VAL) #define MCS(ix) (rt->info[ix].dot11Rate | IEEE80211_RATE_MCS) @@ -493,9 +649,10 @@ ath_rate_findrate(struct ath_softc *sc, struct ath_nod struct sample_softc *ssc = ATH_SOFTC_SAMPLE(sc); struct ieee80211com *ic = &sc->sc_ic; const HAL_RATE_TABLE *rt = sc->sc_currates; - const int size_bin = size_to_bin(frameLen); + int size_bin = size_to_bin(frameLen); int rix, mrr, best_rix, change_rates; unsigned average_tx_time; + int max_pkt_len; ath_rate_update_static_rix(sc, &an->an_node); @@ -503,6 +660,11 @@ ath_rate_findrate(struct ath_softc *sc, struct ath_nod /* Also for now don't calculate a max duration; that'll come later */ *maxdur = -1; + /* + * For now just set it to the frame length; we'll optimise it later. + */ + *maxpktlen = frameLen; + if (sn->currates != sc->sc_currates) { device_printf(sc->sc_dev, "%s: currates != sc_currates!\n", __func__); @@ -519,15 +681,34 @@ ath_rate_findrate(struct ath_softc *sc, struct ath_nod mrr = sc->sc_mrretry; /* XXX check HT protmode too */ + /* XXX turn into a cap; 11n MACs support MRR+RTSCTS */ if (mrr && (ic->ic_flags & IEEE80211_F_USEPROT && !sc->sc_mrrprot)) mrr = 0; best_rix = pick_best_rate(an, rt, size_bin, !mrr); + + /* + * At this point we've chosen the best rix, so now we + * need to potentially update our maximum packet length + * and size_bin if we're doing 11n rates. + */ + max_pkt_len = ath_rate_sample_find_min_pktlength(sc, an, best_rix); + if (max_pkt_len > 0) { +#if 0 + device_printf(sc->sc_dev, + "Limiting maxpktlen from %d to %d bytes\n", + (int) frameLen, max_pkt_len); +#endif + *maxpktlen = frameLen = MIN(frameLen, max_pkt_len); + size_bin = size_to_bin(frameLen); + } + if (best_rix >= 0) { average_tx_time = sn->stats[size_bin][best_rix].average_tx_time; } else { average_tx_time = 0; } + /* * Limit the time measuring the performance of other tx * rates to sample_rate% of the total transmission time. @@ -586,9 +767,9 @@ ath_rate_findrate(struct ath_softc *sc, struct ath_nod int cur_rix = sn->current_rix[size_bin]; int cur_att = sn->stats[size_bin][cur_rix].average_tx_time; /* - * If the node is HT, upgrade it if the MCS rate is - * higher and the average tx time is within 20% of - * the current rate. It can fail a little. + * If the node is HT, upgrade it if the MCS rate without + * the stream is higher and the average tx time is + * within 10% of the current rate. It can fail a little. * * This is likely not optimal! */ @@ -596,13 +777,20 @@ ath_rate_findrate(struct ath_softc *sc, struct ath_nod printf("cur rix/att %x/%d, best rix/att %x/%d\n", MCS(cur_rix), cur_att, MCS(best_rix), average_tx_time); #endif - if ((MCS(best_rix) > MCS(cur_rix)) && - (average_tx_time * 8) <= (cur_att * 10)) { +#if 0 + if (((MCS(best_rix) & 0x7) > (MCS(cur_rix) & 0x7)) && + (average_tx_time * 10) <= (cur_att * 10)) { +#else + if ((average_tx_time * 10) <= (cur_att * 10)) { +#endif IEEE80211_NOTE(an->an_node.ni_vap, IEEE80211_MSG_RATECTL, &an->an_node, - "%s: HT: best_rix 0x%d > cur_rix 0x%x, average_tx_time %d, cur_att %d", - __func__, - MCS(best_rix), MCS(cur_rix), average_tx_time, cur_att); + "%s: HT: size %d best_rix 0x%x > " + " cur_rix 0x%x, average_tx_time %d," + " cur_att %d", + __func__, bin_to_size(size_bin), + MCS(best_rix), MCS(cur_rix), + average_tx_time, cur_att); change_rates = 1; } } @@ -614,15 +802,19 @@ ath_rate_findrate(struct ath_softc *sc, struct ath_nod IEEE80211_NOTE(an->an_node.ni_vap, IEEE80211_MSG_RATECTL, &an->an_node, -"%s: size %d switch rate %d (%d/%d) -> %d (%d/%d) after %d packets mrr %d", +"%s: size %d switch rate %d %s (%d/%d) EWMA %d -> %d %s (%d/%d) EWMA %d after %d packets mrr %d", __func__, bin_to_size(size_bin), - RATE(sn->current_rix[size_bin]), + dot11rate(rt, sn->current_rix[size_bin]), + dot11rate_label(rt, sn->current_rix[size_bin]), sn->stats[size_bin][sn->current_rix[size_bin]].average_tx_time, sn->stats[size_bin][sn->current_rix[size_bin]].perfect_tx_time, - RATE(best_rix), + sn->stats[size_bin][sn->current_rix[size_bin]].ewma_pct, + dot11rate(rt, best_rix), + dot11rate_label(rt, best_rix), sn->stats[size_bin][best_rix].average_tx_time, sn->stats[size_bin][best_rix].perfect_tx_time, + sn->stats[size_bin][best_rix].ewma_pct, sn->packets_since_switch[size_bin], mrr); } @@ -659,6 +851,7 @@ done: *txrate = rt->info[rix].rateCode | (shortPreamble ? rt->info[rix].shortPreamble : 0); sn->packets_sent[size_bin]++; + #undef DOT11RATE #undef MCS #undef RATE @@ -720,9 +913,6 @@ static void update_stats(struct ath_softc *sc, struct ath_node *an, int frame_size, int rix0, int tries0, - int rix1, int tries1, - int rix2, int tries2, - int rix3, int tries3, int short_tries, int tries, int status, int nframes, int nbad) { @@ -739,33 +929,20 @@ update_stats(struct ath_softc *sc, struct ath_node *an if (!IS_RATE_DEFINED(sn, rix0)) return; + + /* + * If status is FAIL then we treat all frames as bad. + * This better accurately tracks EWMA and average TX time + * because even if the eventual transmission succeeded, + * transmission at this rate did not. + */ + if (status != 0) + nbad = nframes; + tt = calc_usecs_unicast_packet(sc, size, rix0, short_tries, MIN(tries0, tries) - 1, is_ht40); tries_so_far = tries0; - if (tries1 && tries_so_far < tries) { - if (!IS_RATE_DEFINED(sn, rix1)) - return; - tt += calc_usecs_unicast_packet(sc, size, rix1, short_tries, - MIN(tries1 + tries_so_far, tries) - tries_so_far - 1, is_ht40); - tries_so_far += tries1; - } - - if (tries2 && tries_so_far < tries) { - if (!IS_RATE_DEFINED(sn, rix2)) - return; - tt += calc_usecs_unicast_packet(sc, size, rix2, short_tries, - MIN(tries2 + tries_so_far, tries) - tries_so_far - 1, is_ht40); - tries_so_far += tries2; - } - - if (tries3 && tries_so_far < tries) { - if (!IS_RATE_DEFINED(sn, rix3)) - return; - tt += calc_usecs_unicast_packet(sc, size, rix3, short_tries, - MIN(tries3 + tries_so_far, tries) - tries_so_far - 1, is_ht40); - } - if (sn->stats[size_bin][rix0].total_packets < ssc->smoothing_minpackets) { /* just average the first few packets */ int avg_tx = sn->stats[size_bin][rix0].average_tx_time; @@ -777,34 +954,9 @@ update_stats(struct ath_softc *sc, struct ath_node *an ((sn->stats[size_bin][rix0].average_tx_time * ssc->smoothing_rate) + (tt * (100 - ssc->smoothing_rate))) / 100; } - - /* - * XXX Don't mark the higher bit rates as also having failed; as this - * unfortunately stops those rates from being tasted when trying to - * TX. This happens with 11n aggregation. - * - * This is valid for higher CCK rates, higher OFDM rates, and higher - * HT rates within the current number of streams (eg MCS0..7, 8..15, - * etc.) - */ + if (nframes == nbad) { -#if 0 - int y; -#endif sn->stats[size_bin][rix0].successive_failures += nbad; -#if 0 - for (y = size_bin+1; y < NUM_PACKET_SIZE_BINS; y++) { - /* - * Also say larger packets failed since we - * assume if a small packet fails at a - * bit-rate then a larger one will also. - */ - sn->stats[y][rix0].successive_failures += nbad; - sn->stats[y][rix0].last_tx = ticks; - sn->stats[y][rix0].tries += tries; - sn->stats[y][rix0].total_packets += nframes; - } -#endif } else { sn->stats[size_bin][rix0].packets_acked += (nframes - nbad); sn->stats[size_bin][rix0].successive_failures = 0; @@ -833,20 +985,31 @@ update_stats(struct ath_softc *sc, struct ath_node *an (pct * (100 - ssc->smoothing_rate))) / 100; } + /* + * Only update the sample time for the initial sample rix. + * We've updated the statistics on each of the other retries + * fine, but we should only update the sample_tt with what + * was actually sampled. + * + * However, to aide in debugging, log all the failures for + * each of the buckets + */ + IEEE80211_NOTE(an->an_node.ni_vap, IEEE80211_MSG_RATECTL, + &an->an_node, + "%s: size %d %s %s rate %d %s tries (%d/%d) tt %d " + "avg_tt (%d/%d) nfrm %d nbad %d", + __func__, + size, + status ? "FAIL" : "OK", + rix0 == sn->current_sample_rix[size_bin] ? "sample" : "mrr", + dot11rate(rt, rix0), + dot11rate_label(rt, rix0), + short_tries, tries, tt, + sn->stats[size_bin][rix0].average_tx_time, + sn->stats[size_bin][rix0].perfect_tx_time, + nframes, nbad); if (rix0 == sn->current_sample_rix[size_bin]) { - IEEE80211_NOTE(an->an_node.ni_vap, IEEE80211_MSG_RATECTL, - &an->an_node, -"%s: size %d %s sample rate %d %s tries (%d/%d) tt %d avg_tt (%d/%d) nfrm %d nbad %d", - __func__, - size, - status ? "FAIL" : "OK", - dot11rate(rt, rix0), - dot11rate_label(rt, rix0), - short_tries, tries, tt, - sn->stats[size_bin][rix0].average_tx_time, - sn->stats[size_bin][rix0].perfect_tx_time, - nframes, nbad); sn->sample_tt[size_bin] = tt; sn->current_sample_rix[size_bin] = -1; } @@ -864,7 +1027,7 @@ badrate(struct ath_softc *sc, int series, int hwrate, void ath_rate_tx_complete(struct ath_softc *sc, struct ath_node *an, const struct ath_rc_series *rc, const struct ath_tx_status *ts, - int frame_size, int nframes, int nbad) + int frame_size, int rc_framesize, int nframes, int nbad) { struct ieee80211com *ic = &sc->sc_ic; struct sample_node *sn = ATH_NODE_SAMPLE(an); @@ -884,7 +1047,41 @@ ath_rate_tx_complete(struct ath_softc *sc, struct ath_ if (frame_size == 0) /* NB: should not happen */ frame_size = 1500; + if (rc_framesize == 0) /* NB: should not happen */ + rc_framesize = 1500; + /* + * There are still some places where what rate control set as + * a limit but the hardware decided, for some reason, to transmit + * at a smaller size that fell into a different bucket. + * + * The eternal question here is - which size_bin should it go in? + * The one that was requested, or the one that was transmitted? + * + * Here's the problem - if we use the one that was transmitted, + * we may continue to hit corner cases where we make a rate + * selection using a higher bin but only update the smaller bin; + * thus never really "adapting". + * + * If however we update the larger bin, we're not accurately + * representing the channel state at that frame/aggregate size. + * However if we keep hitting the larger request but completing + * a smaller size, we at least updates based on what the + * request was /for/. + * + * I'm going to err on the side of caution and choose the + * latter. + */ + if (size_to_bin(frame_size) != size_to_bin(rc_framesize)) { +#if 0 + device_printf(sc->sc_dev, + "%s: completed but frame size buckets mismatch " + "(completed %d tx'ed %d)\n", + __func__, frame_size, rc_framesize); +#endif + frame_size = rc_framesize; + } + if (sn->ratemask == 0) { IEEE80211_NOTE(an->an_node.ni_vap, IEEE80211_MSG_RATECTL, &an->an_node, @@ -921,9 +1118,6 @@ ath_rate_tx_complete(struct ath_softc *sc, struct ath_ short_tries, long_tries, nframes, nbad); update_stats(sc, an, frame_size, final_rix, long_tries, - 0, 0, - 0, 0, - 0, 0, short_tries, long_tries, status, nframes, nbad); @@ -962,18 +1156,13 @@ ath_rate_tx_complete(struct ath_softc *sc, struct ath_ } /* - * NB: series > 0 are not penalized for failure - * based on the try counts under the assumption - * that losses are often bursty and since we - * sample higher rates 1 try at a time doing so - * may unfairly penalize them. + * This used to not penalise other tries because loss + * can be bursty, but it's then not accurately keeping + * the avg TX time and EWMA updated. */ if (rc[0].tries) { update_stats(sc, an, frame_size, rc[0].rix, rc[0].tries, - rc[1].rix, rc[1].tries, - rc[2].rix, rc[2].tries, - rc[3].rix, rc[3].tries, short_tries, long_tries, long_tries > rc[0].tries, nframes, nbad); @@ -983,11 +1172,8 @@ ath_rate_tx_complete(struct ath_softc *sc, struct ath_ if (rc[1].tries && finalTSIdx > 0) { update_stats(sc, an, frame_size, rc[1].rix, rc[1].tries, - rc[2].rix, rc[2].tries, - rc[3].rix, rc[3].tries, - 0, 0, short_tries, long_tries, - status, + long_tries > rc[1].tries, nframes, nbad); long_tries -= rc[1].tries; } @@ -995,11 +1181,8 @@ ath_rate_tx_complete(struct ath_softc *sc, struct ath_ if (rc[2].tries && finalTSIdx > 1) { update_stats(sc, an, frame_size, rc[2].rix, rc[2].tries, - rc[3].rix, rc[3].tries, - 0, 0, - 0, 0, short_tries, long_tries, - status, + long_tries > rc[2].tries, nframes, nbad); long_tries -= rc[2].tries; } @@ -1007,11 +1190,8 @@ ath_rate_tx_complete(struct ath_softc *sc, struct ath_ if (rc[3].tries && finalTSIdx > 2) { update_stats(sc, an, frame_size, rc[3].rix, rc[3].tries, - 0, 0, - 0, 0, - 0, 0, short_tries, long_tries, - status, + long_tries > rc[3].tries, nframes, nbad); } } Modified: head/sys/dev/ath/ath_rate/sample/sample.h ============================================================================== --- head/sys/dev/ath/ath_rate/sample/sample.h Fri May 15 17:37:08 2020 (r361084) +++ head/sys/dev/ath/ath_rate/sample/sample.h Fri May 15 18:51:20 2020 (r361085) @@ -76,12 +76,11 @@ struct txschedule { }; /* - * for now, we track performance for three different packet - * size buckets + * We track performance for eight different packet size buckets. */ -#define NUM_PACKET_SIZE_BINS 2 +#define NUM_PACKET_SIZE_BINS 7 -static const int packet_size_bins[NUM_PACKET_SIZE_BINS] = { 250, 1600 }; +static const int packet_size_bins[NUM_PACKET_SIZE_BINS] = { 250, 1600, 4096, 8192, 16384, 32768, 65536 }; static inline int bin_to_size(int index) Modified: head/sys/dev/ath/if_ath.c ============================================================================== --- head/sys/dev/ath/if_ath.c Fri May 15 17:37:08 2020 (r361084) +++ head/sys/dev/ath/if_ath.c Fri May 15 18:51:20 2020 (r361085) @@ -4301,7 +4301,7 @@ ath_tx_default_comp(struct ath_softc *sc, struct ath_b void ath_tx_update_ratectrl(struct ath_softc *sc, struct ieee80211_node *ni, struct ath_rc_series *rc, struct ath_tx_status *ts, int frmlen, - int nframes, int nbad) + int rc_framelen, int nframes, int nbad) { struct ath_node *an; @@ -4317,9 +4317,11 @@ ath_tx_update_ratectrl(struct ath_softc *sc, struct ie * see about handling it (eg see how many attempts were * made before it got filtered and account for that.) */ + if ((ts->ts_status & HAL_TXERR_FILT) == 0) { ATH_NODE_LOCK(an); - ath_rate_tx_complete(sc, an, rc, ts, frmlen, nframes, nbad); + ath_rate_tx_complete(sc, an, rc, ts, frmlen, rc_framelen, + nframes, nbad); ATH_NODE_UNLOCK(an); } } @@ -4366,7 +4368,9 @@ ath_tx_process_buf_completion(struct ath_softc *sc, st */ ath_tx_update_ratectrl(sc, ni, bf->bf_state.bfs_rc, ts, - bf->bf_state.bfs_pktlen, 1, + bf->bf_state.bfs_pktlen, + bf->bf_state.bfs_pktlen, + 1, (ts->ts_status == 0 ? 0 : 1)); } ath_tx_default_comp(sc, bf, 0); Modified: head/sys/dev/ath/if_ath_misc.h ============================================================================== --- head/sys/dev/ath/if_ath_misc.h Fri May 15 17:37:08 2020 (r361084) +++ head/sys/dev/ath/if_ath_misc.h Fri May 15 18:51:20 2020 (r361085) @@ -63,7 +63,8 @@ extern void ath_tx_default_comp(struct ath_softc *sc, int fail); extern void ath_tx_update_ratectrl(struct ath_softc *sc, struct ieee80211_node *ni, struct ath_rc_series *rc, - struct ath_tx_status *ts, int frmlen, int nframes, int nbad); + struct ath_tx_status *ts, int frmlen, int rc_framelen, + int nframes, int nbad); extern int ath_hal_gethangstate(struct ath_hal *ah, uint32_t mask, uint32_t *hangs); Modified: head/sys/dev/ath/if_ath_tx.c ============================================================================== --- head/sys/dev/ath/if_ath_tx.c Fri May 15 17:37:08 2020 (r361084) +++ head/sys/dev/ath/if_ath_tx.c Fri May 15 18:51:20 2020 (r361085) @@ -363,7 +363,7 @@ ath_tx_dmasetup(struct ath_softc *sc, struct ath_buf * */ static void ath_tx_chaindesclist(struct ath_softc *sc, struct ath_desc *ds0, - struct ath_buf *bf, int is_aggr, int is_first_subframe, + struct ath_buf *bf, bool is_aggr, int is_first_subframe, int is_last_subframe) { struct ath_hal *ah = sc->sc_ah; @@ -1377,11 +1377,12 @@ ath_tx_setds(struct ath_softc *sc, struct ath_buf *bf) */ static void ath_tx_do_ratelookup(struct ath_softc *sc, struct ath_buf *bf, int tid, - bool is_aggr) + int pktlen, int is_aggr) { uint8_t rate, rix; int try0; int maxdur; // Note: Unused for now + int maxpktlen; if (! bf->bf_state.bfs_doratelookup) return; @@ -1391,7 +1392,7 @@ ath_tx_do_ratelookup(struct ath_softc *sc, struct ath_ ATH_NODE_LOCK(ATH_NODE(bf->bf_node)); ath_rate_findrate(sc, ATH_NODE(bf->bf_node), bf->bf_state.bfs_shpream, - bf->bf_state.bfs_pktlen, tid, is_aggr, &rix, &try0, &rate, &maxdur); + pktlen, tid, is_aggr, &rix, &try0, &rate, &maxdur, &maxpktlen); /* In case MRR is disabled, make sure rc[0] is setup correctly */ bf->bf_state.bfs_rc[0].rix = rix; @@ -1407,6 +1408,7 @@ ath_tx_do_ratelookup(struct ath_softc *sc, struct ath_ sc->sc_lastdatarix = rix; /* for fast frames */ bf->bf_state.bfs_try0 = try0; bf->bf_state.bfs_txrate0 = rate; + bf->bf_state.bfs_rc_maxpktlen = maxpktlen; } /* @@ -1521,7 +1523,7 @@ ath_tx_xmit_normal(struct ath_softc *sc, struct ath_tx bf->bf_state.bfs_txflags |= HAL_TXDESC_CLRDMASK; /* Setup the descriptor before handoff */ - ath_tx_do_ratelookup(sc, bf, tid->tid, false); + ath_tx_do_ratelookup(sc, bf, tid->tid, bf->bf_state.bfs_pktlen, false); ath_tx_calc_duration(sc, bf); ath_tx_calc_protection(sc, bf); ath_tx_set_rtscts(sc, bf); @@ -3096,7 +3098,8 @@ ath_tx_xmit_aggr(struct ath_softc *sc, struct ath_node ath_tx_update_clrdmask(sc, tid, bf); /* Direct dispatch to hardware */ - ath_tx_do_ratelookup(sc, bf, tid->tid, false); + ath_tx_do_ratelookup(sc, bf, tid->tid, bf->bf_state.bfs_pktlen, + false); ath_tx_calc_duration(sc, bf); ath_tx_calc_protection(sc, bf); ath_tx_set_rtscts(sc, bf); @@ -4259,7 +4262,9 @@ ath_tx_normal_comp(struct ath_softc *sc, struct ath_bu */ if (fail == 0 && ((bf->bf_state.bfs_txflags & HAL_TXDESC_NOACK) == 0)) ath_tx_update_ratectrl(sc, ni, bf->bf_state.bfs_rc, - ts, bf->bf_state.bfs_pktlen, + ts, + bf->bf_state.bfs_pktlen, + bf->bf_state.bfs_pktlen, 1, (ts->ts_status == 0) ? 0 : 1); ath_tx_default_comp(sc, bf, fail); @@ -4688,15 +4693,11 @@ ath_tx_comp_aggr_error(struct ath_softc *sc, struct at /* * Update rate control - all frames have failed. - * - * XXX use the length in the first frame in the series; - * XXX just so things are consistent for now. - * - * XXX TODO: need to signal this is a large frame no matter what... */ ath_tx_update_ratectrl(sc, ni, bf_first->bf_state.bfs_rc, &bf_first->bf_status.ds_txstat, - bf_first->bf_state.bfs_pktlen, + bf_first->bf_state.bfs_al, + bf_first->bf_state.bfs_rc_maxpktlen, bf_first->bf_state.bfs_nframes, bf_first->bf_state.bfs_nframes); ATH_TX_LOCK(sc); @@ -4845,6 +4846,7 @@ ath_tx_aggr_comp_aggr(struct ath_softc *sc, struct ath int drops = 0; int nframes = 0, nbad = 0, nf; int pktlen; + int agglen, rc_agglen; /* XXX there's too much on the stack? */ struct ath_rc_series rc[ATH_RC_NUM]; int txseq; @@ -4857,6 +4859,8 @@ ath_tx_aggr_comp_aggr(struct ath_softc *sc, struct ath * has been completed and freed. */ ts = bf_first->bf_status.ds_txstat; + agglen = bf_first->bf_state.bfs_al; + rc_agglen = bf_first->bf_state.bfs_rc_maxpktlen; TAILQ_INIT(&bf_q); TAILQ_INIT(&bf_cq); @@ -5093,9 +5097,8 @@ ath_tx_aggr_comp_aggr(struct ath_softc *sc, struct ath * control code. */ if (fail == 0) { - /* XXX TODO: what's pktlen here? */ - ath_tx_update_ratectrl(sc, ni, rc, &ts, pktlen, nframes, - nbad); + ath_tx_update_ratectrl(sc, ni, rc, &ts, agglen, rc_agglen, + nframes, nbad); } /* @@ -5187,6 +5190,7 @@ ath_tx_aggr_comp_unaggr(struct ath_softc *sc, struct a ath_tx_update_ratectrl(sc, ni, bf->bf_state.bfs_rc, &bf->bf_status.ds_txstat, bf->bf_state.bfs_pktlen, + bf->bf_state.bfs_pktlen, 1, (ts.ts_status == 0) ? 0 : 1); /* @@ -5359,6 +5363,66 @@ ath_tx_aggr_comp(struct ath_softc *sc, struct ath_buf } /* + * Grab the software queue depth that we COULD transmit. + * + * This includes checks if it's in the BAW, whether it's a frame + * that is supposed to be in the BAW. Other checks could be done; + * but for now let's try and avoid doing the whole of ath_tx_form_aggr() + * here. + */ +static int +ath_tx_tid_swq_depth_bytes(struct ath_softc *sc, struct ath_node *an, + struct ath_tid *tid) +{ + struct ath_buf *bf; + struct ieee80211_tx_ampdu *tap; + int nbytes = 0; + + ATH_TX_LOCK_ASSERT(sc); + + tap = ath_tx_get_tx_tid(an, tid->tid); + + /* + * Iterate over each buffer and sum the pkt_len. + * Bail if we exceed ATH_AGGR_MAXSIZE bytes; we won't + * ever queue more than that in a single frame. + */ + TAILQ_FOREACH(bf, &tid->tid_q, bf_list) { + + /* + * TODO: I'm not sure if we're going to hit cases where + * no frames get sent because the list is empty. + */ + + /* Check if it's in the BAW */ + if (tap != NULL && (! BAW_WITHIN(tap->txa_start, tap->txa_wnd, + SEQNO(bf->bf_state.bfs_seqno)))) { + break; + } + + /* Check if it's even supposed to be in the BAW */ + if (! bf->bf_state.bfs_dobaw) { + break; + } + + nbytes += bf->bf_state.bfs_pktlen; + if (nbytes >= ATH_AGGR_MAXSIZE) + break; + + /* + * Check if we're likely going to leak a frame + * as part of a PSPOLL. Break out at this point; + * we're only going to send a single frame anyway. *** DIFF OUTPUT TRUNCATED AT 1000 LINES ***