Date: Mon, 12 Mar 2012 21:06:55 +0000
From: Vincent Hoffman <vince@unsane.co.uk>
To: Adrian Chadd <adrian@freebsd.org>
Cc: freebsd-wireless@freebsd.org
Subject: Re: ath0 timeout was "Re: (more) bugs fixed in -HEAD, AP mode is now mostly (again) stable!"
Message-ID: <4F5E656F.4040004@unsane.co.uk>
In-Reply-To: <4F5CA45C.1010603@unsane.co.uk>
References: <CAJ-VmokYNFnNrWxk=Sg%2BhRuOhkGj5%2Bi7TGB3ni_YBT9=pjs8AQ@mail.gmail.com>
 <4F59DD98.8080905@unsane.co.uk>
 <CAJ-Vmokurdn-FGfdFuuN84a9==fdoYjAPBOd4icT-eBJ2BuGpg@mail.gmail.com>
 <4F5AA149.8000904@unsane.co.uk>
 <CAJ-VmommaSh3Y=huxpfHRbVb0j3HGXTfDNi_OHJ5Tz8_AHqCSQ@mail.gmail.com>
 <4F5BDF3C.8070605@unsane.co.uk>
 <CAJ-VmomFfAXncDp48LYQvRTL5-HG4GpnDkkAy71ReTAFRyK41A@mail.gmail.com>
 <4F5C0302.8090403@unsane.co.uk>
 <CAJ-Vmo=HpBYR6ci-dyJWJK9_OkUVC2J_bj7YknNvKwR8to0q8w@mail.gmail.com>
 <4F5CA45C.1010603@unsane.co.uk>
On 11/03/2012 13:10, Vincent Hoffman wrote:
> On 11/03/2012 01:58, Adrian Chadd wrote:
>> Hiya,
>>
>> Next time it happens, do the sysctl before the scan.

I can force a hang by running iperf for a minute or two, so to be
certain, here it is again:

-----
ath0: device timeout
ar5212StopDmaReceive: dma failed to stop in 10ms
AR_CR=0x00000024 AR_DIAG_SW=0x42000020

no tx bufs (empty list): 13422
no tx bufs (was busy): 0
aggr single packet: 40010
aggr single packet w/ BAW closed: 42
aggr non-baw packet: 70
aggr aggregate packet: 57698
aggr single packet low hwq: 507508
aggr sched, no work: 1481

 0:      0   1:      0   2:  28283   3:  13179
 4:   5948   5:   2900   6:   1846   7:   1354
 8:   1238   9:    734  10:    464  11:    371
12:    275  13:    225  14:    205  15:    147
16:    130  17:     70  18:    101  19:     62
20:     27  21:     18  22:     20  23:     14
24:      6  25:      8  26:      9  27:      6
28:      6  29:      6  30:     12  31:      8
32:     26  33:      0  34:      0  35:      0
36:      0  37:      0  38:      0  39:      0
40:      0  41:      0  42:      0  43:      0
44:      0  45:      0  46:      0  47:      0
48:      0  49:      0  50:      0  51:      0
52:      0  53:      0  54:      0  55:      0
56:      0  57:      0  58:      0  59:      0
60:      0  61:      0  62:      0  63:      0

HW TXQ 0: axq_depth=0, axq_aggr_depth=0
HW TXQ 1: axq_depth=0, axq_aggr_depth=0
HW TXQ 2: axq_depth=0, axq_aggr_depth=0
HW TXQ 3: axq_depth=0, axq_aggr_depth=0
HW TXQ 8: axq_depth=0, axq_aggr_depth=0
Total TX buffers: 268; Total TX buffers busy: 0
-----

Output of sysctl dev.ath.0.stats | grep -v ': 0$' while hung:
[root@ostracod ~/ath-debugging/12-03-2012-43]# sysctl dev.ath.0.stats | grep -v ': 0$'
dev.ath.0.stats.ast_watchdog: 46
dev.ath.0.stats.ast_bmiss: 339
dev.ath.0.stats.ast_bmiss_phantom: 257
dev.ath.0.stats.ast_mib: 1235358
dev.ath.0.stats.ast_tx_qstop: 16960
dev.ath.0.stats.ast_tx_xretries: 18
dev.ath.0.stats.ast_tx_longretry: 102841
dev.ath.0.stats.ast_tx_shortpre: 842011
dev.ath.0.stats.ast_tx_altrate: 3385
dev.ath.0.stats.ast_rx_crcerr: 259245
dev.ath.0.stats.ast_rx_badcrypt: 4
dev.ath.0.stats.ast_rx_phyerr: 5
dev.ath.0.stats.ast_per_cal: 5278
dev.ath.0.stats.ast_tx_raw: 446
dev.ath.0.stats.ast_tx_nobuf: 8
dev.ath.0.stats.ast_tx_raw_fail: 8
dev.ath.0.stats.ast_ani_cal: 1583194
dev.ath.0.stats.ast_rx_agg: 876080
dev.ath.0.stats.ast_rx_halfgi: 5
dev.ath.0.stats.ast_rx_2040: 7
dev.ath.0.stats.ast_rx_pre_crc_err: 14996
dev.ath.0.stats.ast_rx_post_crc_err: 951
dev.ath.0.stats.ast_tx_swretries: 2775
dev.ath.0.stats.ast_tx_aggr_ok: 239745
dev.ath.0.stats.ast_tx_aggr_fail: 2757
dev.ath.0.stats.ast_rx_intr: 4500090
dev.ath.0.stats.ast_tx_intr: 1115914
dev.ath.0.stats.rx_phy_err.6: 5

>> The sysctl will tell me how deep each hardware TX queue is.
>>
>> I should likely add some further debugging to tell me how deep the
>> per-TID software queues are; that'd be helpful here.
>>
>> What you're seeing there is something weird which is causing the TX
>> frames to be queued in software/hardware and not be transmitted, to
>> the point of buffer exhaustion. See "total TX buffers: 0" ? That means
>> the frames can't go out for some reason. There's nothing in the
>> hardware queue, so that also has me slightly concerned.
>>
>> I wonder if this is a problem with aggregation and buffer exhaustion.
>> Hm, can you do "wlandebug +11n" and see if it's trying to exchange
>> ADDBA frames (and failing) ?
There's a known bug where

Good guess

wlan0: link state changed to DOWN
wlan0: [e0:91:f5:48:5b:b9] switch station to HT20 channel 2432/0x10480
wlan0: link state changed to UP
wlan0: [e0:91:f5:48:5b:b9] recv ADDBA request: dialogtoken 1 baparamset 0x1002 (tid 0 bufsiz 64) batimeout 0 baseqctl 0:0
wlan0: [e0:91:f5:48:5b:b9] send ADDBA response: dialogtoken 1 status 0 baparamset 0x1002 (tid 0) batimeout 0x0 baseqctl 0x0
wlan0: [e0:91:f5:48:5b:b9] discard MPDU frame, BA win <497:560> (0 frames) rxseq 496 tid 0 (retransmit)

Anything else I can give that's useful?
The CPU is a dual-core Atom with 4G RAM, running a pretty much up to date
-HEAD amd64.

Thanks for looking at this.

Vince

>> Thanks,
>>
>>
>> Adrian
> _______________________________________________
> freebsd-wireless@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-wireless
> To unsubscribe, send any mail to "freebsd-wireless-unsubscribe@freebsd.org"
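
For anyone reading the wlandebug lines above, here is a minimal Python sketch of how the numbers in them relate. It assumes the standard 802.11 Block Ack Parameter Set layout (bit 0 A-MSDU, bit 1 BA policy, bits 2-5 TID, bits 6-15 buffer size) and the usual modulo-4096 sequence comparison. The function names decode_baparamset and classify_rxseq are made up for illustration and are not the driver's or net80211's code.

```python
SEQ_RANGE = 4096           # 802.11 sequence numbers are 12 bits wide
BA_RANGE = SEQ_RANGE // 2  # assumed "ahead vs. behind" split point


def decode_baparamset(value):
    """Split a 16-bit Block Ack Parameter Set into its fields (hypothetical helper)."""
    return {
        "amsdu": bool(value & 0x0001),      # bit 0: A-MSDU supported
        "immediate": bool(value & 0x0002),  # bit 1: BA policy (1 = immediate)
        "tid": (value & 0x003C) >> 2,       # bits 2-5: traffic identifier
        "bufsiz": (value & 0xFFC0) >> 6,    # bits 6-15: reorder buffer size
    }


def classify_rxseq(rxseq, win_start, win_size):
    """Place a received sequence number relative to the BA window (hypothetical helper)."""
    off = (rxseq - win_start) % SEQ_RANGE
    if off < win_size:
        return "in window"
    if off < BA_RANGE:
        return "ahead of window"
    return "behind window"  # old or duplicate frame, e.g. a retransmit


params = decode_baparamset(0x1002)
print(params)
# Window <497:560> from the log, rxseq 496 is one slot before its start.
print(classify_rxseq(496, 497, params["bufsiz"]))
```

Under those assumptions, 0x1002 decodes to tid 0 and bufsiz 64, matching what the log prints, and rxseq 496 falls just behind the <497:560> window, which is consistent with the frame being discarded as a retransmit.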