From owner-freebsd-wireless@FreeBSD.ORG Mon Mar 12 21:06:57 2012
Delivered-To: freebsd-wireless@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
    by hub.freebsd.org (Postfix) with ESMTP id 6F04C106566C;
    Mon, 12 Mar 2012 21:06:57 +0000 (UTC)
    (envelope-from vince@unsane.co.uk)
Received: from unsane.co.uk (unsane-pt.tunnel.tserv5.lon1.ipv6.he.net [IPv6:2001:470:1f08:110::2])
    by mx1.freebsd.org (Postfix) with ESMTP id D25988FC12;
    Mon, 12 Mar 2012 21:06:56 +0000 (UTC)
Received: from badger.unsane.co.uk (badger.unsane.co.uk [85.233.185.165])
    (authenticated bits=0)
    by unsane.co.uk (8.14.5/8.14.5) with ESMTP id q2CL6tAM029817
    (version=TLSv1/SSLv3 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO);
    Mon, 12 Mar 2012 21:06:55 GMT
    (envelope-from vince@unsane.co.uk)
Message-ID: <4F5E656F.4040004@unsane.co.uk>
Date: Mon, 12 Mar 2012 21:06:55 +0000
From: Vincent Hoffman
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:10.0.2) Gecko/20120216 Thunderbird/10.0.2
MIME-Version: 1.0
To: Adrian Chadd
References: <4F59DD98.8080905@unsane.co.uk> <4F5AA149.8000904@unsane.co.uk>
    <4F5BDF3C.8070605@unsane.co.uk> <4F5C0302.8090403@unsane.co.uk>
    <4F5CA45C.1010603@unsane.co.uk>
In-Reply-To: <4F5CA45C.1010603@unsane.co.uk>
X-Enigmail-Version: 1.3.5
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Cc: freebsd-wireless@freebsd.org
Subject: Re: ath0 timeout was "Re: (more) bugs fixed in -HEAD, AP mode is now mostly (again) stable!"
X-BeenThere: freebsd-wireless@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Discussions of 802.11 stack, tools device driver development."
X-List-Received-Date: Mon, 12 Mar 2012 21:06:57 -0000

On 11/03/2012 13:10, Vincent Hoffman wrote:
> On 11/03/2012 01:58, Adrian Chadd wrote:
>> Hiya,
>>
>> Next time it happens, do the sysctl before the scan.

I can force a hang by running iperf for a minute or 2.
So to be certain, here it is again:

-----
ath0: device timeout
ar5212StopDmaReceive: dma failed to stop in 10ms AR_CR=0x00000024 AR_DIAG_SW=0x42000020
no tx bufs (empty list): 13422
no tx bufs (was busy): 0
aggr single packet: 40010
aggr single packet w/ BAW closed: 42
aggr non-baw packet: 70
aggr aggregate packet: 57698
aggr single packet low hwq: 507508
aggr sched, no work: 1481
 0: 0       1: 0       2: 28283   3: 13179
 4: 5948    5: 2900    6: 1846    7: 1354
 8: 1238    9: 734    10: 464    11: 371
12: 275    13: 225    14: 205    15: 147
16: 130    17: 70     18: 101    19: 62
20: 27     21: 18     22: 20     23: 14
24: 6      25: 8      26: 9      27: 6
28: 6      29: 6      30: 12     31: 8
32: 26     33: 0      34: 0      35: 0
36: 0      37: 0      38: 0      39: 0
40: 0      41: 0      42: 0      43: 0
44: 0      45: 0      46: 0      47: 0
48: 0      49: 0      50: 0      51: 0
52: 0      53: 0      54: 0      55: 0
56: 0      57: 0      58: 0      59: 0
60: 0      61: 0      62: 0      63: 0
HW TXQ 0: axq_depth=0, axq_aggr_depth=0
HW TXQ 1: axq_depth=0, axq_aggr_depth=0
HW TXQ 2: axq_depth=0, axq_aggr_depth=0
HW TXQ 3: axq_depth=0, axq_aggr_depth=0
HW TXQ 8: axq_depth=0, axq_aggr_depth=0
Total TX buffers: 268; Total TX buffers busy: 0
-----

Output of sysctl dev.ath.0.stats | grep -v ': 0$' while hung:
[root@ostracod ~/ath-debugging/12-03-2012-43]# sysctl dev.ath.0.stats | grep -v ': 0$'
dev.ath.0.stats.ast_watchdog: 46
dev.ath.0.stats.ast_bmiss: 339
dev.ath.0.stats.ast_bmiss_phantom: 257
dev.ath.0.stats.ast_mib: 1235358
dev.ath.0.stats.ast_tx_qstop: 16960
dev.ath.0.stats.ast_tx_xretries: 18
dev.ath.0.stats.ast_tx_longretry: 102841
dev.ath.0.stats.ast_tx_shortpre: 842011
dev.ath.0.stats.ast_tx_altrate: 3385
dev.ath.0.stats.ast_rx_crcerr: 259245
dev.ath.0.stats.ast_rx_badcrypt: 4
dev.ath.0.stats.ast_rx_phyerr: 5
dev.ath.0.stats.ast_per_cal: 5278
dev.ath.0.stats.ast_tx_raw: 446
dev.ath.0.stats.ast_tx_nobuf: 8
dev.ath.0.stats.ast_tx_raw_fail: 8
dev.ath.0.stats.ast_ani_cal: 1583194
dev.ath.0.stats.ast_rx_agg: 876080
dev.ath.0.stats.ast_rx_halfgi: 5
dev.ath.0.stats.ast_rx_2040: 7
dev.ath.0.stats.ast_rx_pre_crc_err: 14996
dev.ath.0.stats.ast_rx_post_crc_err: 951
dev.ath.0.stats.ast_tx_swretries: 2775
dev.ath.0.stats.ast_tx_aggr_ok: 239745
dev.ath.0.stats.ast_tx_aggr_fail: 2757
dev.ath.0.stats.ast_rx_intr: 4500090
dev.ath.0.stats.ast_tx_intr: 1115914
dev.ath.0.stats.rx_phy_err.6: 5

>> The sysctl will tell me how deep each hardware TX queue is.
>>
>> I should likely add some further debugging to tell me how deep the
>> per-TID software queues are; that'd be helpful here.
>>
>> What you're seeing there is something weird which is causing the TX
>> frames to be queued in software/hardware and not be transmitted, to
>> the point of buffer exhaustion. See "total TX buffers: 0" ? That means
>> the frames can't go out for some reason. There's nothing in the
>> hardware queue, so that also has me slightly concerned.
>>
>> I wonder if this is a problem with aggregation and buffer exhaustion.
>> Hm, can you do "wlandebug +11n" and see if it's trying to exchange
>> ADDBA frames (and failing) ?
>> There's a known bug where

Good guess.

wlan0: link state changed to DOWN
wlan0: [e0:91:f5:48:5b:b9] switch station to HT20 channel 2432/0x10480
wlan0: link state changed to UP
wlan0: [e0:91:f5:48:5b:b9] recv ADDBA request: dialogtoken 1 baparamset 0x1002 (tid 0 bufsiz 64) batimeout 0 baseqctl 0:0
wlan0: [e0:91:f5:48:5b:b9] send ADDBA response: dialogtoken 1 status 0 baparamset 0x1002 (tid 0) batimeout 0x0 baseqctl 0x0
wlan0: [e0:91:f5:48:5b:b9] discard MPDU frame, BA win <497:560> (0 frames) rxseq 496 tid 0 (retransmit)

Anything else I can give that's useful?
The CPU is a dual-core Atom with 4G of RAM, running a pretty much up-to-date -HEAD amd64.

Thanks for looking at this.

Vince

>> Thanks,
>>
>>
>> Adrian
> _______________________________________________
> freebsd-wireless@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-wireless
> To unsubscribe, send any mail to "freebsd-wireless-unsubscribe@freebsd.org"