From owner-freebsd-bugs@FreeBSD.ORG Thu Mar 8 23:40:10 2012
Return-Path:
Delivered-To: freebsd-bugs@hub.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
    by hub.freebsd.org (Postfix) with ESMTP id A10091065673
    for ; Thu, 8 Mar 2012 23:40:10 +0000 (UTC)
    (envelope-from gnats@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28])
    by mx1.freebsd.org (Postfix) with ESMTP id 739408FC13
    for ; Thu, 8 Mar 2012 23:40:10 +0000 (UTC)
Received: from freefall.freebsd.org (localhost [127.0.0.1])
    by freefall.freebsd.org (8.14.5/8.14.5) with ESMTP id q28NeArT039895
    for ; Thu, 8 Mar 2012 23:40:10 GMT
    (envelope-from gnats@freefall.freebsd.org)
Received: (from gnats@localhost)
    by freefall.freebsd.org (8.14.5/8.14.5/Submit) id q28NeAnM039894;
    Thu, 8 Mar 2012 23:40:10 GMT
    (envelope-from gnats)
Resent-Date: Thu, 8 Mar 2012 23:40:10 GMT
Resent-Message-Id: <201203082340.q28NeAnM039894@freefall.freebsd.org>
Resent-From: FreeBSD-gnats-submit@FreeBSD.org (GNATS Filer)
Resent-To: freebsd-bugs@FreeBSD.org
Resent-Reply-To: FreeBSD-gnats-submit@FreeBSD.org, Adrian Chadd
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
    by hub.freebsd.org (Postfix) with ESMTP id C2D33106566C
    for ; Thu, 8 Mar 2012 23:32:58 +0000 (UTC)
    (envelope-from nobody@FreeBSD.org)
Received: from red.freebsd.org (red.freebsd.org [IPv6:2001:4f8:fff6::22])
    by mx1.freebsd.org (Postfix) with ESMTP id 98D6E8FC0A
    for ; Thu, 8 Mar 2012 23:32:58 +0000 (UTC)
Received: from red.freebsd.org (localhost [127.0.0.1])
    by red.freebsd.org (8.14.4/8.14.4) with ESMTP id q28NWwFl050069
    for ; Thu, 8 Mar 2012 23:32:58 GMT
    (envelope-from nobody@red.freebsd.org)
Received: (from nobody@localhost)
    by red.freebsd.org (8.14.4/8.14.4/Submit) id q28NWw6E050059;
    Thu, 8 Mar 2012 23:32:58 GMT
    (envelope-from nobody)
Message-Id: <201203082332.q28NWw6E050059@red.freebsd.org>
Date: Thu, 8 Mar 2012 23:32:58 GMT
From: Adrian Chadd
To: freebsd-gnats-submit@FreeBSD.org
X-Send-Pr-Version: www-3.1
Cc:
Subject: kern/165866: [ath] TX hangs, requiring a "scan" to properly reset the interface
X-BeenThere: freebsd-bugs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Bug reports
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 08 Mar 2012 23:40:10 -0000

>Number:         165866
>Category:       kern
>Synopsis:       [ath] TX hangs, requiring a "scan" to properly reset the interface
>Confidential:   no
>Severity:       non-critical
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          open
>Quarter:
>Keywords:
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Thu Mar 08 23:40:10 UTC 2012
>Closed-Date:
>Last-Modified:
>Originator:     Adrian Chadd
>Release:        FreeBSD-HEAD
>Organization:
>Environment:
FreeBSD home-11bg-ap 10.0-CURRENT FreeBSD 10.0-CURRENT #18 r232400:232625M: Wed Dec 31 16:00:00 PST 1969 adrian@dummy:/home/adrian/work/freebsd/svn/obj/mipseb/mips.mipseb/usr/home/adrian/work/freebsd/svn/src/sys/TP-WN1043ND mips
>Description:
I've been seeing TX hangs during my tests. Investigating showed that the TX queue would grow and busy buffers would stay busy.

E.g., from sysctl dev.ath.0.txagg=1:

HW TXQ 0: axq_depth=0, axq_aggr_depth=0
HW TXQ 1: axq_depth=184, axq_aggr_depth=0
HW TXQ 2: axq_depth=0, axq_aggr_depth=0
HW TXQ 3: axq_depth=0, axq_aggr_depth=0
HW TXQ 8: axq_depth=1, axq_aggr_depth=0
Busy: 14
Total TX buffers: 15; Total TX buffers busy: 1

This occurred even with a completely idle access point that only responded to probe requests, i.e., with no active associations.

The only way to flush things was a 'scan' - this forcibly flushes the TX queue, and pending frames are either handled or deleted.

I then flipped on reset debugging (sysctl dev.ath.0.debug=0x20) and forced a scan whenever I saw this occur. I also dumped the relevant registers when this occurred.

I found that the TXDP for this queue was completely in the wrong place.
I also found that the TX descriptor list made no sense - there were incomplete and complete descriptor lists in the same TX queue, as well as NULL link pointers half way through the list.

So, I figured something is splicing the list together incorrectly.
>How-To-Repeat:
This kernel was compiled with TDMA support, so the ATH_BUF_BUSY flag would be set.

* set it up on a 2.4GHz channel;
* make sure there's lots of STAs and APs around;
* notice the high level of probe request traffic;
* .. wait.
>Fix:
This particular patch seems to quieten down the issues. I'm going to run this a bit more and see what happens.

Index: if_ath_tx.c
===================================================================
--- if_ath_tx.c	(revision 232400)
+++ if_ath_tx.c	(working copy)
@@ -623,19 +623,22 @@
 ath_txq_restart_dma(struct ath_softc *sc, struct ath_txq *txq)
 {
 	struct ath_hal *ah = sc->sc_ah;
-	struct ath_buf *bf;
+	struct ath_buf *bf, *bf_last;
 
 	ATH_TXQ_LOCK_ASSERT(txq);
 
 	/* This is always going to be cleared, empty or not */
 	txq->axq_flags &= ~ATH_TXQ_PUTPENDING;
 
+	/* XXX make this ATH_TXQ_FIRST */
 	bf = TAILQ_FIRST(&txq->axq_q);
+	bf_last = ATH_TXQ_LAST(txq, axq_q_s);
+
 	if (bf == NULL)
 		return;
 
 	ath_hal_puttxbuf(ah, txq->axq_qnum, bf->bf_daddr);
-	txq->axq_link = &bf->bf_lastds->ds_link;
+	txq->axq_link = &bf_last->bf_lastds->ds_link;
 	ath_hal_txstart(ah, txq->axq_qnum);
 }

>Release-Note:
>Audit-Trail:
>Unformatted:
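To illustrate why the link pointer has to come from the last buffer rather than the first, here is a minimal userland sketch. It is not driver code: the demo_* types, the made-up bus addresses, and the append_buf() helper are all hypothetical stand-ins, and it assumes a simplified model in which appending a buffer writes its address through axq_link and then advances axq_link to the new buffer's last descriptor. Only the TAILQ macros from <sys/queue.h> are real.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/queue.h>

/* Hypothetical stand-ins for the driver's descriptor and buffer types. */
struct demo_desc {
	uint32_t ds_link;	/* address of the next descriptor; 0 ends the chain */
};

struct demo_buf {
	TAILQ_ENTRY(demo_buf) bf_list;
	struct demo_desc *bf_lastds;	/* last descriptor owned by this buffer */
	uint32_t bf_daddr;		/* made-up bus address of the buffer */
};

TAILQ_HEAD(demo_bufq, demo_buf);

struct demo_txq {
	struct demo_bufq axq_q;
	uint32_t *axq_link;	/* where the next appended buffer is linked in */
};

/* Pre-patch behaviour: take the link pointer from the first buffer. */
static uint32_t *
link_from_first(struct demo_txq *txq)
{
	struct demo_buf *bf = TAILQ_FIRST(&txq->axq_q);

	return (bf == NULL ? NULL : &bf->bf_lastds->ds_link);
}

/* Post-patch behaviour: take the link pointer from the last buffer. */
static uint32_t *
link_from_last(struct demo_txq *txq)
{
	struct demo_buf *bf_last = TAILQ_LAST(&txq->axq_q, demo_bufq);

	return (bf_last == NULL ? NULL : &bf_last->bf_lastds->ds_link);
}

/* Assumed append model: patch the previous link, then advance axq_link. */
static void
append_buf(struct demo_txq *txq, struct demo_buf *bf)
{
	assert(bf->bf_lastds != NULL);
	TAILQ_INSERT_TAIL(&txq->axq_q, bf, bf_list);
	if (txq->axq_link != NULL)
		*txq->axq_link = bf->bf_daddr;
	txq->axq_link = &bf->bf_lastds->ds_link;
}

/*
 * Queue three buffers, recompute axq_link as a "restart" would, append a
 * fourth buffer, and report whether the chain 1 -> 2 -> 3 -> 4 survived.
 */
static int
chain_intact_after_restart(int use_last)
{
	struct demo_desc ds[4] = {{0}};
	struct demo_buf bufs[4];
	struct demo_txq txq;
	int i;

	TAILQ_INIT(&txq.axq_q);
	txq.axq_link = NULL;
	for (i = 0; i < 4; i++) {
		bufs[i].bf_lastds = &ds[i];
		bufs[i].bf_daddr = 0x1000u * (uint32_t)(i + 1);
	}
	for (i = 0; i < 3; i++)
		append_buf(&txq, &bufs[i]);

	txq.axq_link = use_last ? link_from_last(&txq) : link_from_first(&txq);
	append_buf(&txq, &bufs[3]);

	return (ds[0].ds_link == 0x2000 && ds[1].ds_link == 0x3000 &&
	    ds[2].ds_link == 0x4000 && ds[3].ds_link == 0);
}
```

In this model, chain_intact_after_restart(0) overwrites the first buffer's link so that it points straight at the new buffer, leaving the third buffer's link NULL in the middle of the queue, which resembles the NULL link pointers described above. chain_intact_after_restart(1) keeps the chain intact.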