From owner-svn-src-head@FreeBSD.ORG Mon Apr 21 01:02:50 2014 Return-Path: Delivered-To: svn-src-head@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 33A3A821; Mon, 21 Apr 2014 01:02:50 +0000 (UTC) Received: from svn.freebsd.org (svn.freebsd.org [IPv6:2001:1900:2254:2068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 1F9C21785; Mon, 21 Apr 2014 01:02:50 +0000 (UTC) Received: from svn.freebsd.org ([127.0.1.70]) by svn.freebsd.org (8.14.8/8.14.8) with ESMTP id s3L12n3h039425; Mon, 21 Apr 2014 01:02:49 GMT (envelope-from adrian@svn.freebsd.org) Received: (from adrian@localhost) by svn.freebsd.org (8.14.8/8.14.8/Submit) id s3L12nWm039424; Mon, 21 Apr 2014 01:02:49 GMT (envelope-from adrian@svn.freebsd.org) Message-Id: <201404210102.s3L12nWm039424@svn.freebsd.org> From: Adrian Chadd Date: Mon, 21 Apr 2014 01:02:49 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org Subject: svn commit: r264708 - head/sys/dev/ath X-SVN-Group: head MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-head@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: SVN commit messages for the src tree for head/-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 21 Apr 2014 01:02:50 -0000 Author: adrian Date: Mon Apr 21 01:02:49 2014 New Revision: 264708 URL: http://svnweb.freebsd.org/changeset/base/264708 Log: Fix a cleanup hang if cleanup gets called _during_ an active cleanup. During power save testing I noticed that the cleanup code is being called during a RUN->RUN state transition. It's because the net80211 stack is treating that (for reasons I don't quitey know yet) as a reassociation and this calls the node cleanup code. The reason it's seeing a RUN->RUN transition is because during active power save stuff it's possible that the RUN->SLEEP and SLEEP->RUN transitions happen so quickly that the deferred net80211 vap state code "loses" a transition, namely the intermediary SLEEP transition. So, this was causing the node reassociation code to sometimes be called twice in quick succession and this would result in ath_tx_tid_cleanup() to be called again. The code calling it would always call pause, and then only call resume if the TID didn't have "cleanup_inprogress" set. Unfortunately it didn't check if it was already set on entry, so it would pause but not call resume. Thus, paused would be called more than once (once before each entry into ath-tx_tid_cleanup()) but resume would only be called once when the cleanup state was finished. This doesn't entirely fix all of the issues seen in the cleanup path but it's a necessary first step. Since this is a stability fix, it should be merged to stable/10 at some point. Tested: * AR5416, STA mode MFC after: 7 days Modified: head/sys/dev/ath/if_ath_tx.c Modified: head/sys/dev/ath/if_ath_tx.c ============================================================================== --- head/sys/dev/ath/if_ath_tx.c Sun Apr 20 23:01:56 2014 (r264707) +++ head/sys/dev/ath/if_ath_tx.c Mon Apr 21 01:02:49 2014 (r264708) @@ -5789,12 +5789,26 @@ ath_addba_stop(struct ieee80211_node *ni */ TAILQ_INIT(&bf_cq); ATH_TX_LOCK(sc); - ath_tx_tid_cleanup(sc, an, tid, &bf_cq); + /* - * Unpause the TID if no cleanup is required. + * In case there's a followup call to this, only call it + * if we don't have a cleanup in progress. + * + * Since we've paused the queue above, we need to make + * sure we unpause if there's already a cleanup in + * progress - it means something else is also doing + * this stuff, so we don't need to also keep it paused. */ - if (! atid->cleanup_inprogress) + if (atid->cleanup_inprogress) { ath_tx_tid_resume(sc, atid); + } else { + ath_tx_tid_cleanup(sc, an, tid, &bf_cq); + /* + * Unpause the TID if no cleanup is required. + */ + if (! atid->cleanup_inprogress) + ath_tx_tid_resume(sc, atid); + } ATH_TX_UNLOCK(sc); /* Handle completing frames and fail them */ @@ -5828,19 +5842,25 @@ ath_tx_node_reassoc(struct ath_softc *sc tid = &an->an_tid[i]; if (tid->hwq_depth == 0) continue; - ath_tx_tid_pause(sc, tid); DPRINTF(sc, ATH_DEBUG_NODE, "%s: %6D: TID %d: cleaning up TID\n", __func__, an->an_node.ni_macaddr, ":", i); - ath_tx_tid_cleanup(sc, an, i, &bf_cq); /* - * Unpause the TID if no cleanup is required. + * In case there's a followup call to this, only call it + * if we don't have a cleanup in progress. */ - if (! tid->cleanup_inprogress) - ath_tx_tid_resume(sc, tid); + if (! tid->cleanup_inprogress) { + ath_tx_tid_pause(sc, tid); + ath_tx_tid_cleanup(sc, an, i, &bf_cq); + /* + * Unpause the TID if no cleanup is required. + */ + if (! tid->cleanup_inprogress) + ath_tx_tid_resume(sc, tid); + } } ATH_TX_UNLOCK(sc);