From owner-freebsd-bugs@FreeBSD.ORG Sun Apr 13 23:30:00 2014
Delivered-To: freebsd-bugs@smarthost.ysv.freebsd.org
Resent-Date: Sun, 13 Apr 2014 23:30:00 GMT
Resent-Message-Id: <201404132330.s3DNU001065782@freefall.freebsd.org>
Resent-From: FreeBSD-gnats-submit@FreeBSD.org (GNATS Filer)
Resent-To: freebsd-bugs@FreeBSD.org
Resent-Reply-To: FreeBSD-gnats-submit@FreeBSD.org, Adrian Chadd
Message-Id: <201404132322.s3DNMQmU076513@cgiserv.freebsd.org>
Date: Sun, 13 Apr 2014 23:22:26 GMT
From: Adrian Chadd
To: freebsd-gnats-submit@FreeBSD.org
X-Send-Pr-Version: www-3.1
Subject: kern/188576: [ath] traffic hangs in station mode when downgrading from AMPDU TX or reassociating
X-BeenThere: freebsd-bugs@freebsd.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: Bug reports
X-List-Received-Date: Sun, 13 Apr 2014 23:30:00 -0000

>Number:         188576
>Category:       kern
>Synopsis:       [ath] traffic hangs in station mode when downgrading from AMPDU TX or reassociating
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    freebsd-bugs
>State:          open
>Quarter:
>Keywords:
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sun Apr 13 23:30:00 UTC 2014
>Closed-Date:
>Last-Modified:
>Originator:     Adrian Chadd
>Release:        HEAD
>Organization:
>Environment:
FreeBSD lucy-11i386 11.0-CURRENT FreeBSD 11.0-CURRENT #1 r263418M: Tue Apr 1 11:33:21 PDT 2014 adrian@lucy-11i386:/usr/home/adrian/work/freebsd/head/obj/usr/home/adrian/work/freebsd/head/src/sys/LUCY_11_i386 i386
>Description:
Whenever an ath(4) 11n station reassociates or downgrades from aggregation to no aggregation, there's a chance that it'll hang and refuse to queue more frames. The session needs to be fully torn down (e.g. ifconfig wlanX down) for things to go back to normal.
>How-To-Repeat:
>Fix:
I actually have debugged this a little already.

So the problem seems to be that there's more than one entry point into ath_tx_tid_cleanup(). It's likely a couple of calls into the reassociation path or one into reassociate and one into aggregation teardown.
I'll go figure that bit out soon. But what it leads to is this:

* the caller calls ath_tx_tid_pause();
* ath_tx_tid_cleanup() is called;
* the first time this happens it sees there are one or more frames to clean up, so it sets tid->cleanup_inprogress;
* the caller then checks if that's set to 1 - if so, it assumes that it should wait until the cleanup is finished;
* otherwise it calls ath_tx_tid_resume().

If tid->cleanup_inprogress is set to 1 then the normal TX completion path will eventually call ath_tx_comp_cleanup_unaggr() or ath_tx_comp_cleanup_aggr(), which will clear the flag and resume the TID.

If a second path through ath_tx_tid_cleanup() occurs, then:

* the caller pauses;
* ath_tx_tid_cleanup() is called;
* tid->cleanup_inprogress is set to 1 (it may already have been set by the first call), but there's no code to check whether this call actually set it or not - so the caller doesn't call ath_tx_tid_resume().

So the TID has been paused twice, but once the frames complete ath_tx_tid_resume() is called only once. There's still a pending paused reference and thus traffic never continues flowing.

>Release-Note:
>Audit-Trail:
>Unformatted: