From: Adrian Chadd
To: Vincent Hoffman
Cc: freebsd-wireless@freebsd.org
Date: Fri, 16 Mar 2012 16:23:59 -0700
Subject: Re: ath0 timeout was "Re: (more) bugs fixed in -HEAD, AP mode is now mostly (again) stable!"

Hi,

Yup, this seems like a concurrency issue with the driver. I'm trying to debug exactly what's going on, but it seems that multiple concurrent threads are doing TX and they're overlapping.

I was under the impression that all TX'ing via ath_start() would be serialised via ifnet but apparently not. It's also possible some frames that are legitimately going into the aggregation session (ie, they get a sequence number and want to be ACKed) are being sent via the ath_tx_raw() method, which bypasses the ifnet serialisation entirely.

I was reproducing it at home and in the office within 30 seconds:

* have chrome running with say, 15 tabs;
* kill -9 it so it dies very quickly;
* fire up the interface;
* fire up debug logging, which is REALLY VERBOSE btw;
* fire up chrome again;
* reload all the tabs at once;
* watch LOTS of concurrent IO go on from lots of multiple sending threads/processes;
* then, see some buffers get "stuck" in the software queue, just like you found.

Now, why the heck is that happening.. :)

Adrian
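
To illustrate the kind of overlap described above, here is a minimal userland sketch in C with pthreads. It is not the ath(4) code: struct swq, swq_tx(), sender() and run_senders() are hypothetical names made up for this example, and the single mutex merely stands in for whatever serialisation the TX path is supposed to provide. The point it shows is that "take a sequence number" and "append to the software queue" have to happen as one step; if any sender reached the queue without taking the lock, the two steps could interleave and the queue order would no longer match the sequence numbers.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

#define SEQ_MOD 4096    /* 802.11 sequence numbers are 12 bits wide */

/* Hypothetical software TX queue; NOT the real ath(4) structures. */
struct swq {
    pthread_mutex_t lock;
    uint16_t next_seq;  /* next sequence number to hand out */
    uint16_t *frames;   /* sequence numbers, in queue order */
    size_t count, cap;
};

struct sender_arg {
    struct swq *q;
    size_t nframes;
};

/* Assign a sequence number and enqueue the frame as one locked step. */
static void swq_tx(struct swq *q)
{
    pthread_mutex_lock(&q->lock);
    assert(q->count < q->cap);
    q->frames[q->count++] = q->next_seq;
    q->next_seq = (uint16_t)((q->next_seq + 1) % SEQ_MOD);
    pthread_mutex_unlock(&q->lock);
}

/* One sending thread: pushes nframes frames through swq_tx(). */
static void *sender(void *p)
{
    struct sender_arg *a = p;

    for (size_t i = 0; i < a->nframes; i++)
        swq_tx(a->q);
    return NULL;
}

/*
 * Run nthreads concurrent senders (nthreads and nframes must be > 0).
 * Returns 1 if every frame was queued and the queue order matches the
 * sequence numbers (mod 4096), 0 otherwise.
 */
int run_senders(size_t nthreads, size_t nframes)
{
    struct swq q = { .next_seq = 0, .count = 0, .cap = nthreads * nframes };
    struct sender_arg arg = { &q, nframes };
    pthread_t *tids = malloc(nthreads * sizeof(*tids));
    int ok;

    q.frames = malloc(q.cap * sizeof(*q.frames));
    assert(tids != NULL && q.frames != NULL);
    pthread_mutex_init(&q.lock, NULL);

    for (size_t i = 0; i < nthreads; i++) {
        int rc = pthread_create(&tids[i], NULL, sender, &arg);
        assert(rc == 0);
        (void)rc;
    }
    for (size_t i = 0; i < nthreads; i++)
        pthread_join(tids[i], NULL);

    ok = (q.count == q.cap);
    for (size_t i = 0; ok && i < q.count; i++)
        ok = (q.frames[i] == i % SEQ_MOD);

    pthread_mutex_destroy(&q.lock);
    free(q.frames);
    free(tids);
    return ok;
}
```

Calling run_senders(8, 1000) from a small main() and checking that it returns 1 exercises eight concurrent senders. Removing the lock/unlock pair from swq_tx() turns this into a data race, which is roughly the situation a second, unserialised entry point into the TX path would create.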