From: Adrian Chadd
To: freebsd-wireless@freebsd.org
Date: Sat, 5 Jan 2013 20:53:16 -0800
Subject: [CFT] ath(4) migration to if_transmit() and a transmit tasklet, rather than direct dispatch

Hi,

I've written up a replacement ath(4) TX path that does a bunch of things.

The patch I'd like everyone to try:

http://people.freebsd.org/~adrian/ath/20130105-if_transmit_txfrag_2.diff

What it does:

* It implements a driver staging queue for frames from if_start() and
  if_transmit();
* It populates that staging queue with ath_bufs, each of which holds the
  mbuf and the node ref;
* The actual TX occurs in a taskqueue (the default ath taskqueue for
  now, but I'll move it to another one soon) in order to serialise
  things;
* The TX fragment list is populated correctly (but doesn't quite work,
  see below);
* The reliance on "peeking" at the next mbuf in a fragment list is gone -
  instead, I now store the data length of the next fragment in the
  current buffer and fire that across.

Now, in station mode I get exactly the same throughput as before -
150-180mbit TCP iperf tests. It's great. I need to do some more (single
core) MIPS testing - if it drops in throughput there, it'll likely be
the ridiculously large number of calls to taskqueue_enqueue() and/or
some lock contention.

I'd like to commit this to -HEAD and then begin the next phase, which
is:

* Figure out why TX fragments are transmitted but dropped by (some)
  receivers. FreeBSD -> FreeBSD works fine, but FreeBSD -> (some random
  11g cable modem router) just plain drops the fragments in question.
  Sigh;
* Tidy up some more of the locking, which likely involves separating out
  the TX queue lock from the TX taskqueue lock;
* Push raw xmit frames into the same queue mechanism, so they are queued
  in the same fashion and obey the same sequence number / CCMP IV
  allocation ordering that data frames do (which is important as things
  like EAPOL frames are encrypted and have sequence numbers, but come in
  via the raw xmit path. Grr.)
* Finish tidying up when things are called - specifically, I'd like to
  make sure all the buffer completions occur outside of the locks being
  held, so I can finally avoid a bunch of potential LORs when doing
  things like transmitting BAR frames from the TX completion path.

I'd really appreciate any testing that can be done for this. It doesn't
matter which mode you're in - adhoc, hostap, mesh, sta, 11n or non-11n -
all ath(4) chips share the same TX path code and this all happens before
the software TX queue and aggregation handling.

Once this is all verified and working, I'll work on migrating the
net80211 TX path to actually use if_transmit() itself and use a TX
taskqueue to serialise all TX. That should fix a whole bunch of subtle,
niggling little TX side bugs that have been the bane of my existence
since I took this code on a couple of years ago.

Finally - although I'd like to _fix_ TX fragment handling, it still has
its .. quirks. I'm still worried that a very active STA or AP with
multiple traffic sessions will end up with TX fragments in the software
queue that aren't kept 100% in order (ie, frames from other sessions get
interspersed between the fragments of one session). For now it won't
happen - the TX lock is held for the duration of running the TX queuing
and so (in theory!) nothing should appear in the TX queue in between TX
fragments. But I don't trust it. Chances are the correct fix is a lot
more nasty than the current net80211 way of "just do fragmentation in
the net80211 layer and the driver will figure it out" lets me do
cleanly. (Ie, I think the clean way is to do fragmentation at the point
where you're about to queue it to the actual hardware and not in
net80211, but I digress.)

Adrian
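
For anyone who wants the staging-queue idea in concrete terms, here is a
rough userland sketch. It is not taken from the patch: the names
(stage_buf, stage_queue, stage_enqueue_frags, stage_drain) are made up
for illustration, node_id stands in for the node ref, and there is no
locking because the sketch is single-threaded. It only shows the two
ideas described above: frames are staged in FIFO order and drained by
one consumer, and each buffer records the length of the fragment that
follows it at enqueue time, so the drain side never has to peek at the
next buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an ath_buf; field names are illustrative. */
struct stage_buf {
	struct stage_buf *next;
	int node_id;		/* stands in for the node reference */
	size_t frame_len;	/* length of this frame or fragment */
	size_t next_frag_len;	/* length of the following fragment, 0 if none */
};

/* Hypothetical staging queue: a plain FIFO with a depth counter. */
struct stage_queue {
	struct stage_buf *head;
	struct stage_buf *tail;
	int depth;
};

static void
stage_init(struct stage_queue *q)
{
	q->head = q->tail = NULL;
	q->depth = 0;
}

/* Append one buffer to the tail of the staging queue. */
static void
stage_enqueue(struct stage_queue *q, struct stage_buf *b)
{
	assert(b != NULL);
	b->next = NULL;
	if (q->tail != NULL)
		q->tail->next = b;
	else
		q->head = b;
	q->tail = b;
	q->depth++;
}

/*
 * Stage a chain of fragments back to back. Each buffer stores the length
 * of its successor now, so nothing downstream needs to look ahead.
 * Returns 0 on success, -1 if an allocation fails (fragments already
 * queued are left in place in this sketch).
 */
static int
stage_enqueue_frags(struct stage_queue *q, int node_id, const size_t *lens,
    int nfrags)
{
	for (int i = 0; i < nfrags; i++) {
		struct stage_buf *b = calloc(1, sizeof(*b));

		if (b == NULL)
			return (-1);
		b->node_id = node_id;
		b->frame_len = lens[i];
		b->next_frag_len = (i + 1 < nfrags) ? lens[i + 1] : 0;
		stage_enqueue(q, b);
	}
	return (0);
}

/*
 * Drain the queue in FIFO order, as a single TX task would. The optional
 * tx callback stands in for handing the buffer to the hardware path.
 * Returns the number of buffers drained.
 */
static int
stage_drain(struct stage_queue *q, void (*tx)(const struct stage_buf *))
{
	struct stage_buf *b;
	int sent = 0;

	while ((b = q->head) != NULL) {
		q->head = b->next;
		if (q->head == NULL)
			q->tail = NULL;
		q->depth--;
		if (tx != NULL)
			tx(b);
		free(b);
		sent++;
	}
	return (sent);
}
```

Staging three fragments of 1500, 1500 and 400 bytes with
stage_enqueue_frags() leaves next_frag_len set to 1500, 400 and 0 on the
three buffers, and stage_drain() then hands them over in that order. In
a real driver the enqueue and drain sides would need a lock or a single
serialising task around them, which is the part this sketch leaves out.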