From: Adrian Chadd
To: freebsd-wireless@freebsd.org
Date: Wed, 13 Feb 2013 21:14:53 -0800
Subject: [RFC] serialising net80211 TX

Hi,

I'd like to work on the net80211 TX serialisation now. I'll worry about driver serialisation and if_transmit methods later.

The 30 second version - net80211 TX currently happens in parallel, which means preemption and multi-core devices can and will hit lots of subtle and hard-to-debug races in the TX path.

We actually need an end-to-end serialisation method - not only for the 802.11 state (sequence number, correct aggregation handling, etc) but to line up 802.11 sequence number allocation with the encryption IV/PN values. Otherwise you end up with lots of crazy subtle out of order packets occurring.

The other issue is the seqno/CCMP IV race between the raw transmit path and the normal transmit path.

There are other nagging issues that I'm trying to resolve - but, one thing at a time.

So there are three current contenders:

* wrap everything in net80211 TX in a per-vap TX lock; grab it at the beginning of ieee80211_output() and ieee80211_start(), and don't release it until the frame is queued to something (a power save queue, an age queue, the driver.) That guarantees that the driver is called in lock-step with each frame being processed.

* do deferred transmit - ie, the net80211 entry points simply queue mbufs to a queue, and a taskqueue runs over the ifnet queue and runs those frames in-order. There's no need for a lock here as there's only one sending context (either per-VAP or per-IC).

* A hybrid setup - use a per-vap TX lock; do a try-acquire on it and direct dispatch from the queue head if we have it; otherwise defer frames into a queue and have a taskqueue handle those.

1) is what drivers like iwn(4) do internally.

2) is what I've tinkered with - but we become a slave to the scheduler. Task switching is expensive and unpredictable; doubly so for a non-preemption kernel.
We'd have to run the TX taskqueue at some very high priority to get something resembling direct-dispatch behaviour.

3) is what the gige/10ge drivers do. They hold a big TX lock for each TX (from xxx_transmit() to hardware dispatch) and if they can't acquire the TX lock, they defer the frame to a drbr lockless ring buffer and service that via a taskqueue.

I can implement any of the above. Architecturally I'd prefer 2) - it massively simplifies and streamlines things, but the scheduling latency is just plain stab-worthy. I'm tempted to just do 1) for now and turn it into 3) if we need to.

The main reason against doing 1) (and why 2) is nicer) is recursion - if the TX path wants to call back into the net80211 TX code for some odd reason, we'll hit lock recursion. I'd rather have the system crash at this point (and then fix the misbehaving driver) but that's just me.

So - what do people think?

Once this is done I'd like to make sure that the wifi chipset drivers do the same - ie, ensure that the frame order is preserved both between the normal and the raw xmit paths. That should fix all of the odd CCMP out of order crap that I see under heavy, heavy test conditions.

Thanks,

Adrian
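
For illustration only, here is a minimal userspace sketch of the hybrid scheme in 3). It is not net80211 code: pthread mutexes stand in for kernel mutexes, a small locked ring stands in for the deferred queue, and every name in it (vap_tx, vap_transmit, vap_tx_task, TXQ_LEN, the sent[] log) is hypothetical. The idea it tries to show is that every frame goes through the queue, and sequence numbers are only assigned while holding the TX lock, so numbering and driver handoff happen in the same order.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define TXQ_LEN    64      /* hypothetical deferred-queue depth */
#define SEQNO_MASK 0xfff   /* 802.11 sequence numbers are 12 bits wide */

struct frame {
    int      id;           /* stand-in for an mbuf */
    unsigned seqno;        /* assigned under the TX lock */
};

struct vap_tx {
    pthread_mutex_t tx_lock;       /* held from seqno assignment to driver handoff */
    pthread_mutex_t q_lock;        /* protects the ring below */
    struct frame   *q[TXQ_LEN];
    size_t          q_head, q_count;
    unsigned        next_seqno;
    int             sent[TXQ_LEN]; /* log of frame ids, in handoff order */
    size_t          n_sent;
};

static void vap_tx_init(struct vap_tx *v)
{
    pthread_mutex_init(&v->tx_lock, NULL);
    pthread_mutex_init(&v->q_lock, NULL);
    v->q_head = v->q_count = 0;
    v->next_seqno = 0;
    v->n_sent = 0;
}

static int txq_enqueue(struct vap_tx *v, struct frame *f)
{
    int ok = 0;

    pthread_mutex_lock(&v->q_lock);
    if (v->q_count < TXQ_LEN) {
        v->q[(v->q_head + v->q_count) % TXQ_LEN] = f;
        v->q_count++;
        ok = 1;
    }
    pthread_mutex_unlock(&v->q_lock);
    return ok;
}

static struct frame *txq_dequeue(struct vap_tx *v)
{
    struct frame *f = NULL;

    pthread_mutex_lock(&v->q_lock);
    if (v->q_count > 0) {
        f = v->q[v->q_head];
        v->q_head = (v->q_head + 1) % TXQ_LEN;
        v->q_count--;
    }
    pthread_mutex_unlock(&v->q_lock);
    return f;
}

/* Caller must hold tx_lock. Frames leave strictly in queue order. */
static void vap_tx_drain(struct vap_tx *v)
{
    struct frame *f;

    while ((f = txq_dequeue(v)) != NULL) {
        f->seqno = v->next_seqno;
        v->next_seqno = (v->next_seqno + 1) & SEQNO_MASK;
        /* Stand-in for the driver handoff: just log the frame id. */
        if (v->n_sent < TXQ_LEN)
            v->sent[v->n_sent++] = f->id;
    }
}

/*
 * Returns 1 if this caller held the TX lock and drained the queue,
 * 0 if the frame was left queued for the deferred task,
 * -1 if the queue was full.
 */
static int vap_transmit(struct vap_tx *v, struct frame *f)
{
    assert(f != NULL);
    if (!txq_enqueue(v, f))
        return -1;
    if (pthread_mutex_trylock(&v->tx_lock) != 0)
        return 0;          /* someone else is transmitting; defer */
    vap_tx_drain(v);
    pthread_mutex_unlock(&v->tx_lock);
    return 1;
}

/* What the taskqueue would run for deferred frames. */
static void vap_tx_task(struct vap_tx *v)
{
    pthread_mutex_lock(&v->tx_lock);
    vap_tx_drain(v);
    pthread_mutex_unlock(&v->tx_lock);
}
```

A caller that gets 0 back from vap_transmit() is expected to schedule vap_tx_task(), since the lock holder may already have finished draining before the new frame was queued. Option 1) is roughly the same code with a blocking lock in place of the try-acquire and no deferred task; option 2) is the same with vap_transmit() only ever enqueueing.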