From: Adrian Chadd
Date: Sun, 14 Oct 2012 20:31:38 +0000 (UTC)
To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject: svn commit: r241558 - head/sys/dev/ath
Message-Id: <201210142031.q9EKVcpD037119@svn.freebsd.org>
X-SVN-Group: head

Author: adrian
Date: Sun Oct 14 20:31:38 2012
New Revision: 241558
URL: http://svn.freebsd.org/changeset/base/241558

Log:
  Break the RX processing up into smaller chunks of 128 frames each.

  Right now processing a full 512 frame queue takes quite a while
  (measured on the order of milliseconds).
  Because of this, the TX processing sometimes ends up preempting the
  taskqueue:

  * userland sends a frame;
  * it goes in through net80211 and out to ath_start();
  * ath_start() will end up either direct dispatching or software
    queuing a frame.

  If TX had to wait for RX to finish, it would add quite a few
  milliseconds of additional latency to the packet transmission.
  In the past this has caused issues with TCP throughput.

  Now, as part of my attempt to bring sanity to the TX/RX paths, the
  first step is to make the RX processing happen in smaller 'parts'.
  That way, when TX is pushed into the ath taskqueue, there won't be
  so much latency in the way of things.

  The bigger scale change (which will come much later) is to actually
  process the frames in the ath_intr taskqueue but process _frames_ in
  the ath driver taskqueue.  That would reduce the latency between
  processing and requeuing new descriptors.  But that'll come later.

  The actual work:

  * Add ATH_RX_MAX at 128 (static for now);
  * break out of the processing loop if npkts reaches ATH_RX_MAX;
  * if we processed ATH_RX_MAX or more frames during the processing
    loop, immediately reschedule another RX taskqueue run.  This will
    handle the further frames in the taskqueue.

  This should have very minimal impact on the general throughput case,
  unless the scheduler is being very very strange or the ath taskqueue
  ends up spending a lot of time on non-RX operations (such as TX
  completion).

Modified:
  head/sys/dev/ath/if_ath_rx.c

Modified: head/sys/dev/ath/if_ath_rx.c
==============================================================================
--- head/sys/dev/ath/if_ath_rx.c	Sun Oct 14 20:00:00 2012	(r241557)
+++ head/sys/dev/ath/if_ath_rx.c	Sun Oct 14 20:31:38 2012	(r241558)
@@ -797,6 +797,8 @@ rx_next:
 	return (is_good);
 }
 
+#define ATH_RX_MAX 128
+
 static void
 ath_rx_proc(struct ath_softc *sc, int resched)
 {
@@ -832,6 +834,15 @@ ath_rx_proc(struct ath_softc *sc, int re
 	sc->sc_stats.ast_rx_noise = nf;
 	tsf = ath_hal_gettsf64(ah);
 	do {
+		/*
+		 * Don't process too many packets at a time; give the
+		 * TX thread time to also run - otherwise the TX
+		 * latency can jump by quite a bit, causing throughput
+		 * degredation.
+		 */
+		if (npkts >= ATH_RX_MAX)
+			break;
+
 		bf = TAILQ_FIRST(&sc->sc_rxbuf);
 		if (sc->sc_rxslink && bf == NULL) {	/* NB: shouldn't happen */
 			if_printf(ifp, "%s: no buffer!\n", __func__);
@@ -942,11 +953,22 @@ rx_proc_next:
 	}
 #undef PA2DESC
 
+	/*
+	 * If we hit the maximum number of frames in this round,
+	 * reschedule for another immediate pass.  This gives
+	 * the TX and TX completion routines time to run, which
+	 * will reduce latency.
+	 */
+	if (npkts >= ATH_RX_MAX)
+		taskqueue_enqueue(sc->sc_tq, &sc->sc_rxtask);
+
 	ATH_PCU_LOCK(sc);
 	sc->sc_rxproc_cnt--;
 	ATH_PCU_UNLOCK(sc);
 }
 
+#undef ATH_RX_MAX
+
 /*
  * Only run the RX proc if it's not already running.
  * Since this may get run as part of the reset/flush path,
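
The cap-and-reschedule pattern described in the log can be sketched in user space roughly as follows. This is only an illustration, not driver code: `struct rx_sim`, `rx_task_run()`, `rx_drain_max_batch()` and `RX_BATCH_MAX` are hypothetical names invented here, and a plain loop stands in for the ath taskqueue. The `resched` flag plays the role of the `taskqueue_enqueue()` call in the diff.

```c
#include <assert.h>

/* Assumption: mirrors the value of ATH_RX_MAX in the commit. */
#define RX_BATCH_MAX 128

/* Hypothetical stand-in for the driver's RX state. */
struct rx_sim {
	int pending;	/* frames waiting in the RX queue */
	int rx_runs;	/* number of times the RX task has run */
	int resched;	/* set when the task asks to be run again */
};

/*
 * One pass of the RX task: handle at most RX_BATCH_MAX frames, then
 * request another pass if the cap was reached.  Returns the number of
 * frames handled in this pass.
 */
static int
rx_task_run(struct rx_sim *rx)
{
	int npkts = 0;

	rx->rx_runs++;
	rx->resched = 0;
	while (rx->pending > 0) {
		/* Stop early so other work on the queue gets a turn. */
		if (npkts >= RX_BATCH_MAX)
			break;
		rx->pending--;
		npkts++;
	}
	/* In the driver this is where the task is re-enqueued. */
	if (npkts >= RX_BATCH_MAX)
		rx->resched = 1;
	return (npkts);
}

/*
 * Rerun the task for as long as it asks to be rescheduled.  Returns
 * the largest batch handled in a single pass, i.e. the longest stretch
 * of RX work that another task on the same queue would wait behind.
 */
static int
rx_drain_max_batch(struct rx_sim *rx)
{
	int max_batch = 0;

	do {
		int n = rx_task_run(rx);

		if (n > max_batch)
			max_batch = n;
		/* Other queued tasks (e.g. TX) would run between passes. */
	} while (rx->resched);
	assert(rx->pending == 0);
	return (max_batch);
}
```

Starting from `pending = 512`, the drain takes five passes: four full batches of 128 and one empty pass, because the fourth batch also reaches the cap and so requests a reschedule. That matches the ">= ATH_RX_MAX" test in the diff, which reschedules even when the queue happens to be empty afterwards.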