Skip site navigation (1)Skip section navigation (2)
Date:      Mon,  6 Feb 2006 08:30:16 +0000 (GMT)
From:      Nate Nielsen <nielsen-list@memberwebs.com>
To:        Sam Leffler <sam@errno.com>
Cc:        "freebsd-net@FreeBSD.org" <freebsd-net@freebsd.org>
Subject:   Re: Polling for ath driver
Message-ID:  <20060206083013.99D32DCAA14@mail.npubs.com>
References:  <20060205011931.E5F08DCA99E@mail.npubs.com> <43E63FC3.2030409@errno.com> <20060206050301.AC895DCAA3F@mail.npubs.com> <43E6E4B6.50100@errno.com>

next in thread | previous in thread | raw e-mail | index | archive | help
This is a multi-part message in MIME format.
--------------010604040400040500050206
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit

Sam Leffler wrote:
> Nate Nielsen wrote:
>> Adding polling to this driver does increase performance on embedded
>> systems. With my current patch (on a 233Mhz system), the throughput (in
>> this case a simple TCP stream) goes up by ~6Mbits, from 18Mbits to
>> 24Mbits.
> 
> I routinely get >20 Mb/s for a single client running upstream TCP
> netperf through a soekris 4511.  If you are seeing 6Mb/s you have
> something else wrong.

Note I was talking about an *increase* of 6Mb/s.

In addition my TCP stream ends on the box in question, which obviously
increases the load on the system, which brings the numbers we are both
seeing roughly in the same ballpark.

> I've not seen livelock in any situations though there are some issues
> with the priority of the taskqueue thread.

With a small number of simple self regulating packet streams (such as
TCP) livelock is not really an issue, as the the streams will slow their
transmit rate when the box gets near livelock and packets start dropping.

However on more complex links where traffic is not (or not completely)
self regulating (ie: VOIP, other datagram streams, a high number of TCP
streams, asymetric routing) livelock because of interrupt performance is
a common occurance.

> Polling is not a panacea;
> you are potentially increasing latency which has ramifications.

Correct, whenever polling is in use (on ethernet as well) we see the
latency go up. In addition the system is under higher load when there is
no traffic. These are tradeoffs that one accepts when using polling.
Obviously DEVICE_POLLING is not configured by default.

>> However it should be noted, that the default behaviour (in 6.0 release)
>> seems to be that the hardware generates about around 2000 interrupts per
>> second at around 15 - 18 Mbits throughput.
> 
> You need to identify what kind of interrupts there are and what type of
> ath hardware you are using.  

The interrupts are the RX and TX interrupts. My polling additions don't
mask out any other interrupts.

> You can trivially reduce the tx interrupt
> load by turning off interrupts on EOL and just using the periodic
> interrupts generated every N tx descriptors.  

Thanks for the tip. I'm sure that would help. I'll look into that.

> But if you profile I
> suspect you will find the interrupt overhead is not significant relative
> to other costs.

The very act of the CPU servicing the interrupt (ie: saving registers,
switching stacks, etc...) causes overhead. In addition the interrupt
handling does not fall under the domain of the scheduler.

This isn't just theory it has a tested real life impact. As I noted
earlier for a simple TCP stream over a wireless link throughput went up
by 6Mb/s.

In my real world case: On a 233Mhz net4826 running GIF encapsulation,
IPSEC encryption, and wireless backhaul, throughput went from being
livelocked at 3.5Mb/s (and userland barely functioning) to over 10Mb/s
(with userland scheduled properly). This is with polling used in the sis
ethernet driver, the hifn crypto card driver, and the ath driver.

Instead of each of these devices generating interrupts, polling (in my
case at 256 Hz) allows the system to function smoothly. Yes, there is
latency of up to 20ms, but that's a small tradeoff for the throughput
and stability increases.

> I'm not convinced polling is worthwhile w/o a major restructuring of the
> driver.  OTOH this shouldn't stop you from pushing forward...

As you noted there are other performance enhancements that could be
made, and while I'd love to see them implemented, I fear they may be out
of my scope and available time. This polling addition is a simple
performance enhancement that helps my clients, and perhaps others would
also be interested.

I'll attach the rough patch, to give an idea of the direction I'm
working in. Note that this patch is incomplete. It locks after a while,
which is probably due to the way I call the taskqueue callbacks. I'll
continue to work on this.

Please understand that by posting this patch I'm not pressuring you to
help. Thanks again for your advice so far.

Cheers,
Nate

--------------010604040400040500050206
Content-Type: text/x-patch;
 name="ath-polling-will-lock-your-system.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline; filename="ath-polling-will-lock-your-system.patch"

Index: if_ath.c
===================================================================
RCS file: /home/ncvs/src/sys/dev/ath/if_ath.c,v
retrieving revision 1.94.2.8
diff -u -3 -p -r1.94.2.8 if_ath.c
--- if_ath.c	29 Jan 2006 07:33:27 -0000	1.94.2.8
+++ if_ath.c	4 Feb 2006 22:42:08 -0000
@@ -44,6 +44,7 @@ __FBSDID("$FreeBSD: src/sys/dev/ath/if_a
  * is greatly appreciated.
  */
 
+#include "opt_device_polling.h"
 #include "opt_inet.h"
 
 #include <sys/param.h>
@@ -174,6 +175,12 @@ static void	ath_setcurmode(struct ath_so
 static void	ath_sysctlattach(struct ath_softc *);
 static void	ath_bpfattach(struct ath_softc *);
 static void	ath_announce(struct ath_softc *);
+static void	ath_update_intr(struct ath_softc *);
+static void	ath_mask_intr(struct ath_softc *, HAL_INT mask);
+
+#ifdef DEVICE_POLLING
+static poll_handler_t ath_poll;
+#endif
 
 SYSCTL_DECL(_hw_ath);
 
@@ -476,6 +483,10 @@ ath_attach(u_int16_t devid, struct ath_s
 	ifp->if_snd.ifq_drv_maxlen = IFQ_MAXLEN;
 	IFQ_SET_READY(&ifp->if_snd);
 
+#ifdef DEVICE_POLLING
+	ifp->if_capabilities |= IFCAP_POLLING;
+#endif
+
 	ic->ic_ifp = ifp;
 	ic->ic_reset = ath_reset;
 	ic->ic_newassoc = ath_newassoc;
@@ -609,6 +620,11 @@ ath_detach(struct ath_softc *sc)
 	DPRINTF(sc, ATH_DEBUG_ANY, "%s: if_flags %x\n",
 		__func__, ifp->if_flags);
 
+#ifdef DEVICE_POLLING
+	if (ifp->if_capenable & IFCAP_POLLING)
+		ether_poll_deregister(ifp);
+#endif
+
 	ath_stop(ifp);
 	bpfdetach(ifp);
 	/* 
@@ -674,6 +690,22 @@ ath_shutdown(struct ath_softc *sc)
 	ath_stop(ifp);
 }
 
+static void	
+ath_update_intr(struct ath_softc *sc)
+{
+	/* When polling suppress certain interrupts */
+	ath_hal_intrset(sc->sc_ah, 
+		sc->sc_imask & ~sc->sc_ipolling);
+}
+
+static void	
+ath_mask_intr(struct ath_softc *sc, HAL_INT mask)
+{
+	/* When polling suppress certain interrupts */
+	ath_hal_intrset(sc->sc_ah, 
+		(sc->sc_imask & ~sc->sc_ipolling) & mask);
+}
+
 /*
  * Interrupt handler.  Most of the actual processing is deferred.
  */
@@ -700,7 +732,7 @@ ath_intr(void *arg)
 		DPRINTF(sc, ATH_DEBUG_ANY, "%s: if_flags 0x%x\n",
 			__func__, ifp->if_flags);
 		ath_hal_getisr(ah, &status);	/* clear ISR */
-		ath_hal_intrset(ah, 0);		/* disable further intr's */
+		ath_mask_intr(sc, 0);		/* disable further intr's */
 		return;
 	}
 	/*
@@ -720,11 +752,11 @@ ath_intr(void *arg)
 		 * by the hal.
 		 */
 		sc->sc_stats.ast_hardware++;
-		ath_hal_intrset(ah, 0);		/* disable intr's until reset */
+		ath_mask_intr(sc, 0);		/* disable intr's until reset */
 		taskqueue_enqueue(taskqueue_swi, &sc->sc_fataltask);
 	} else if (status & HAL_INT_RXORN) {
 		sc->sc_stats.ast_rxorn++;
-		ath_hal_intrset(ah, 0);		/* disable intr's until reset */
+		ath_mask_intr(sc, 0);		/* disable intr's until reset */
 		taskqueue_enqueue(taskqueue_swi, &sc->sc_rxorntask);
 	} else {
 		if (status & HAL_INT_SWBA) {
@@ -764,18 +796,39 @@ ath_intr(void *arg)
 			 * Disable interrupts until we service the MIB
 			 * interrupt; otherwise it will continue to fire.
 			 */
-			ath_hal_intrset(ah, 0);
+			ath_mask_intr(sc, 0);
 			/*
 			 * Let the hal handle the event.  We assume it will
 			 * clear whatever condition caused the interrupt.
 			 */
 			ath_hal_mibevent(ah,
 				&ATH_NODE(sc->sc_ic.ic_bss)->an_halstats);
-			ath_hal_intrset(ah, sc->sc_imask);
+			ath_update_intr(sc);
 		}
 	}
 }
 
+#ifdef DEVICE_POLLING
+
+static void 
+ath_poll(struct ifnet *ifp, enum poll_cmd cmd, int count)
+{
+	struct ath_softc *sc = ifp->if_softc;
+
+	if (sc->sc_invalid)
+		return;
+
+	if (!((ifp->if_flags & IFF_UP) && (ifp->if_drv_flags &
+	    IFF_DRV_RUNNING)))
+		return;		
+
+	/* XXX Locks. Call the transmit and receive functions */
+	(sc->sc_rxtask.ta_func) (sc, -1); /* taskqueue_enqueue(taskqueue_swi, &sc->sc_rxtask); */
+	(sc->sc_txtask.ta_func) (sc, -1); /* taskqueue_enqueue(taskqueue_swi, &sc->sc_txtask); */
+}
+
+#endif /* DEVICE_POLLING */
+
 static void
 ath_fatal_proc(void *arg, int pending)
 {
@@ -898,6 +951,10 @@ ath_init(void *arg)
 
 	/*
 	 * Enable interrupts.
+	 * 
+	 * sc_imask holds the current interrupts we're interested in.
+	 * This can be masked by sc_ipolling which suppresses certain 
+         * interrupts when in polling mode. 
 	 */
 	sc->sc_imask = HAL_INT_RX | HAL_INT_TX
 		  | HAL_INT_RXEOL | HAL_INT_RXORN
@@ -908,7 +965,7 @@ ath_init(void *arg)
 	 */
 	if (sc->sc_needmib && ic->ic_opmode == IEEE80211_M_STA)
 		sc->sc_imask |= HAL_INT_MIB;
-	ath_hal_intrset(ah, sc->sc_imask);
+	ath_update_intr(sc);
 
 	ifp->if_drv_flags |= IFF_DRV_RUNNING;
 	ic->ic_state = IEEE80211_S_INIT;
@@ -965,7 +1022,7 @@ ath_stop_locked(struct ifnet *ifp)
 					!sc->sc_ledon);
 				sc->sc_blinking = 0;
 			}
-			ath_hal_intrset(ah, 0);
+			ath_mask_intr(sc, 0);
 		}
 		ath_draintxq(sc);
 		if (!sc->sc_invalid) {
@@ -1024,7 +1081,7 @@ ath_reset(struct ifnet *ifp)
 	sc->sc_curchan.channel = c->ic_freq;
 	sc->sc_curchan.channelFlags = ath_chan2flags(ic, c);
 
-	ath_hal_intrset(ah, 0);		/* disable interrupts */
+	ath_mask_intr(sc, 0);		/* disable interrupts */
 	ath_draintxq(sc);		/* stop xmit side */
 	ath_stoprecv(sc);		/* stop recv side */
 	/* NB: indicate channel change so we do a full reset */
@@ -1043,7 +1100,7 @@ ath_reset(struct ifnet *ifp)
 	ath_chan_change(sc, c);
 	if (ic->ic_state == IEEE80211_S_RUN)
 		ath_beacon_config(sc);	/* restart beacons */
-	ath_hal_intrset(ah, sc->sc_imask);
+	ath_update_intr(sc);
 
 	ath_start(ifp);			/* restart xmit */
 	return 0;
@@ -2178,12 +2235,12 @@ ath_beacon_config(struct ath_softc *sc)
 			, bs.bs_cfpnext
 			, bs.bs_timoffset
 		);
-		ath_hal_intrset(ah, 0);
+		ath_mask_intr(sc, 0);
 		ath_hal_beacontimers(ah, &bs);
 		sc->sc_imask |= HAL_INT_BMISS;
-		ath_hal_intrset(ah, sc->sc_imask);
+		ath_update_intr(sc);
 	} else {
-		ath_hal_intrset(ah, 0);
+		ath_mask_intr(sc, 0);
 		if (nexttbtt == intval)
 			intval |= HAL_BEACON_RESET_TSF;
 		if (ic->ic_opmode == IEEE80211_M_IBSS) {
@@ -2209,7 +2266,7 @@ ath_beacon_config(struct ath_softc *sc)
 		}
 		ath_hal_beaconinit(ah, nexttbtt, intval);
 		sc->sc_bmisscount = 0;
-		ath_hal_intrset(ah, sc->sc_imask);
+		ath_update_intr(sc);
 		/*
 		 * When using a self-linked beacon descriptor in
 		 * ibss mode load it once here.
@@ -3975,7 +4032,7 @@ ath_chan_set(struct ath_softc *sc, struc
 		 * hardware at the new frequency, and then re-enable
 		 * the relevant bits of the h/w.
 		 */
-		ath_hal_intrset(ah, 0);		/* disable interrupts */
+		ath_mask_intr(sc, 0);		/* disable interrupts */
 		ath_draintxq(sc);		/* clear pending tx frames */
 		ath_stoprecv(sc);		/* turn off frame recv */
 		if (!ath_hal_reset(ah, ic->ic_opmode, &hchan, AH_TRUE, &status)) {
@@ -4007,7 +4064,7 @@ ath_chan_set(struct ath_softc *sc, struc
 		/*
 		 * Re-enable interrupts.
 		 */
-		ath_hal_intrset(ah, sc->sc_imask);
+		ath_update_intr(sc);
 	}
 	return 0;
 }
@@ -4089,7 +4146,7 @@ ath_newstate(struct ieee80211com *ic, en
 		/*
 		 * NB: disable interrupts so we don't rx frames.
 		 */
-		ath_hal_intrset(ah, sc->sc_imask &~ HAL_INT_GLOBAL);
+		ath_mask_intr(sc, ~HAL_INT_GLOBAL);
 		/*
 		 * Notify the rate control algorithm.
 		 */
@@ -4179,9 +4236,8 @@ ath_newstate(struct ieee80211com *ic, en
 		 */
 		ath_beacon_config(sc);
 	} else {
-		ath_hal_intrset(ah,
-			sc->sc_imask &~ (HAL_INT_SWBA | HAL_INT_BMISS));
 		sc->sc_imask &= ~(HAL_INT_SWBA | HAL_INT_BMISS);
+		ath_update_intr(sc);
 	}
 done:
 	/*
@@ -4627,7 +4683,7 @@ ath_ioctl(struct ifnet *ifp, u_long cmd,
 	struct ath_softc *sc = ifp->if_softc;
 	struct ieee80211com *ic = &sc->sc_ic;
 	struct ifreq *ifr = (struct ifreq *)data;
-	int error = 0;
+	int mask, error = 0;
 
 	ATH_LOCK(sc);
 	switch (cmd) {
@@ -4664,6 +4720,33 @@ ath_ioctl(struct ifnet *ifp, u_long cmd,
 		if (ifp->if_drv_flags & IFF_DRV_RUNNING)
 			ath_mode_init(sc);
 		break;
+	case SIOCSIFCAP:
+		mask = ifr->ifr_reqcap ^ ifp->if_capenable;
+#ifdef DEVICE_POLLING
+		if (mask & IFCAP_POLLING) {
+			if (ifr->ifr_reqcap & IFCAP_POLLING) {
+				error = ether_poll_register(ath_poll, ifp);
+				if (error)
+					return(error);
+				ATH_LOCK(sc);
+				/* Interrupts to suppress while polling */
+				sc->sc_ipolling = HAL_INT_RX | HAL_INT_TX;
+				ath_update_intr(sc);
+				ifp->if_capenable |= IFCAP_POLLING;
+				ATH_UNLOCK(sc);
+			} else {
+				error = ether_poll_deregister(ifp);
+				/* Enable interrupt even in error case */
+				ATH_LOCK(sc);
+				/* Don't suppress any interrupts */
+				sc->sc_ipolling = 0;
+				ath_update_intr(sc);
+				ifp->if_capenable &= ~IFCAP_POLLING;
+				ATH_UNLOCK(sc);
+			}
+		}
+#endif
+		break;
 	case SIOCGATHSTATS:
 		/* NB: embed these numbers to get a consistent view */
 		sc->sc_stats.ast_tx_packets = ifp->if_opackets;
Index: if_athvar.h
===================================================================
RCS file: /home/ncvs/src/sys/dev/ath/if_athvar.h,v
retrieving revision 1.27.2.4
diff -u -3 -p -r1.27.2.4 if_athvar.h
--- if_athvar.h	29 Jan 2006 07:17:02 -0000	1.27.2.4
+++ if_athvar.h	4 Feb 2006 22:42:08 -0000
@@ -209,6 +209,7 @@ struct ath_softc {
 	u_int8_t		sc_protrix;	/* protection rate index */
 	u_int			sc_txantenna;	/* tx antenna (fixed or auto) */
 	HAL_INT			sc_imask;	/* interrupt mask copy */
+	HAL_INT			sc_ipolling;	/* interrupts to mask out when polling */
 	u_int			sc_keymax;	/* size of key cache */
 	u_int8_t		sc_keymap[ATH_KEYBYTES];/* key use bit map */
 

--------------010604040400040500050206--




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20060206083013.99D32DCAA14>